Learner | Stage completed | Course |

Bill | Stage 1 | Public Speaking |

Bill | Stage 2 | Public Speaking |

Bill | Stage 3 | Public Speaking |

Susan | Stage 1 | Effective Communication |

Bob | Stage 1 | Public Speaking |

Bob | Stage 2 | Public Speaking |

Sheila | Stage 1 | Effective Communication |

Sheila | Stage 2 | Effective Communication |

Sheila | Stage 3 | Effective Communication |

Frank | Stage 1 | Effective Communication |

Frank | Stage 2 | Effective Communication |

Henry | Stage 1 | Public Speaking |

Henry | Stage 2 | Public Speaking |

Bill | Stage 1 | Effective Communication |

Bill | Stage 2 | Effective Communication |

From this sample dataset, one may want to know how many participants have completed each stage of each course, with every participant counted only under the highest stage they have completed. The expected result is shown below:

Row Labels | Stage 1 | Stage 2 | Stage 3 |

Effective Communication | 1 | 2 | 1 |

Public Speaking | | 2 | 1 |

Grand Total | 1 | 3 | 2 |

In this workbook, I have shared two solutions: one using formulas and the other using Power Query and PowerPivot.
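For readers who want to check the expected table outside Excel, here is a minimal Python sketch. It is not the formula or Power Query solution from the workbook. It assumes that each learner is counted once, under the highest stage completed (per course for the course rows, and across all courses for the Grand Total row). The names `records` and `highest_stage_counts` are made up for this illustration.

```python
from collections import Counter

# Sample rows copied from the table above: (learner, stage, course).
records = [
    ("Bill", 1, "Public Speaking"), ("Bill", 2, "Public Speaking"),
    ("Bill", 3, "Public Speaking"), ("Susan", 1, "Effective Communication"),
    ("Bob", 1, "Public Speaking"), ("Bob", 2, "Public Speaking"),
    ("Sheila", 1, "Effective Communication"), ("Sheila", 2, "Effective Communication"),
    ("Sheila", 3, "Effective Communication"), ("Frank", 1, "Effective Communication"),
    ("Frank", 2, "Effective Communication"), ("Henry", 1, "Public Speaking"),
    ("Henry", 2, "Public Speaking"), ("Bill", 1, "Effective Communication"),
    ("Bill", 2, "Effective Communication"),
]


def highest_stage_counts(rows, by_course=True):
    """Count learners under the highest stage each of them completed.

    Assumption: with by_course=False a learner is counted once overall
    (this is how the Grand Total row is interpreted here).
    """
    best = {}
    for learner, stage, course in rows:
        key = (learner, course if by_course else "Grand Total")
        best[key] = max(stage, best.get(key, 0))
    # Tally (row label, stage) pairs, one per learner key.
    return Counter((label, stage) for (_, label), stage in best.items())


per_course = highest_stage_counts(records)
grand_total = highest_stage_counts(records, by_course=False)

for label, counts in (("Effective Communication", per_course),
                      ("Public Speaking", per_course),
                      ("Grand Total", grand_total)):
    print(label, [counts[(label, stage)] for stage in (1, 2, 3)])
```

A stage with no learners shows up as 0 here, where the Pivot Table shows a blank cell.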

Model | Length (MM) | Wide (MM) | Thk (MM) | CAT |

HX9-G-ARD | 1071 | 273 | 3.5 | A |

MYP-G-3RD | 580 | 535 | 3.2 | B |

EPO-G-3RD | 580 | 535 | 3.2 | A |

MYG-G-3R | 966 | 350 | 3.2 | A |

MYN-G-3RD | 649 | 530 | 3.2 | A |

GM SPIN-G-3FD | 882 | 395 | 3.2 | A |

MY8-G-AR | 880 | 400 | 3.5 | B |

GM2-G-AR | 880 | 400 | 3.5 | A |

From this inventory data, one has to fulfil customer orders based on the specific dimensions each customer demands. A typical customer request would be to supply glass sheets as per the following dimensions:

Length (MM) | Wide (MM) | Thk (MM) | CAT |

780 | 542 | 3.5 | A |

The firm may or may not have glass sheets of this specific size. The objective is to identify glass sheets, from the inventory on hand, which match customer specifications. If there is no exact match, then one must be able to obtain all inventory items which have the **same** Thk (MM) and CAT as the customer specified dimensions, but whose Length and Width are **greater than or equal to** the customer specified dimensions. The length and width can then be trimmed to match the exact customer dimensions. Furthermore, the result returned should:

- List only the Top 30 glass sheets available in inventory; and
- List those Top 30 glass sheets in ascending order of wastage (wastage caused when the glass sheet is trimmed to match the customer specified dimensions)

You may refer to my solution in this workbook. I have shared two solutions: one using Excel formulas and the other using Power Query (a.k.a. Get and Transform in Excel 2016). Please read the Comments in cells F1, J9 and J16 of the "Solutions" worksheet. The two solutions differ as follows:

- **Formula driven solution** - This is in range J10:AM14 of the Solutions worksheet. This is a semi-dynamic solution (as compared to the Power Query solution). To get the models in ascending order of wastage, one will have to create an Area column in the base data and sort that column in ascending order.
- **Power Query solution** - This is in range J17:AM21 of the Solutions worksheet. This is a dynamic solution. Just change the customer specified dimensions in range G2:J2 of the Data and Query worksheet, then right-click on any cell in the result range and select Refresh.
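The selection rule itself (same Thk and CAT, Length and Width at least as large as requested, ascending wastage, Top 30) can be sketched in a few lines of Python. This is an illustration only, not the workbook's formula or Power Query logic. It assumes that wastage is the sheet area minus the requested area, and that sheets are not rotated. `Sheet`, `inventory` and `candidate_sheets` are hypothetical names.

```python
from typing import NamedTuple


class Sheet(NamedTuple):
    model: str
    length: float
    width: float
    thk: float
    cat: str


# Inventory rows copied from the table above.
inventory = [
    Sheet("HX9-G-ARD", 1071, 273, 3.5, "A"),
    Sheet("MYP-G-3RD", 580, 535, 3.2, "B"),
    Sheet("EPO-G-3RD", 580, 535, 3.2, "A"),
    Sheet("MYG-G-3R", 966, 350, 3.2, "A"),
    Sheet("MYN-G-3RD", 649, 530, 3.2, "A"),
    Sheet("GM SPIN-G-3FD", 882, 395, 3.2, "A"),
    Sheet("MY8-G-AR", 880, 400, 3.5, "B"),
    Sheet("GM2-G-AR", 880, 400, 3.5, "A"),
]


def candidate_sheets(stock, length, width, thk, cat, top_n=30):
    """Return up to top_n (wastage, sheet) pairs, smallest wastage first.

    Assumption: wastage = sheet area - requested area, in square mm.
    """
    matches = [
        (s.length * s.width - length * width, s)
        for s in stock
        if s.thk == thk and s.cat == cat
        and s.length >= length and s.width >= width
    ]
    matches.sort(key=lambda pair: pair[0])
    return matches[:top_n]


# Hypothetical request (not from the post): 500 x 500 mm, 3.2 mm thick, CAT A.
for wastage, sheet in candidate_sheets(inventory, 500, 500, 3.2, "A"):
    print(sheet.model, wastage)
```

With only the eight sample rows shown above, the 780 x 542 x 3.5 request in CAT A returns no candidates, because neither of the 3.5 mm CAT A sheets is wide enough.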

Week | Team | User | Codes |

1009-1016 | Default-LossMit FAST | INTL\KOrdillano | ATPD/5 |

1009-1016 | Default-LossMit FAST | INTL\KOrdillano | ATWI/116 |

1009-1016 | Default-LossMit FAST | INTL\KOrdillano | ATWI/3B |

1009-1016 | Default-LossMit FAST | INTL\ADulnuan | ATWI/116 |

1009-1016 | Direct - HSD | INTL\JCustodioii | S/2 |

1009-1016 | Default-LossMit FAST | INTL\abacud | ATWI/116 |

1009-1016 | Default-LossMit FAST | INTL\SCaparon | ATWI/116 |

1009-1016 | Default-LossMit FAST | INTL\ADulnuan | ATWI/116 |

1009-1016 | Default-LossMit FAST | INTL\ADulnuan | ATWI/116 |

A simple Pivot Table (with a slicer) created from this dataset looks like this:

The objective is to determine the Top 3 users of each week for each slicer selection. Unfortunately, there is no way to sort multiple columns of a Pivot Table all at once. One may either sort by the Grand Total column or by the individual week-wise columns. Since we do not want to sort by the Grand Total column, the only way out is to sort the individual week-wise columns. The expected result should look like this:

I have solved this problem by using CUBE formulas. You may refer to my solution in this workbook.
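To make the goal concrete, the ranking can be sketched in Python. This is a rough illustration, not the CUBE formula approach used in the workbook. It assumes that the Pivot Table counts the rows (Codes) per user and week, and that the slicer filters on Team. Ties are broken alphabetically, which is also an assumption. The names `rows` and `top_users_per_week` are hypothetical.

```python
from collections import Counter, defaultdict

# Sample rows copied from the table above: (week, team, user, code).
rows = [
    ("1009-1016", "Default-LossMit FAST", r"INTL\KOrdillano", "ATPD/5"),
    ("1009-1016", "Default-LossMit FAST", r"INTL\KOrdillano", "ATWI/116"),
    ("1009-1016", "Default-LossMit FAST", r"INTL\KOrdillano", "ATWI/3B"),
    ("1009-1016", "Default-LossMit FAST", r"INTL\ADulnuan", "ATWI/116"),
    ("1009-1016", "Direct - HSD", r"INTL\JCustodioii", "S/2"),
    ("1009-1016", "Default-LossMit FAST", r"INTL\abacud", "ATWI/116"),
    ("1009-1016", "Default-LossMit FAST", r"INTL\SCaparon", "ATWI/116"),
    ("1009-1016", "Default-LossMit FAST", r"INTL\ADulnuan", "ATWI/116"),
    ("1009-1016", "Default-LossMit FAST", r"INTL\ADulnuan", "ATWI/116"),
]


def top_users_per_week(data, team=None, n=3):
    """Return {week: [(user, row count), ...]} with the n busiest users.

    team=None mimics a slicer with nothing selected (all teams).
    """
    counts = defaultdict(Counter)
    for week, row_team, user, _code in data:
        if team is None or row_team == team:
            counts[week][user] += 1
    return {
        week: sorted(c.items(), key=lambda kv: (-kv[1], kv[0].lower()))[:n]
        for week, c in counts.items()
    }


print(top_users_per_week(rows, team="Default-LossMit FAST"))
```

Each week is ranked independently, which is exactly what a single Pivot Table sort cannot do across several week columns at once.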

Project Name | Task1 | Task2 | Task3 | Task4 | Task5 | Task6 |

Project1 | Painting | Chef | Gardener | | | |

Project2 | Tiling | Digging | Engineering | | | |

Project3 | Mechanic | Engineering | | | | |

Here is a competency matrix showing the competencies of employees on different tasks. 1 indicates that the employee is competent to perform that task.

Task | Tom | Jane | Mary | Paddy | Lynda |

Painting | 1 | 1 | 1 | 1 | 1 |

Tiling | 1 | 1 | 1 | 1 | 1 |

Plastering | 1 | 1 | 1 | 1 | 1 |

Digging | 1 | 0 | 1 | 1 | 1 |

Mechanic | 1 | 1 | 1 | 0 | 1 |

Detective | 1 | 1 | 1 | 1 | 1 |

Engineering | 1 | 1 | 0 | 1 | 1 |

Boxer | 1 | 0 | 1 | 1 | 1 |

Chef | 1 | 1 | 1 | 1 | 1 |

Gardener | 1 | 1 | 0 | 1 | 1 |

Banker | 1 | 1 | 1 | 1 | 0 |

From these two tables, one may want to generate another table showing which employees can be assigned to which project (an employee should be assigned to a project only if they can perform all of its tasks). So the ideal solution is to create another column (8th column) in the Project matrix table above, with a drop down (Data > Data Validation) for every project showing which employees are competent for that project.

**Here's an illustration**:

Assuming that the Project matrix is in range A1:G4 (headers are in row 1)

- In cell H2 (for Project1), the drop down should show Jane, Lynda, Paddy and Tom. Mary should not appear there because she cannot perform one of the 3 tasks required to complete the project i.e. Gardener.
- In cell H3 (for Project2), the drop down should show Lynda, Paddy and Tom. Jane and Mary should not appear there because they cannot perform the Digging and Engineering tasks respectively.

The solution is dynamic for the following:

- Projects added to the Project matrix Table; and
- Tasks added (up to 6 only) or edited in the Project matrix Table; and
- Employees added to the Competency matrix Table; and
- Tasks added to the Competency matrix Table

I have solved this problem by using:

- Power Query; and
- Formulas in Data > Data Validation.
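The eligibility rule behind the drop down can be checked with a short Python sketch. This is not the Power Query and Data Validation solution itself. It simply tests, for each project, which employees have a 1 against every task of that project. The dictionaries below are copied from the two tables above, and `eligible` is a hypothetical helper name.

```python
# Project matrix: project -> list of required tasks.
projects = {
    "Project1": ["Painting", "Chef", "Gardener"],
    "Project2": ["Tiling", "Digging", "Engineering"],
    "Project3": ["Mechanic", "Engineering"],
}

# Competency matrix: task -> flags in the order of `employees`.
employees = ["Tom", "Jane", "Mary", "Paddy", "Lynda"]
competency = {
    "Painting":    [1, 1, 1, 1, 1],
    "Tiling":      [1, 1, 1, 1, 1],
    "Plastering":  [1, 1, 1, 1, 1],
    "Digging":     [1, 0, 1, 1, 1],
    "Mechanic":    [1, 1, 1, 0, 1],
    "Detective":   [1, 1, 1, 1, 1],
    "Engineering": [1, 1, 0, 1, 1],
    "Boxer":       [1, 0, 1, 1, 1],
    "Chef":        [1, 1, 1, 1, 1],
    "Gardener":    [1, 1, 0, 1, 1],
    "Banker":      [1, 1, 1, 1, 0],
}


def eligible(tasks):
    """Return, in alphabetical order, employees competent in every task."""
    return sorted(
        name
        for i, name in enumerate(employees)
        if all(competency[task][i] == 1 for task in tasks)
    )


for project, tasks in projects.items():
    print(project, eligible(tasks))
```

The output for Project1 and Project2 matches the illustration above: Mary drops out of Project1 because of Gardener, and Jane and Mary drop out of Project2 because of Digging and Engineering respectively.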

- Identify the last 10 numbers in that row, i.e. the first 10 numbers found when counting from the right-hand side
- Identify the largest 5 of those 10 numbers
- Sum those largest 5 numbers

Here are the steps:

- Suppose the numbers and blanks are in range A2:V2
- Type 10 in cell X1
- Enter this array formula (Ctrl+Shift+Enter) in cell X2:

=SUM(SMALL(IF((SUBTOTAL(2,OFFSET(V2,,,1,(COLUMN($A2:$V2)-COLUMN(W2))))<=X$1)*($A2:$V2)=0,FALSE,(SUBTOTAL(2,OFFSET(V2,,,1,(COLUMN($A2:$V2)-COLUMN(W2))))<=X$1)*($A2:$V2)),{1,2,3,4,5}))
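As a cross-check outside Excel, the three steps as worded in the list above (last 10 numbers, largest 5 of those, sum) can be sketched in Python. This is an illustration, not a translation of the array formula. Blanks are represented by `None`, and `sum_top_of_last` is a hypothetical name.

```python
import heapq


def sum_top_of_last(row, last=10, top=5):
    """Sum the `top` largest of the `last` numbers at the right end of row.

    Non-numeric entries (blanks, represented here as None) are skipped.
    """
    numbers = [
        v for v in row
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    return sum(heapq.nlargest(top, numbers[-last:]))


# Hypothetical row with blanks scattered through it.
row = [5, None, 3, 8, None, 1, 9, 2, None, 7, 4, 6, 10, None, 12, 11]
print(sum_top_of_last(row))
```

In this sample row the last 10 numbers are 8, 1, 9, 2, 7, 4, 6, 10, 12 and 11, so the five largest are 12, 11, 10, 9 and 8. Replacing `heapq.nlargest` with `heapq.nsmallest` sums the five smallest instead.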

After analysing the data, I have also visualised that data using PowerView. From the link shared above, you may download the workbook, watch the YouTube video and see a PowerBI desktop custom visual ("Sankey Diagram"). In this post, I have taken the same dataset and showcased/discussed the following:

1. How one can discover insights from this data with minimal effort using a Custom PowerBI desktop visual called "Sand Dance"; and

2. How one can query the dataset using "Natural Language" on a web browser (using www.powerbi.com); and

3. How one can query the dataset using "Natural Language" using Cortana (Microsoft's personal digital assistant in Windows 10).

For aspects 2 and 3 above, here are a few "Natural Language queries" which returned the correct result:

1. Show total revenue and growth in total revenue over previous month where order status is delivered by month in ascending order of month order as a Table

2. Show total revenue by category as a column chart

3. Show total revenue by order period as a pie chart in descending order of total revenue

4. Show total revenue by order period as a pie chart in descending order of total revenue where day of week is Sunday

5. Show Business generated from new categories by month where order period is mid day, payment type is COD sorted by month order in ascending order as a table

6. Show total revenue where portion of month is first half of month

Enough talking! You may view all three aspects mentioned above in this YouTube video.

You may download the PowerBI desktop workbook from here and play around with the Sand Dance visual yourself. The PowerBI.com service also allows one to publish reports to the web (which can be viewed and interacted with by anyone). This feature is currently in preview and may become a paid service later. You may view and interact with the Sand Dance visual here:

Text | Value |

A | 1 |

B | 2 |

C | 3 |

D | 4 |

E | 5 |

F | 6 |

G | 7 |

H | 8 |

I | 9 |

J | 0 |

The objective is to generate the numeric code for a text code of any length entered in a certain cell. For example, a user will type a certain text code, say ABEJ, and the expected result should be 1250. For JABF, the result should be 0126. The text entry and text length are both user determined.

With ABEJ typed in cell D2, enter this array formula in cell E2:

=TEXT(SUMPRODUCT((LOOKUP(MID(D2,ROW(INDIRECT("1:"&LEN(D2))),1),$A$2:$A$11,$B$2:$B$11))*((10^(LEN(D2)-1-(ROW(INDIRECT("1:"&LEN(D2)))-1))))),REPT("0",LEN(D2)))

This formula can now be copied down for generating the numeric code for all text codes entered in column D.
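The same letter-by-letter substitution can be sketched in Python, using the Text and Value table above as a lookup. This is an illustration rather than a translation of the formula. The result is kept as a string so that leading zeros survive, which is what the TEXT and REPT part of the formula achieves. `mapping` and `numeric_code` are hypothetical names.

```python
# Lookup copied from the Text / Value table above.
mapping = dict(zip("ABCDEFGHIJ", "1234567890"))


def numeric_code(text):
    """Replace every letter with its digit, keeping leading zeros.

    The input is upper-cased because Excel's LOOKUP is not case-sensitive.
    A letter outside A-J raises KeyError.
    """
    return "".join(mapping[ch] for ch in text.upper())


print(numeric_code("ABEJ"))  # 1250
print(numeric_code("JABF"))  # 0126
```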

The objective is to "Compute an average for each day of calendar year 2016. The average should be for the occurrence of that day in the previous 3 years". Here's an example:

1. January 1, 2016 was a Friday (the first Friday of 2016) and is in cell A1097

2. In cell B1097, the average should be computed as: Average of the "First Friday of each of the previous 3 years"

3. January 8, 2016 was a Friday (the second Friday of 2016) and is in cell A1104

4. In cell B1104, the average should be computed as: Average of the "Second Friday of each of the previous 3 years"

I have solved this problem with the help of PowerPivot. You may refer to my solution in this workbook.
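The date matching at the heart of this problem can be sketched in Python. This is not the PowerPivot solution from the workbook. It assumes that "the n-th Friday of the year" means the n-th Friday counted from January 1, and that a year with no such occurrence (for example, no 53rd Friday) is simply left out of the average. `nth_weekday`, `average_for` and the `values` dictionary are hypothetical.

```python
from datetime import date, timedelta


def nth_weekday(year, weekday, n):
    """Return the n-th occurrence of weekday (Mon=0) in year, or None."""
    first = date(year, 1, 1)
    offset = (weekday - first.weekday()) % 7
    d = first + timedelta(days=offset + 7 * (n - 1))
    return d if d.year == year else None


def average_for(day, values, years_back=3):
    """Average values on the matching weekday occurrence of earlier years.

    values maps date -> number. Years without a match are skipped.
    """
    n = (day.timetuple().tm_yday - 1) // 7 + 1  # occurrence number in the year
    matches = (
        nth_weekday(day.year - k, day.weekday(), n)
        for k in range(1, years_back + 1)
    )
    found = [values[d] for d in matches if d is not None and d in values]
    return sum(found) / len(found) if found else None


# Made-up values for the first Friday of 2013, 2014 and 2015.
values = {date(2013, 1, 4): 30, date(2014, 1, 3): 20, date(2015, 1, 2): 10}
print(average_for(date(2016, 1, 1), values))
```

January 1, 2016 is the first Friday of 2016, so the sketch averages the values for January 2, 2015, January 3, 2014 and January 4, 2013.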

City of Origin | City of destination | Mode of Transport | Passengers travelled |

New Delhi | Pune | Air | 123 |

New Delhi | Mumbai | Air | 213 |

New Delhi | Kolkata | Air | 125 |

Chandigarh | Jammu | Bus | 785 |

Chandigarh | Amritsar | Train | 567 |

Given this dataset, one may want answers to the following questions:

1. Of all those passengers who originated their journey (City of Origin) from Chandigarh, how many terminated their journey (City of destination) in New Delhi via different modes of transport; and

2. Of all those passengers who terminated their journey (City of destination) in Jammu, how many arrived from Amritsar (City of Origin) via different modes of transport; and

3. Of all those passengers who travelled by Bus, how many travelled from City A (City of Origin) to City X,Y,Z (City of destination)

While one can analyse (slice and dice) this data using Pivot Tables, one cannot visualize it very clearly, even after creating a Pivot chart. I have attempted to visualize this data using PowerBI desktop, a free Business Intelligence application from Microsoft which rolls all of Excel's BI tools (PowerPivot, Power Query, Power Map and Power View) into one.

You may download the source Excel workbook and the Power BI desktop workbook from this link.

You may also watch a short video here:

1. Order Date/Time

2. City to which orders were shipped

3. Order Number

4. Payment Type i.e. Cash on delivery, Net Banking, EMIs

5. Order Status i.e. Delivered or Cancelled

6. SKUs which the ordered items fall into

7. Products which the ordered SKUs fall into

8. Categories which the ordered products fall into

Given this simple tabular representation, one may want to analyse and visualize this dataset from multiple perspectives based on user selections, such as

"What was the **revenue earned** from the **Top 5** **products** in the **A100 category** in **April** for orders shipped to **New Delhi**?"

In the query framed above, the end user should have the leeway to select any/all of the highlighted facets. So one can choose either revenue earned or number of orders. Likewise, one can select Top 5 products, Top 15 products, Top 5 SKUs etc.

With relative ease, one should also be able to perform an affinity analysis showing which categories are ordered together (to study affiliations). Please review this post for an independent discussion on "Affinity Analysis".

Furthermore, one should be able to perform a free form timeline search such as - "I would like to study growth in Total revenue of March 2-8 2015 over Feb 1-4 2015"

You may download the workbook from the link shared above.

You may watch similar videos showcasing the capabilities of Business Intelligence in MS Excel:

1. Analyse Sales data of a Beverage Company

2. Analyse Training data of a Company

Here's a video showing the capabilities of this Sales data model

You may also watch this short video to see how I visualized the revenue flow from Categories to Shipping cities during different Order periods using Custom visuals available in PowerBI desktop.

Please feel free to download the PowerBI desktop workbook of the video shown above from here.

For a detailed overview of Sankey diagrams (a Custom visual available in PowerBI desktop), you may refer to my Blog article here.

Another great Custom visual (Sand Dance) which allows data discovery has been shown at this link. At that link, you will also be able to see how I queried the underlying dataset using "Natural Language".
