# Tags: MAXX

Assume a three column dataset with Patient ID, Smoking Status and Review Date

 PatientID SmokingStatus ReviewDate P1 10-03-2018 P1 9 09-03-2018 P1 1 08-03-2018 P1 4 07-03-2018 P2 9 10-03-2018 P2 9 09-03-2018 P2 9 08-03-2018 P2 9 07-03-2018 P3 2 10-03-2018 P3 09-03-2018 P3 9 08-03-2018 P4 9 10-03-2018 P4 1 09-03-2018 P4 4 08-03-2018

The objective is the create another 3 column dataset with the following conditions:

1. If the patient's latest smoking status is other than Blank or 9, then consider that as the smoking status of the patient; and
2. If the patient's latest smoking status is blank or 9, then consider the previous smoking status that is not blank or 9; and
3. If the patient's smoking status is blank or 9 on all dates, then consider the smoking status as 9

The expected result is:

 PatientID Last date when the smoking status was other than 9 or Blank Smoking status on that date P1 08-Mar-18 1 P2 10-Mar-18 9 P3 10-Mar-18 2 P4 09-Mar-18 1

I have solved this question using 3 methods - PowerPivot, Advanced Filters and formulas.  You may download my solution workbook from here.

Here's a simple dataset showing the Date of sale, Customer Name and Sales amount.

 Date Customer Name Sales amount 12-03-2017 A 1 12-03-2017 A 2 12-03-2017 A 3 12-03-2017 B 4 12-03-2017 B 5 12-03-2017 B 6 12-03-2017 B 7 12-03-2017 B 8 13-03-2017 A 1 13-03-2017 A 1

The objective is to determine Customer wise:

1. Highest revenue; and
2. Date on which that highest revenue was earned

The expected result is

 Row Labels Highest revenue Date of highest revenue A 6 12-Mar-2017 B 30 12-Mar-2017 Grand Total 36

So, for A, revenue earned on March 12, 2017 is 6 and on March 13, 2017 is 2.  Since the higher of the two is 6, that is the result which should appear in the Pivot Table.

I have solved this problem with the PowerPivot.  You may download my solution workbook from here.

Here's a simple two column dataset

 Comment Identifier Intervals A 3pm-6pm A 9pm-12pm S 3pm-6pm S 3pm-6pm S 9pm-12pm A 9pm-12pm S 9pm-12pm D 3pm-6pm A 9pm-12pm A 9pm-12pm A 9pm-12pm A 3pm-6pm A 3pm-6pm

For identifiers listed in column A, there are time intervals in column B. Note that for a certain identifier, a time interval can appear multiple times. The objective is two-fold:

1. For each identifier, show the time interval which appears most frequently; and
2. For each identifier, compute the count of the time interval which appears most frequently

The expected result is:

As you can observe, the result for S is two time periods - 3pm-6pm and 9pm-12pm.  This is because each of them appears twice.

Assume a simple three column dataset showing hours worked by different machine on different dates.  So column A is Date, column B is Machine Name and column C is hours worked.  There are duplicates appearing in column A and B .  Blanks in column C depict machine idle time.

The task is to create a simple three column dataset showing all unique Machine names in the first column, Last day on which the machine worked in the second column and hours worked on the last day in the third column.

This problem can be solved by using formulas (Refer first worksheet of the workbook) but if one has to use a Pivot Table, then there would be a few problems.

1. The Grand Total for the Date Field should be blank because on cannot determine the Last day on which the machine worked across different machine types.  A conventional Pivot Table shows the Maximum of all dates appearing in the Date Field.

2. The Grand Total for the Hours worked Field should be a summation of the total hours worked on last day across all machine types.  A conventional Pivot Table shows the Maximum of all hours worked appearing in the Hours worked Field.

3. The biggest problem of them all is that there is no way to give a criteria as the Last day for that machine for computing another Field in the Pivot Table.  Please refer the file for a better understanding.

This problem can be solved using the PowerPivot.  You may refer to my solution in this workbook.

Visualise Sales Data of a Non-Alcoholic Beverage Company with basic columnar information such as Date of Sale, Time of Sale, Brand, Stock Keeping Unit (SKU), State, City, Quantity sold, Unit Price and Salesman Code.  In this sales dataset, each line item represents one visit for one SKU.  If nothing is sold in a certain visit, then the SKU column displays No Sale.  So effectively there is a line item for each visit whether or not something is sold in that visit.

From this simple Sales dataset, here are a few questions which one may need to find answers to:

1. How did the Company perform (in both years 2013 and 2014) on two of the most critical Key Performance Indicators (KPI's) - Quantity sold and Number of Visits.  Also, what is the month wise break up of these two KPI's.

2. Study and slice the two KPI's from various perspectives such as "Type of Outlet visited", "Type of Visit" - Scheduled or Unscheduled, "Day of week", "Brand", "Sub brand".

3. Over a period of time, how did various SKU's fair on the twin planks of "Effort" i.e. Number of visits YTD and "Business Generated" i.e. Quantity sold YTD.

4. Analyse the performance of the Company on both KPI's:
a. During Festive season/Promotional periods/Events; and
b. During different months of the same year; and
c. During same month of different years; and
d. Quarter to Date

5. "Complimentary Product sold Analysis" - Analysis displayed on online retailers such as Amazon.com - "Customers who bought this also bought this".  So in the Sales dataset referred to above, one may want to know "In this month, outlets which bought this SKU, also bought this much quantity of these other SKU's."

6. "Outlet Rank slippage" - Which are the Top 10 Outlets in 2013 and what rank did they maintain in 2014.  What is the proportion of quantity sold by each of the Top 10 outlets of 2013 to:
a. Total quantity sold by all Top 10 outlets in 2013; and
b. Total quantity sold by all outlets in 2013

7. In any selected month, which new outlets did the Company forge partnerships with

8. Which employees visited their assigned outlets once in two or three weeks instead of visiting them once every week (as required by Management).

9. Which outlets were not visited at all in a particular month

10. Business generated from loyal Customers - Loyal Customers are those who transacted with the Company in a chosen month and in the previous 2 months.

These are only a few of my favourite questions which I needed answers to when I first reviewed this Sales Data.  Using Microsoft Excel's Business Intelligence Tools (Power Query, PowerPivot and Power View), I could answer all questions stated above and a lot more.

You may watch a short video of my solution here

Visualise an MS Excel file with two worksheets:

1. Employee headcount – a multi column dataset with information such as Employee code, Date of Joining, Age, Division, Department and Location.  Each row represents data for one employee.  The number of rows on this worksheet is approximately 700.
2. Training Data - a multi column dataset with information such as Employee code, Training Date from, Training Date to, Training Program Name, Training Program Category (Internal and External), Training Location and Training Service Provider.  Each row represents one training attended by one employee.  The number of rows on this worksheet is approximately 2,600.

Let’s suppose that the training calendar of this company runs from July to June.  Some questions (only few mentioned for illustration purposes) which a Training Manager may need answers to are:

1)   How may unique employees were trained each year; and
a)   Of the unique employees trained, how many were first time trainees and how many were repeat trainees
i)   Of the first time trainees:
(1)    How many joined this year
(2)    How many joined in past years
ii)  Of the first time trainees:
(1)    How many were trained within the first year of joining
(2)    How many were trained in the second year of joining
(3)    How many were trained in the third year of joining
(4)    How many were trained after three years of joining
iii)  Of the repeat trainees:
(1)    What is the average gap (in days) between trainings
(2)    What is the minimum gap (in days) between trainings
(3)    What is the maximum gap (in days) between trainings

Getting answers to the questions mentioned above would entail writing a lot of lookup related formulas, applying filters, copying and pasting and then creating Pivot Tables.  While the example taken above is that of a training database, you may envision “drilling down to and slicing” any dataset – Marketing, Sales, Purchase etc.

You may watch a short video of my solution here

In these two workbooks, you will be able to see the level to which one can drill down and analyse data using the Power Pivot add-in.  When you open this workbook, please go the first worksheet and make the relevant choice of MS Excel version first so that you start looking at the Analysis from the correct worksheet.

You will be able to see the analysis in these workbooks only if you are using one of the following versions of MS Office:

1. Excel 2013 Professional Plus; or
2. Excel 2010 with the Power Pivot add-in installed.  Power Pivot is a free add-in from Microsoft which can be downloaded from here.

Lastly, if you are using the Power Pivot add-in in Excel 2010, you will not be able to see the underlying Data Model or the calculated Field formulas because this workbook has been created in Excel 2013 Professional Plus and unfortunately the Power Pivot model is not backward compatible.  However, all the analysis performed in this workbook can be performed in Excel 2010 as well (with the Power Pivot add-in installed).