Compute transaction fee based on a tiered pricing model

{0 Comments}

Consider a simple dataset as shown below:
[Image: tiered pricing dataset]
Each tier's rate is applied incrementally, i.e. only to the portion of the volume that falls within that tier's volume range.  Given the following transaction volumes, one may want to compute the transaction fee:
[Image: transaction volumes]
The expected result is shown below:
[Image: expected result]

As one can observe, for a transaction value of 400,000, the fee has been computed as 3% on the first 100,000 and 2.5% on the next 300,000. You may download my solution workbook from here. In the file, I have shared two solutions - a conventional formula-based one and a PowerPivot solution.
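For readers who want to check the arithmetic outside Excel, here is a minimal Python sketch of the incremental calculation. It is not the workbook's formula or the PowerPivot measure. The tier table is an assumption: only the first two tiers (3% up to 100,000 and 2.5% up to 400,000) follow from the example above, and the third tier is made up for illustration.

```python
# Hypothetical tier table: (upper bound of the tier, rate).
# Only the first two tiers follow from the worked example; the last is invented.
TIERS = [
    (100_000, 0.03),
    (400_000, 0.025),
    (float("inf"), 0.02),
]


def tiered_fee(volume, tiers=TIERS):
    """Apply each tier's rate only to the slice of volume inside that tier."""
    fee = 0.0
    lower = 0
    for upper, rate in tiers:
        if volume <= lower:
            break  # nothing left above the previous tier's upper bound
        fee += (min(volume, upper) - lower) * rate
        lower = upper
    return fee


# 3% of the first 100,000 plus 2.5% of the next 300,000
fee = tiered_fee(400_000)
```

Keeping the tiers as a list of upper bounds means a new tier can be added without touching the function.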

I have also solved a similar question here.

Compute an average for the same day in the past 3 years

{4 Comments}

Assume a simple two column dataset with dates in column A and numbers in column B. The dates in column A are from January 1, 2013 to December 31, 2016 and numbers in column B are for the period January 1, 2013 to December 31, 2015 (there are no numbers for January 1, 2016 to December 31, 2016).

The objective is to "Compute an average for each day of calendar year 2016. The average should be for the occurrence of that day in the previous 3 years". Here's an example:

1. January 1, 2016 was a Friday (the first Friday of 2016) and is in cell A1097
2. In cell B1097, the average should be computed as: Average of the "First Friday of each of the previous 3 years"
3. January 8, 2016 was a Friday (the second Friday of 2016) and is in cell A1104
4. In cell B1104, the average should be computed as: Average of the "Second Friday of each of the previous 3 years"
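The matching rule in the example can be sketched in Python as follows. This is an illustration of the logic only, not the PowerPivot solution in the workbook. The function names and the dictionary of date-to-number values are assumptions made for the sketch.

```python
from datetime import date, timedelta


def weekday_occurrence(d):
    """Return (weekday, n): d is the n-th occurrence of that weekday in its year."""
    day_of_year = d.timetuple().tm_yday
    return d.weekday(), (day_of_year - 1) // 7 + 1


def nth_weekday_of_year(year, weekday, n):
    """Return the date of the n-th given weekday (Monday=0) of a year."""
    jan1 = date(year, 1, 1)
    offset = (weekday - jan1.weekday()) % 7
    return jan1 + timedelta(days=offset + 7 * (n - 1))


def same_day_average(values, target, years_back=3):
    """Average values on the matching n-th weekday of each previous year.

    values: dict mapping date -> number (hypothetical stand-in for columns A and B).
    """
    weekday, n = weekday_occurrence(target)
    samples = []
    for year in range(target.year - years_back, target.year):
        d = nth_weekday_of_year(year, weekday, n)
        # Skip a 53rd occurrence that spills into the next year, and missing data
        if d.year == year and d in values:
            samples.append(values[d])
    return sum(samples) / len(samples) if samples else None
```

For January 1, 2016 (the first Friday of 2016) the sketch looks up January 4, 2013, January 3, 2014 and January 2, 2015, which are the first Fridays of those years.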

I have solved this problem with the help of PowerPivot. You may refer to my solution in this workbook.

Identify buy and sell break points

{0 Comments}

Assume a two column dataset with Date in the first column and Price in the second one.  The purpose is to identify times to buy and sell - buying would be just after the lowest low is confirmed and selling would be before or just after the highest high is in place. Confirmation is achieved through crossover of moving averages. This data is being used for backtesting buy and sell criteria.

Snapshot of base data

Snapshot of expected result

The Lowest Low is the lowest price that occurs before the next Highest High.  The Highest High is the highest price that occurs before the next Lowest Low.  In the example, 2.77 is the lowest low after the highest high of 3.69 and 3.23 is the highest high after the lowest low of 2.77.
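One possible reading of these definitions is sketched below in Python. It is not the workbook's method. It assumes that a crossover of a short moving average over a long one separates "up" and "down" stretches, that the lowest low is the minimum price of a completed down stretch, and that the highest high is the maximum price of a completed up stretch. The window lengths (2 and 3) are arbitrary placeholders and `short` must be smaller than `long`.

```python
def moving_average(prices, window):
    """Trailing simple moving average; None until `window` prices exist."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def break_points(prices, short=2, long=3):
    """Return (label, index, price) for each extreme confirmed by a crossover."""
    fast = moving_average(prices, short)
    slow = moving_average(prices, long)
    points = []
    regime, start = None, None
    for i in range(len(prices)):
        if slow[i] is None:
            continue  # not enough history yet
        current = "up" if fast[i] > slow[i] else "down"
        if regime is None:
            regime, start = current, i
        elif current != regime:
            # Crossover: the stretch that just ended is now confirmed
            segment = prices[start:i]
            if regime == "down":
                value = min(segment)
                points.append(("lowest low", start + segment.index(value), value))
            else:
                value = max(segment)
                points.append(("highest high", start + segment.index(value), value))
            regime, start = current, i
    return points  # the final, unconfirmed stretch is deliberately left out
```

Because the extreme is reported only after the crossover, the signal always lags the actual low or high, which matches the idea of buying just after the lowest low is confirmed.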

You may refer to my solution in this workbook.

Compute year on year growth in a Pivot Table

{0 Comments}

Assume a three column dataset which has Year, Company ID and Cash flows.  For each Company, there are cash flows for multiple years.  So for Company ID A001, there are 7 rows, one each for 2004 to 2010, with cash flows appearing in a third column.  Let's assume the number of rows is 750,000.

The task is to compute the year on year growth rate in a fourth column.  While this problem can easily be solved by writing a formula in a fourth column, copying that formula all the way down to 750,000 rows will be time-consuming and processor intensive.
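The calculation itself is simple, as this Python sketch shows. It is an illustration rather than the PowerPivot formula used in the workbook. The row layout (year, company ID, cash flow) mirrors the dataset described above, and returning None when the prior year is missing or zero is a choice made for this sketch.

```python
from collections import defaultdict


def yoy_growth(rows):
    """Compute year on year growth per company.

    rows: iterable of (year, company_id, cash_flow).
    Returns {(company_id, year): growth}, with None where the prior year's
    cash flow is missing or zero.
    """
    by_company = defaultdict(dict)
    for year, company, cash in rows:
        by_company[company][year] = cash

    growth = {}
    for company, flows in by_company.items():
        for year, cash in flows.items():
            prev = flows.get(year - 1)
            growth[(company, year)] = (cash - prev) / prev if prev else None
    return growth
```

Looking up the prior year by key, rather than relying on row order, means the rows do not have to be sorted first. A negative prior-year cash flow will flip the sign of the result, so such cases may need separate handling.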

I have been able to solve this problem using PowerPivot.  You may download the workbook from here.

You may refer to related questions at these links:

1. Computing growth % inside a pivot table
2. Compute Pro rata growth rate within a Pivot Table

Sales data modelling and interactive visualisations

{30 Comments}

Visualise Sales Data of a Non-Alcoholic Beverage Company with basic columnar information such as Date of Sale, Time of Sale, Brand, Stock Keeping Unit (SKU), State, City, Quantity sold, Unit Price and Salesman Code.  In this sales dataset, each line item represents one visit for one SKU.  If nothing is sold in a certain visit, then the SKU column displays No Sale.  So effectively there is a line item for each visit whether or not something is sold in that visit.

From this simple Sales dataset, here are a few questions which one may need to find answers to:

1. How did the Company perform (in both years 2013 and 2014) on two of the most critical Key Performance Indicators (KPIs) - Quantity sold and Number of Visits?  Also, what is the month-wise break-up of these two KPIs?

2. Study and slice the two KPIs from various perspectives such as "Type of Outlet visited", "Type of Visit" - Scheduled or Unscheduled, "Day of week", "Brand", "Sub brand".

3. Over a period of time, how did various SKUs fare on the twin planks of "Effort" i.e. Number of visits YTD and "Business Generated" i.e. Quantity sold YTD?

4. Analyse the performance of the Company on both KPIs:
a. During Festive season/Promotional periods/Events; and
b. During different months of the same year; and
c. During same month of different years; and
d. Quarter to Date

5. "Complementary Product sold Analysis" - the kind of analysis displayed by online retailers such as Amazon.com - "Customers who bought this also bought this".  So in the Sales dataset referred to above, one may want to know "In this month, outlets which bought this SKU also bought this much quantity of these other SKUs."

6. "Outlet Rank slippage" - Which are the Top 10 Outlets in 2013 and what rank did they maintain in 2014?  What is the proportion of quantity sold by each of the Top 10 outlets of 2013 to:
a. Total quantity sold by all Top 10 outlets in 2013; and
b. Total quantity sold by all outlets in 2013

7. In any selected month, which new outlets did the Company forge partnerships with?

8. Which employees visited their assigned outlets once in two or three weeks instead of visiting them once every week (as required by Management)?

9. Which outlets were not visited at all in a particular month?

10. Business generated from loyal Customers - Loyal Customers are those who transacted with the Company in a chosen month and in the previous 2 months.
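As an illustration of question 10, the loyalty test can be expressed as a set intersection. The Python sketch below is not the Power Query/PowerPivot solution. It assumes each sale can be attributed to an outlet (a field the questions refer to but which is hypothetical here), treats the outlet as the customer, and counts a month as "transacted" only if the quantity sold is greater than zero.

```python
def previous_months(year, month, count):
    """Return the `count` (year, month) pairs immediately before the given month."""
    result = []
    for _ in range(count):
        month -= 1
        if month == 0:
            year, month = year - 1, 12
        result.append((year, month))
    return result


def loyal_outlets(sales, year, month, lookback=2):
    """Outlets that bought in the chosen month and in each of the previous months.

    sales: list of (year, month, outlet, quantity); "No Sale" visits carry quantity 0.
    """
    buyers = {}
    for y, m, outlet, qty in sales:
        if qty > 0:
            buyers.setdefault((y, m), set()).add(outlet)
    loyal = set(buyers.get((year, month), set()))
    for key in previous_months(year, month, lookback):
        loyal &= buyers.get(key, set())
    return loyal


def loyal_quantity(sales, year, month, lookback=2):
    """Quantity sold in the chosen month to loyal outlets only."""
    loyal = loyal_outlets(sales, year, month, lookback)
    return sum(q for y, m, o, q in sales if (y, m) == (year, month) and o in loyal)
```

The `previous_months` helper handles the year boundary, so January 2014 is compared with December 2013 and November 2013.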

These are only a few of my favourite questions which I needed answers to when I first reviewed this Sales Data.  Using Microsoft Excel's Business Intelligence Tools (Power Query, PowerPivot and Power View), I could answer all questions stated above and a lot more.

You may watch a short video of my solution here

Data slicing and analysis with the Power Pivot

{0 Comments}

Visualise an MS Excel file with two worksheets:

  1. Employee headcount – a multi column dataset with information such as Employee code, Date of Joining, Age, Division, Department and Location.  Each row represents data for one employee.  The number of rows on this worksheet is approximately 700.
  2. Training Data - a multi column dataset with information such as Employee code, Training Date from, Training Date to, Training Program Name, Training Program Category (Internal and External), Training Location and Training Service Provider.  Each row represents one training attended by one employee.  The number of rows on this worksheet is approximately 2,600.

Let’s suppose that the training calendar of this company runs from July to June.  Some questions (only a few are mentioned for illustration purposes) which a Training Manager may need answers to are:

1)   How many unique employees were trained each year; and
     a)   Of the unique employees trained, how many were first time trainees and how many were repeat trainees
          i)   Of the first time trainees:
               (1)    How many joined this year
               (2)    How many joined in past years
          ii)  Of the first time trainees:
               (1)    How many were trained within the first year of joining
               (2)    How many were trained in the second year of joining
               (3)    How many were trained in the third year of joining
               (4)    How many were trained after three years of joining
          iii) Of the repeat trainees:
               (1)    What is the average gap (in days) between trainings
               (2)    What is the minimum gap (in days) between trainings
               (3)    What is the maximum gap (in days) between trainings
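To make the first-time versus repeat distinction concrete, here is a small Python sketch. It is not the Power Pivot model from the workbooks and rests on several assumptions: a training year is labelled by the calendar year in which it starts (July), an employee is a first time trainee in the training year of his or her earliest training and a repeat trainee in any later training year, and the gap between trainings is measured from one start date to the next.

```python
from datetime import date


def training_year(d):
    """Label a July-to-June training year by the calendar year in which it starts."""
    return d.year if d.month >= 7 else d.year - 1


def trainee_breakdown(trainings):
    """Count unique, first time and repeat trainees per training year.

    trainings: list of (employee_code, training_start_date).
    """
    years_by_employee = {}
    for code, start in trainings:
        years_by_employee.setdefault(code, set()).add(training_year(start))

    summary = {}
    for years in years_by_employee.values():
        first = min(years)
        for y in years:
            row = summary.setdefault(y, {"unique": 0, "first_time": 0, "repeat": 0})
            row["unique"] += 1
            row["first_time" if y == first else "repeat"] += 1
    return summary


def training_gaps(trainings):
    """Days between consecutive training start dates, pooled over all employees."""
    dates = {}
    for code, start in trainings:
        dates.setdefault(code, []).append(start)
    gaps = []
    for ds in dates.values():
        ds.sort()
        gaps.extend((b - a).days for a, b in zip(ds, ds[1:]))
    return gaps
```

The average, minimum and maximum gaps then follow from `sum(gaps) / len(gaps)`, `min(gaps)` and `max(gaps)` on a non-empty list.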

Getting answers to the questions mentioned above would entail writing a lot of lookup related formulas, applying filters, copying and pasting and then creating Pivot Tables.  While the example taken above is that of a training database, you may envision “drilling down to and slicing” any dataset – Marketing, Sales, Purchase etc.

You may watch a short video of my solution here

In these two workbooks, you will be able to see the level to which one can drill down and analyse data using the Power Pivot add-in.  When you open either workbook, please go to the first worksheet and make the relevant choice of MS Excel version first so that you start looking at the Analysis from the correct worksheet.

1. Analysing Training data of a company; and
2. Analysing Sales data of a company

You will be able to see the analysis in these workbooks only if you are using one of the following versions of MS Office:

1. Excel 2013 Professional Plus; or
2. Excel 2010 with the Power Pivot add-in installed.  Power Pivot is a free add-in from Microsoft which can be downloaded from here.

Lastly, if you are using the Power Pivot add-in in Excel 2010, you will not be able to see the underlying Data Model or the calculated Field formulas because this workbook has been created in Excel 2013 Professional Plus and unfortunately the Power Pivot model is not backward compatible.  However, all the analysis performed in this workbook can be performed in Excel 2010 as well (with the Power Pivot add-in installed).

Story telling with Excel Power BI

{6 Comments}

With Power Business Intelligence (BI) tools of Excel 2013, one can metamorphose raw data and/or results of complex calculations into stunning and interactive visualizations.  Power View (one of the four components of Power BI) allows one to create a PowerPoint-like flow in Excel, thus allowing one to weave a story.  To be able to interact with/create visualizations, you will need to install Microsoft Office Professional Plus 2013 (this version will already have two of the four components of Power BI - PowerPivot and Power View).  Additionally, you will have to install the following add-ins from Microsoft (the other two components of Power BI):

1. Power Query; and
2. Power Map

I have tried to showcase the prowess of Power BI tools of Excel 2013 in these two workbooks:

1. An overview of the BRIC Economies
2. Sales data analysis

You may watch a video of my work at this link

Compute configuration count using Set Theory and Venn Diagrams

{0 Comments}

Assume a table which lists attendees for a Company's Annual day function.  In this Table, data for every attendee is shown on a separate row so if an employee attends the function with his/her spouse and three children, then there will be 5 rows for that employee.

The question is to determine the count of each of the following family configurations:

1. Employees only (those who attended without spouse and children); and
2. Employees, spouse and children (Family); and
3. Employees and spouse (no children); and
4. Employees and children (no spouse)
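The four configurations are the regions of a two-set Venn diagram (has spouse, has children). The Python sketch below illustrates that classification and is not the formula or PowerPivot solution from the workbook. The row layout (employee ID, relation) and the relation labels "Employee", "Spouse" and "Child" are assumptions.

```python
from collections import Counter


def family_configurations(rows):
    """Count employees by which family members attended with them.

    rows: iterable of (employee_id, relation), one row per attendee.
    """
    relations = {}
    for employee_id, relation in rows:
        relations.setdefault(employee_id, set()).add(relation)

    counts = Counter()
    for present in relations.values():
        has_spouse = "Spouse" in present
        has_child = "Child" in present
        if has_spouse and has_child:
            counts["Employee, spouse and children"] += 1
        elif has_spouse:
            counts["Employee and spouse"] += 1
        elif has_child:
            counts["Employee and children"] += 1
        else:
            counts["Employee only"] += 1
    return counts
```

Collecting relations into a set per employee means that three child rows count once, so the number of children does not affect the configuration.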

You may refer to my solution in this workbook.  I have solved this problem using:

1. MS Excel Formulas based on Set Theory and Venn Diagram; and
2. PowerPivot

Analysing customer walkin data by date and service taken

{5 Comments}

Assume data is in range A3:E10.  In A4:A10, various service types are mentioned.  In B3:E3, dates are mentioned from June 1, 2012 to June 4, 2012.  In range B4:E10 are numbers representing number of customers.  One may want to answer the following questions from this data:

1. For every date, total number of customer walkins and total number of services taken; and
2. For every date, new customer walkins and new services taken; and
3. For every date, repeat customer walkins and repeat services taken

While the first and third questions are fairly straightforward to solve, some deliberation would be required for the second question.  A new service type taken on June 3, 2012 would be one that has not been taken by any customer from June 1 - 2, 2012.  So if cell A8 has Service type E and cell D8 (data for June 3, 2012) has 3 (3 customers took service type E on June 3, 2012), then this service should be counted only if there is no figure in range B8:C8 i.e. no customer took this service on June 1 - June 2, 2012.
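The "new service" rule can be sketched in Python as follows. This is an illustration, not the DAX or array formula from the workbook. It assumes the grid is held as a dictionary of service type to per-date customer counts (in date order), that blank cells are 0 or None, and that "new customer walk-ins" means the customers who took a service that is new on that date.

```python
def walkin_summary(table):
    """Summarise total, new and repeat customers and services for each date.

    table: {service_type: [customers per date, in date order]}.
    """
    n_dates = len(next(iter(table.values())))
    summary = []
    for d in range(n_dates):
        row = {"total_customers": 0, "total_services": 0,
               "new_customers": 0, "new_services": 0}
        for counts in table.values():
            today = counts[d] or 0
            if today == 0:
                continue  # service not taken on this date
            row["total_customers"] += today
            row["total_services"] += 1
            if not any(counts[:d]):
                # No customer took this service on any earlier date
                row["new_customers"] += today
                row["new_services"] += 1
        row["repeat_customers"] = row["total_customers"] - row["new_customers"]
        row["repeat_services"] = row["total_services"] - row["new_services"]
        summary.append(row)
    return summary
```

With Service type E holding `[0, 0, 3, 1]` for June 1 to June 4, the service is counted as new on June 3 and as a repeat on June 4.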

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem:

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in and the PowerPivot add-in, then a few simple steps and minimal DAX formulas can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).

Solution for all versions of MS Excel

While this solution works for all versions of MS Excel, it uses an array formula (Ctrl+Shift+Enter).  Array formulas, if used extensively in the workbook, adversely affect the system's performance.

You may refer to my solution in this workbook.

Remove duplicates after satisfying additional conditions

{3 Comments}

Assume a two column database of patient IDs and service availed.  One patient may avail the same service multiple times in a year due to which that record may appear as many times as the service is availed.  For example, if patient A001 avails the Radiology service twice, then A001 and Radiology will appear in two rows.

One may want to create the following two reports from this database:

1. A list containing all those records where the patient availed just one service; and
2. A list containing all those records where the patient availed more than one service
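One way to read the two reports is sketched below in Python. It is not the calculated column or array formula from the workbook. It assumes that "just one service" means one distinct service type per patient (however many times it was availed) and that each report should list every patient-service pair only once.

```python
def split_by_service_count(records):
    """Split de-duplicated records by the number of distinct services per patient.

    records: iterable of (patient_id, service).
    Returns (single, multiple): records of patients with exactly one distinct
    service, and records of patients with more than one.
    """
    records = list(records)
    services = {}
    for patient, service in records:
        services.setdefault(patient, set()).add(service)

    # dict.fromkeys drops duplicate rows while keeping first-seen order
    unique_records = list(dict.fromkeys(records))
    single = [r for r in unique_records if len(services[r[0]]) == 1]
    multiple = [r for r in unique_records if len(services[r[0]]) > 1]
    return single, multiple
```

Under this reading, patient A001 with two Radiology rows appears once in the first report, because only one distinct service was availed.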

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem:

Solution for MS Excel 2010 and higher versions

If you are using the PowerPivot add-in, then a calculated column formula can resolve this problem.

Solution for all versions of MS Excel

I have shared two solutions here:

1. Array formula and advanced filters; and
2. Only array formulas

For better understanding of the question and to view the final solution, please refer to this workbook.