Author: Ashish Mathur

Rearrange a multi-heading dataset into a single-heading one that is Pivot Table ready


Any well-arranged dataset should be "Pivot Table" ready, which means it has the following 3 important properties:

  1. There should be no merged and centered cells; and
  2. Every column should have a unique heading; and
  3. Every column should have only 1 heading

Here's one dataset which violates all rules mentioned above.

  1. Headings in row 1 are merged; and
  2. The headings in row 2 are not unique; and
  3. Every column has headings in both row 1 and row 2.

To be Pivot Table friendly, this dataset will have to be restructured into a 4 column one - Week, Equipment, Jobsite and Widgets as shown below:

I have achieved the desired result by using Data > Get & Transform (also known as Power Query in earlier versions of MS Excel).  The solution is dynamic for new rows and columns added to the data on the Input worksheet - one simply has to go to Data > Refresh All.  You may download my solution workbook from here.
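For readers who want to see the reshaping logic outside Excel, here is a minimal Python sketch. It is not the Power Query solution from the workbook, only an illustration of the same idea: fill the merged headings across, then unpivot. The sample sheet is hypothetical, and it assumes that row 1 holds the Equipment headings and row 2 the Jobsite headings.

```python
def flatten_two_header_rows(rows):
    """Unpivot a sheet with two heading rows into (Week, Equipment, Jobsite, Widgets) records."""
    top, sub, *body = rows
    # A merged heading stores its value only in the first column it spans,
    # so carry the last seen value across the blank cells.
    filled, last = [], None
    for cell in top:
        last = cell if cell not in (None, "") else last
        filled.append(last)
    records = []
    for row in body:
        week = row[0]
        for col in range(1, len(row)):
            if row[col] in (None, ""):
                continue  # nothing recorded for this combination
            records.append((week, filled[col], sub[col], row[col]))
    return records


# Hypothetical input: equipment names and site names are made up for illustration.
sheet = [
    ["Week", "Excavator", "", "Crane", ""],
    ["", "Site A", "Site B", "Site A", "Site B"],
    ["W1", 10, 20, 30, 40],
    ["W2", 5, "", 15, 25],
]

records = flatten_two_header_rows(sheet)
for record in records:
    print(record)
```

Each output tuple corresponds to one row of the 4 column layout, so the result can be fed straight into a Pivot Table style summary.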

Rearrange travel data to clearly show travel from and travel to locations


Consider a 2 column dataset as shown below

User Location
AAA Tokyo
AAA Osaka
AAA Nagoya
AAA Hakone
AAA Kyoto
BBB Sapporo
BBB Nara
CCC Tokyo
CCC Hakone
CCC Osaka
DDD Osaka
DDD Tokyo

Customer AAA travelled from Tokyo to Osaka, Osaka to Nagoya, Nagoya to Hakone and Hakone to Kyoto.  All locations appear in a single column.  To analyse customer travel information very clearly, one may want to rearrange the dataset as follows:

User From To
AAA TOKYO OSAKA
AAA OSAKA NAGOYA
AAA NAGOYA HAKONE
AAA HAKONE KYOTO
BBB SAPPORO NARA
CCC TOKYO HAKONE
CCC HAKONE OSAKA
DDD OSAKA TOKYO

I have solved this problem using Power Query. You may download my solution workbook from here.
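The pairing logic itself is simple, and a short Python sketch may help to see it. This is not the Power Query solution from the workbook; it assumes that each user's rows are contiguous and already listed in travel order, as in the sample above.

```python
from itertools import groupby

rows = [
    ("AAA", "Tokyo"), ("AAA", "Osaka"), ("AAA", "Nagoya"), ("AAA", "Hakone"),
    ("AAA", "Kyoto"), ("BBB", "Sapporo"), ("BBB", "Nara"), ("CCC", "Tokyo"),
    ("CCC", "Hakone"), ("CCC", "Osaka"), ("DDD", "Osaka"), ("DDD", "Tokyo"),
]


def travel_legs(rows):
    """Turn (User, Location) rows into (User, From, To) legs."""
    legs = []
    # groupby only groups consecutive rows, which is why the rows of one
    # user are assumed to sit together in the input.
    for user, group in groupby(rows, key=lambda r: r[0]):
        stops = [location.upper() for _, location in group]
        # Pair every stop with the one that follows it.
        legs.extend((user, start, end) for start, end in zip(stops, stops[1:]))
    return legs


for leg in travel_legs(rows):
    print(*leg)
```

A user with a single location produces no legs at all, since there is nothing to pair it with.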

Identify Customers that Organisations can upsell or cross sell their products to


Here's a simple Sales dataset of a retail Store which sells Apple Products.  Since a customer can transact multiple times, there will be repetitions in the Cust ID column.  While Cust ID 123 and 782 purchased multiple products from the same Store in one transaction, Cust ID 53 purchased multiple products from different stores (Store ID 165 and 45) [and therefore the Order IDs are also different (Order ID 2 and 6)].

On this small sample, one may want to identify "Cross and up selling opportunities" i.e. one may want to know which Customers can be approached for selling more products to.  For example, one may want to know which Customers have bought only one product so far.  A case in point is the Apple Watch - Customer ID 2442 and 428 bought only this product.  The other Customers who bought the Apple Watch also bought at least one more product.  Therefore, Cust ID 2442 and 428 could be approached for buying other products as well.

Solving this via conventional Excel formulas and Pivot Tables would prove to be a formidable challenge.  I have solved this problem using a Data visualisation software from Microsoft called PowerBI desktop (it can be downloaded free from the Microsoft website).  This problem can also be solved in MS Excel using Power Query and Power Pivot.

There are 3 sections in the image below - Table at the top (First Table), slicers at the right and another Table at the bottom (Second Table).

Interpretation of First Table

  1. 1 appearing at the intersection of APPLE TV (row labels) and APPLE TV (column labels) represents that there is 1 customer who bought the APPLE TV
    1. 1 appearing at the intersection of APPLE TV (row labels) and MACBOOK AIR (column labels) represents that the 1 customer who bought the APPLE TV also bought the MACBOOK AIR
  2. 4 appearing at the intersection of APPLE WATCH (row labels) and APPLE WATCH (column labels) represents that there are 4 customers who bought the APPLE WATCH
    1. 1 appearing in other columns of the same row represents other products which those customers bought
    2. When one right-clicks on APPLE WATCH and selects "Drill down", one will be able to see the Customers who bought the other products as well.
      1. Customer 53 bought the APPLE WATCH, AIRPORT and IPHONE 8S.  Customer 123 bought APPLE WATCH, IPHONE X and IPOD
      2. Customers 2442 and 428 did not buy any other product

Interpretation of Second Table

This table shows a list of Customers (and their transaction details) who bought only the one product selected by the user in the filter section (see the red oval selection in the image).  So these two customers could be approached for selling more products to.

You may download my PBI desktop file from here.
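The counting behind both tables can be sketched in a few lines of Python. This is not the Power BI model from the file, only an illustration of the logic. The purchase list below is a partial sample rebuilt from the Apple Watch customers described above, not the full dataset.

```python
from collections import defaultdict

# Partial sample: only the four Apple Watch customers mentioned in the text.
purchases = [
    (53, "APPLE WATCH"), (53, "AIRPORT"), (53, "IPHONE 8S"),
    (123, "APPLE WATCH"), (123, "IPHONE X"), (123, "IPOD"),
    (2442, "APPLE WATCH"),
    (428, "APPLE WATCH"),
]

# Collect the distinct products bought by each customer.
products_by_customer = defaultdict(set)
for cust_id, product in purchases:
    products_by_customer[cust_id].add(product)


def co_purchase_count(row_product, col_product):
    """First Table: customers who bought both products (the diagonal counts buyers of one product)."""
    return sum(
        1
        for items in products_by_customer.values()
        if row_product in items and col_product in items
    )


def single_product_customers(product):
    """Second Table: customers who bought this product and nothing else."""
    return sorted(c for c, items in products_by_customer.items() if items == {product})


print(co_purchase_count("APPLE WATCH", "APPLE WATCH"))
print(co_purchase_count("APPLE WATCH", "IPOD"))
print(single_product_customers("APPLE WATCH"))
```

Using a set per customer means that repeat purchases of the same product across several orders are counted once.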

Determine the lowest bidding vendor(s) for each product in a Pivot Table


Imagine a dataset like this.  This dataset shows vendors that submitted proposals for supplying various parts to a Company.  There is one column for each of the twelve months.


Via a simple Pivot Table, one can determine the lowest bid per product (part) for any chosen month.  However, one may also want to know the names of the vendors who submitted that bid for each product (as seen in column G below).  Notice that Vendor 2 and Vendor 3 both submitted the lowest bid for Product 1 and therefore both names should appear in the result.


I have solved this problem using PowerPivot and Power Query a.k.a. Data > Get & Transform in Excel 2016.  You may download my solution workbook from here.
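The part that a plain Pivot Table cannot do is listing every vendor tied at the minimum. A rough Python sketch of that step follows. It is not the PowerPivot measure from the workbook, and the bid amounts are hypothetical; only the tie between Vendor 2 and Vendor 3 on Product 1 comes from the example above.

```python
# Hypothetical bids for one chosen month: (product, vendor, bid amount).
bids = [
    ("Product 1", "Vendor 1", 120),
    ("Product 1", "Vendor 2", 100),
    ("Product 1", "Vendor 3", 100),
    ("Product 2", "Vendor 1", 80),
    ("Product 2", "Vendor 2", 95),
]


def lowest_bidders(bids):
    """Return {product: (lowest bid, comma separated vendor names)}."""
    lowest = {}
    for product, vendor, amount in bids:
        best_amount, vendors = lowest.get(product, (None, []))
        if best_amount is None or amount < best_amount:
            lowest[product] = (amount, [vendor])  # new minimum, restart the list
        elif amount == best_amount:
            vendors.append(vendor)  # tie: keep every vendor at the minimum
    return {p: (amount, ", ".join(names)) for p, (amount, names) in lowest.items()}


for product, (amount, names) in lowest_bidders(bids).items():
    print(product, amount, names)
```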

Sort comma-separated entries appearing in a cell in ascending order


Assume a single column dataset as shown below.  As you can observe, there are multiple entries appearing in the same cell separated by commas.


The objective is to sort, in ascending order, the entries in each cell.  The expected result is shown below.


I have solved this problem using Power Query a.k.a Data > Get & Transform in Excel 2016.  You may download my solution workbook from here.
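The split, sort and rejoin idea can be sketched in Python as follows. This is only an illustration, not the Power Query steps from the workbook. It assumes the entries are text, so it sorts them alphabetically and ignores case; numeric entries would need a numeric sort key instead.

```python
def sort_entries(cell):
    """Sort the comma separated entries of one cell in ascending order."""
    # Split on commas, trim stray spaces and drop empty pieces.
    entries = [entry.strip() for entry in cell.split(",") if entry.strip()]
    # casefold makes the comparison case-insensitive (an assumption about the desired order).
    return ", ".join(sorted(entries, key=str.casefold))


# Hypothetical cell contents.
print(sort_entries("pear, apple,Banana"))
```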

Search for multiple phrases within a cell and extract all those phrases in another column


Assume a simple two column dataset as shown below.  In column A are some text strings.  In each cell are multiple phrases separated by commas.  In column C are some phrases - one in each cell.  The objective is to search for all phrases in each cell and extract the phrases found in another column of the same row.


The expected result is:


I have solved this problem using Power Query a.k.a. Data > Get & Transform in Excel 2016.  You may download my solution workbook from here.
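A small Python sketch of the lookup is given below. It is not the Power Query solution from the workbook. It assumes that a phrase counts as found only when it equals one whole comma separated entry of the cell (ignoring case); the sample strings are hypothetical.

```python
def find_phrases(text, phrases):
    """Return the phrases that appear as whole comma separated entries in text."""
    # Normalise the entries of the cell once, then test each phrase against them.
    entries = {entry.strip().casefold() for entry in text.split(",")}
    return ", ".join(p for p in phrases if p.strip().casefold() in entries)


# Hypothetical cell from column A and phrase list from column C.
phrases = ["kiwi", "red apple", "mango"]
print(find_phrases("Red Apple, green pear, kiwi", phrases))
```

If partial matches should also count, the membership test could be replaced by a substring test against the whole cell text.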

Show sales only for corresponding months in prior years


Refer to this simple Sales dataset


The objective is to create a simple matrix with months in the row labels, years in the column labels and sales figures in the value area section.  The twist in the question is that for years prior to the current year (2018 in this dataset), sales should only appear up to the last month for which there is data in the current year.  For example, data for 2018 is only available till Month 4 and therefore data for prior years should also appear only till Month 4.  As and when Sales data gets added below row 17, data for prior years should also extend up to that month.

The expected result is


You may download my PBI file from here. The same solution can be obtained in Excel as well (using Power Query and PowerPivot).
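The cut-off logic can be sketched in Python like this. It is not the measure used in the PBI file, and the sales figures are hypothetical. It assumes the current year is simply the latest year present in the data.

```python
from collections import defaultdict

# Hypothetical sales rows: (year, month, amount).
sales = [
    (2017, 1, 100), (2017, 4, 150), (2017, 5, 200), (2017, 12, 300),
    (2018, 1, 110), (2018, 4, 160),
]


def sales_matrix(sales):
    """Return {month: {year: total}} limited to months that have current-year data."""
    current_year = max(year for year, _, _ in sales)
    # Last month with data in the current year; prior years stop here too.
    cutoff = max(month for year, month, _ in sales if year == current_year)
    matrix = defaultdict(dict)
    for year, month, amount in sales:
        if month <= cutoff:
            matrix[month][year] = matrix[month].get(year, 0) + amount
    return dict(matrix)


for month, by_year in sorted(sales_matrix(sales).items()):
    print(month, by_year)
```

Because the cut-off is recomputed from the data each time, adding a Month 5 row for 2018 would automatically bring Month 5 of 2017 back into the matrix.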

Filtering on 2 date fields within one Table


This table contains a list of all the inspections created and completed within different time periods.


The objective is to create two Tables from this single table - one showing the Accounts created within the chosen time period and another showing those that were closed within the same time period.  Here are screenshots of the expected results.


You may download my PowerBI desktop solution workbook from here.  The same solution can be obtained in Excel as well (using Power Query and PowerPivot).
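The idea of applying one chosen period to two different date columns can be sketched in Python. This is not the Power BI model from the file; the records and field positions below are hypothetical.

```python
from datetime import date

# Hypothetical records: (account, created date, closed date or None if still open).
inspections = [
    ("A1", date(2018, 1, 5), date(2018, 2, 10)),
    ("A2", date(2018, 2, 1), date(2018, 2, 20)),
    ("A3", date(2018, 2, 15), None),
]


def in_period(records, field, start, end):
    """Return the accounts whose chosen date field falls inside [start, end]."""
    index = {"created": 1, "closed": 2}[field]
    return [
        record[0]
        for record in records
        if record[index] is not None and start <= record[index] <= end
    ]


# The same period drives both result tables.
period = (date(2018, 2, 1), date(2018, 2, 28))
print("Created:", in_period(inspections, "created", *period))
print("Closed:", in_period(inspections, "closed", *period))
```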

Compute transaction fee based on a tiered pricing model


Consider a simple dataset as shown below:

For each tier, the tier rate is incrementally applied to the volume within the tier volume range.  Given the following transaction volumes, one may want to compute the transaction fee.

The expected result is shown below.

As one can observe, for a transaction value of 400,000, the fee has been computed as 3% on the first 100,000 and 2.5% on the next 300,000. You may download my solution workbook from here. In the file, I have shared 2 solutions - a conventional formula based one and a PowerPivot solution.

I have also solved a similar question here.
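The incremental calculation can be sketched in Python. This is neither the formula nor the PowerPivot solution from the workbook. Only the two rates mentioned above (3% on the first 100,000 and 2.5% beyond that) are taken from the example; the upper limit of the second tier is not stated, so it is left open-ended here as an assumption.

```python
# Each tier is (upper limit of the tier, rate); None means "no upper limit".
# Assumption: the 2.5% tier is treated as open-ended because its limit is not given.
tiers = [(100_000, 0.03), (None, 0.025)]


def tiered_fee(volume, tiers):
    """Apply each tier's rate only to the slice of volume that falls inside that tier."""
    fee, lower = 0.0, 0
    for upper, rate in tiers:
        if upper is None or volume < upper:
            fee += (volume - lower) * rate  # last, partially filled tier
            break
        fee += (upper - lower) * rate  # tier is completely filled
        lower = upper
    return round(fee, 2)


print(tiered_fee(400_000, tiers))  # 3% of 100,000 plus 2.5% of 300,000
print(tiered_fee(50_000, tiers))
```

The last tier should be open-ended (None); otherwise any volume above the final limit would not be charged by this sketch.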

Determine the most recent status after satisfying certain conditions


Assume a three column dataset with Patient ID, Smoking Status and Review Date

PatientID SmokingStatus ReviewDate
P1 (blank) 10-03-2018
P1 9 09-03-2018
P1 1 08-03-2018
P1 4 07-03-2018
P2 9 10-03-2018
P2 9 09-03-2018
P2 9 08-03-2018
P2 9 07-03-2018
P3 2 10-03-2018
P3 (blank) 09-03-2018
P3 9 08-03-2018
P4 9 10-03-2018
P4 1 09-03-2018
P4 4 08-03-2018

The objective is to create another 3 column dataset with the following conditions:

  1. If the patient's latest smoking status is other than Blank or 9, then consider that as the smoking status of the patient; and
  2. If the patient's latest smoking status is blank or 9, then consider the previous smoking status that is not blank or 9; and
  3. If the patient's smoking status is blank or 9 on all dates, then consider the smoking status as 9

The expected result is:

PatientID Last date when the smoking status was other than 9 or Blank Smoking status on that date
P1 08-Mar-18 1
P2 10-Mar-18 9
P3 10-Mar-18 2
P4 09-Mar-18 1

I have solved this question using 3 methods - PowerPivot, Advanced Filters and formulas.  You may download my solution workbook from here.
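The three conditions can also be sketched in Python. This is not one of the three methods in the workbook, only an illustration of the rules on the sample data. It assumes the dates are in dd-mm-yyyy format and represents a blank status as None.

```python
from datetime import datetime

# Sample data from above; None stands for a blank smoking status.
rows = [
    ("P1", None, "10-03-2018"), ("P1", 9, "09-03-2018"), ("P1", 1, "08-03-2018"),
    ("P1", 4, "07-03-2018"), ("P2", 9, "10-03-2018"), ("P2", 9, "09-03-2018"),
    ("P2", 9, "08-03-2018"), ("P2", 9, "07-03-2018"), ("P3", 2, "10-03-2018"),
    ("P3", None, "09-03-2018"), ("P3", 9, "08-03-2018"), ("P4", 9, "10-03-2018"),
    ("P4", 1, "09-03-2018"), ("P4", 4, "08-03-2018"),
]


def latest_status(rows):
    """Return {patient: (date, status)} following the three conditions."""
    by_patient = {}
    for patient, status, day in rows:
        review_date = datetime.strptime(day, "%d-%m-%Y").date()  # assumed dd-mm-yyyy
        by_patient.setdefault(patient, []).append((review_date, status))
    result = {}
    for patient, visits in by_patient.items():
        visits.sort(key=lambda visit: visit[0], reverse=True)  # most recent first
        meaningful = [visit for visit in visits if visit[1] not in (None, 9)]
        if meaningful:
            result[patient] = meaningful[0]  # conditions 1 and 2
        else:
            result[patient] = (visits[0][0], 9)  # condition 3: latest date, status 9
    return result


for patient, (review_date, status) in latest_status(rows).items():
    print(patient, review_date.strftime("%d-%b-%y"), status)
```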