Category: POWER QUERY + POWERPIVOT

Determine latest condition of each equipment and show a month wise count

There are 100 machines in a factory.  Every machine has a different test frequency.  In a given month, not every machine is tested, but we still have the last known rating (from some previous month) of that machine.  We have to show the latest rating of each machine for each month in a stacked column chart.  This way, the total number will remain 100 every month in the chart, but the rating distribution (color based on legend) will change based on the last available rating of each machine.

For example, in January, 35 machines were tested. So we have latest ratings of these 35 machines. But as the rest of the machines also have some previous rating, the graph needs to show all 100, with last available rating.

The expected result should look like this

You may download my PBI Desktop file from here.  Since MS Excel supports the DAX formula language as well, the very same formulas can be written there.
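
The "last known rating" logic can be sketched outside DAX as well.  Below is a minimal Python illustration, not the formulas from the PBI file; the function name, the month indexes and the three sample machines are all hypothetical.

```python
from collections import Counter

def rating_counts_by_month(tests, months):
    """tests: list of (machine, month_index, rating).
    Returns {month: Counter(rating -> number of machines)}, using each
    machine's latest rating recorded on or before that month."""
    result = {}
    for m in months:
        latest = {}  # machine -> (month_index, rating)
        for machine, month, rating in tests:
            if month <= m and (machine not in latest or month >= latest[machine][0]):
                latest[machine] = (month, rating)
        result[m] = Counter(rating for _, rating in latest.values())
    return result

# Hypothetical data: three machines tested in different months.
tests = [("M1", 1, "Good"), ("M2", 1, "Poor"), ("M3", 1, "Good"),
         ("M2", 2, "Good"), ("M3", 3, "Poor")]
counts = rating_counts_by_month(tests, [1, 2, 3])
```

Every month's counts add up to the same number of machines, which is what keeps the stacked columns at a constant height.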

Show Balance outstanding everyday even if data for everyday is not available

In this simple 3 column dataset, there are 2 accounts - Konto 1 and Konto 2.  Each account has a balance outstanding as on a certain date.  However, if you notice carefully, there is no balance for any account on January 4-5 and January 9-10, 2020.

The objective is to show the balance outstanding every day.  For days which are absent from the dataset, the balance outstanding should be the balance as on the previous day.  So for the Konto 1 account, on January 4-5, the balance should be 400 and on January 9-10, it should be 250.  The same logic applies to the Konto 2 account as well.  The expected result should be

I have solved this problem using Data > Get & Transform and PowerPivot.  You may download my solution workbook from here.
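
The fill-forward idea can be sketched in a few lines of Python.  This is only an illustration, not the Get & Transform / PowerPivot solution in the workbook.  The balances of 400 (January 3) and 250 (January 8) follow from the description above; all other figures are made up.

```python
from datetime import date, timedelta

def fill_daily_balances(records, start, end):
    """records: {account: {date: balance}}.
    Returns {account: {date: balance}} for every day in [start, end],
    carrying the previous day's balance into days with no record."""
    filled = {}
    for account, known in records.items():
        filled[account] = {}
        last = None  # stays None until the first known balance is reached
        day = start
        while day <= end:
            last = known.get(day, last)
            filled[account][day] = last
            day += timedelta(days=1)
    return filled

# Hypothetical balances for Konto 1 (only Jan 3 = 400 and Jan 8 = 250 come from the post).
records = {"Konto 1": {date(2020, 1, 1): 500, date(2020, 1, 2): 450,
                       date(2020, 1, 3): 400, date(2020, 1, 6): 300,
                       date(2020, 1, 7): 280, date(2020, 1, 8): 250}}
daily = fill_daily_balances(records, date(2020, 1, 1), date(2020, 1, 10))
```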

Here's another related question.  Given this 3 column dataset, the objective is to determine the total of the "PDV Combined Total" column for the past 5 billing days up to and including today, where billing days are identified by the "IsABillingDay" column.  So if today is December 4, 2020, then the past 5 billing days would be November 30, 2020 to December 4, 2020, and the result should be 14,95,937.  However, if today were December 2, 2020, then the past 5 billing days would span November 24, 2020 to December 2, 2020, and the result would be 14,33,545.  You may download the solution in a PBI file from here.  Although this question has been solved using the DAX formula language in PowerBI Desktop, the same formula language exists in MS Excel, so the result can be obtained there as well.
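
The "past 5 billing days" rule can be sketched in Python as follows.  This is an illustration only, not the DAX measure from the PBI file.  The amounts are made up, and treating November 26-29, 2020 as non-billing days is an assumption chosen to match the date ranges described above.

```python
from datetime import date

def last_n_billing_days_total(rows, today, n=5):
    """rows: list of (date, is_billing_day, amount).
    Sums amount over the n most recent billing days on or before today."""
    billing = sorted((d, amount) for d, is_billing, amount in rows
                     if is_billing and d <= today)
    return sum(amount for _, amount in billing[-n:])

# Hypothetical calendar and amounts.
rows = [(date(2020, 11, 24), True, 10), (date(2020, 11, 25), True, 20),
        (date(2020, 11, 26), False, 0), (date(2020, 11, 27), False, 0),
        (date(2020, 11, 28), False, 0), (date(2020, 11, 29), False, 0),
        (date(2020, 11, 30), True, 30), (date(2020, 12, 1), True, 40),
        (date(2020, 12, 2), True, 50), (date(2020, 12, 3), True, 60),
        (date(2020, 12, 4), True, 70)]

total_dec_4 = last_n_billing_days_total(rows, date(2020, 12, 4))  # Nov 30 to Dec 4
total_dec_2 = last_n_billing_days_total(rows, date(2020, 12, 2))  # Nov 24 to Dec 2
```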

Summarise data by most recent status

Here's a simple 3 column dataset showing Date, ID and Status - the status of each ID by Date.

So, the narrative for ID A is:

  1. It was "New" on Jan 1
  2. It remained "New" until Jan 14
  3. On Jan 15, the status changed to "Open"
  4. It remained "Open" till Jan 31 and the status changed to "Closed" on Feb 1
  5. It remained "Closed" till March 31 and the status changed to "Stop" on April 1
  6. It has remained in "Stop" status till today

Note that for the month of March, there is no record for ID A, but its status has to be treated as Closed (refer point 5 above).

The objective is to count the number of IDs by status and month.  The expected result is:

Please note that the tricky part here is to get the result as 2 in cell D6 (Status closed for March).  I have solved this problem using Power Query and PowerPivot.  Since these two Business Intelligence (BI) tools are available in PowerBI desktop (PBI) as well, you may download a folder with both files (the MS Excel workbook and PBI file) from here.
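
The "status as of month end" lookup for a single ID can be sketched with a sorted list and a binary search.  This is a Python illustration, not the Power Query / PowerPivot solution.  The status changes are those of ID A from the narrative above; the year 2021 is an assumption.

```python
from bisect import bisect_right
from datetime import date

# Status changes for ID A, sorted by date (the year is assumed).
changes = [(date(2021, 1, 1), "New"), (date(2021, 1, 15), "Open"),
           (date(2021, 2, 1), "Closed"), (date(2021, 4, 1), "Stop")]

def status_as_of(changes, day):
    """Return the most recent status on or before `day`, or None if there is none."""
    dates = [d for d, _ in changes]
    i = bisect_right(dates, day)
    return changes[i - 1][1] if i else None

# March has no record of its own, so the February status carries over.
march_status = status_as_of(changes, date(2021, 3, 31))
```

Counting IDs by status for a month then amounts to running this lookup for every ID at that month's end and tallying the results.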

Compute hours spent on projects given resource allocation

In the dataset below, column A has the Employee Name, columns B and C are the assignment start and end dates, column D is the location and columns E to J are the Month-Year columns.  So each row represents data for an employee on a particular project.  The numbers in range E2:J8 represent how much that particular employee is aligned to the particular project i.e. a value of 1 means that the employee is dedicated solely to that project, 1.4 means that the employee will be spending extra hours on that project and 0.1 indicates that the employee will be working on multiple other projects.

The objective is to create another column (column K in the second screenshot) which will show the number of hours the employee will spend on the project.  The number of hours will be computed as number of working days in a month (treat Saturday and Sunday as weekends) * time allocation to that project (the numbers in range E2:J8) * 8.5 hours per day for an Offshore project and 8 hours per day for other projects.

The raw data sheet looks like this

The expected result is

The figure in cell K3 has been computed as:

  • The number of working days between November 11, 2018 and November 30, 2018 is 15.  So 15 * 1 = 15
  • The number of working days between December 1, 2018 and December 12, 2018 is 8.  So 8 * 0.5 = 4
  • Total effective working days are 15 + 4 = 19
  • Since it is an Offshore project, the hours per day would be 8.5.  Therefore total effective hours: 19 * 8.5 = 161.5
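
The K3 arithmetic above can be reproduced with a short Python sketch.  The function names are hypothetical, and this is not one of the three methods in the workbook; it only mirrors the calculation steps listed.

```python
from datetime import date, timedelta

def working_days(start, end):
    """Count Monday-Friday days in [start, end], both dates inclusive."""
    days = 0
    d = start
    while d <= end:
        if d.weekday() < 5:  # 0-4 are Monday-Friday
            days += 1
        d += timedelta(days=1)
    return days

def project_hours(segments, offshore):
    """segments: list of (start, end, allocation) per month.
    Uses 8.5 hours per day for an Offshore project and 8 otherwise."""
    effective_days = sum(working_days(s, e) * alloc for s, e, alloc in segments)
    return effective_days * (8.5 if offshore else 8)

k3 = project_hours([(date(2018, 11, 11), date(2018, 11, 30), 1),
                    (date(2018, 12, 1), date(2018, 12, 12), 0.5)],
                   offshore=True)
```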

I have solved this problem using 3 methods:

  1. Excel formulas - Refer worksheet named "Formula output"
  2. Power Query and PowerPivot - Refer worksheet named "Power Pivot output"
  3. Power Query only - Refer worksheet named "Power Query output"

You may download my solution workbook from here.

Analyse free flowing text data or user entered remarks from multiple perspectives

Here is a 2 column dataset - UserID in column A and Remarks in Column B.  This dataset basically tabulates the remarks/comments shared by different users.  Entries in the Remarks column are basically free flowing text entries which have the following inconsistencies/nuances:

  1. Users reported multiple errors which are separated by comma, Alt+Enter (same line within the cell) and numbered bullets
  2. Users committed spelling mistakes (see arrows in Table1)
  3. A user ID may be repeated in column A

Given this dataset, one may want to "hunt" for specific "keyword Groups" (column E above) in each user remark cell and get meaningful insights.  Some questions which one would like to have answers to are:

  1. How many users reported each type of keyword Group - "How many users used the Unresponsive keyword?".  See Pivot Table1 below.
  2. Which are the keyword Groups that each user reported - "Which are the different keyword groups reported by UserID A004?".  See Pivot Table2 below.
  3. How many users reported each combination of keyword Groups - "How many users reported all 3 problems of Slow, Unresponsive and Crash?".  See Pivot Table 3 below.
  4. How many users who used one keyword Group also used another keyword Group - "How many users who reported Crash also reported Unresponsive?".  See Pivot Table 4 below.

This was quite a formidable challenge to solve because of spelling mistakes and multiple keywords reported in each cell.  I have solved this problem with the help of Power Query and PowerPivot.  You may download my workbook from here.
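
To give a flavour of the problem, here is a minimal Python sketch that tolerates spelling mistakes by fuzzy-matching each word against a keyword list.  This is a swapped-in technique (the standard library's difflib), not the Power Query / PowerPivot approach used in the workbook.  The keyword list, the 0.8 cutoff and the sample remarks are all hypothetical.

```python
import re
from difflib import get_close_matches

KEYWORDS = ["slow", "unresponsive", "crash"]  # hypothetical keyword groups

def keyword_groups(remark, cutoff=0.8):
    """Return the set of keyword groups found in a free-text remark.
    Words are matched approximately so that typos are still recognised."""
    found = set()
    for word in re.findall(r"[a-z]+", remark.lower()):
        match = get_close_matches(word, KEYWORDS, n=1, cutoff=cutoff)
        if match:
            found.add(match[0])
    return found

# Hypothetical remarks: numbered bullets, line breaks, commas, typos, repeated user ID.
remarks = [("A001", "1. Slow\n2. Unresponsve"), ("A002", "crash, slow"), ("A001", "Crsh")]

users_by_group = {}
for user, text in remarks:
    for group in keyword_groups(text):
        users_by_group.setdefault(group, set()).add(user)
```

Using sets for the users means a repeated UserID is counted only once per keyword group.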

Show Project wise status in a Pivot Table

Visualise a simple 6 column Table as shown below - Project Name and the finish date for each of the 5 stages that the projects go through.  Each project goes through 5 stages - Requirement (Req), Development (Dev), UAT, Implement and Warranty.

The objective is to report on the status of each project at the end of each month based on which stage is/was completed in that month.  So, if a given project's requirements are completed in January and development completes some time in March, then one would expect the output of the report to show the project's status in January and February as "Req" and in March as "Dev".  February should also show "Req" because the next stage was completed only in March (although it may have started in January).  If multiple stages complete in one month, then the report should display only the most recently completed stage.  So, if Project A completed both Requirements and Development stages in January, the report should show only "Dev" as the stage completed in January.

For the data shared above, the expected result is:

You may download my solution workbook from here.
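
The month-end rule described above can be sketched in Python.  This is an illustration only, not the solution from the workbook; the function name and the sample finish dates are hypothetical.

```python
from datetime import date

# Stages in the order that projects pass through them.
STAGES = ["Req", "Dev", "UAT", "Implement", "Warranty"]

def status_at_month_end(finish_dates, year, month):
    """finish_dates: {stage: finish date}.
    Returns the latest stage finished on or before the end of the given month,
    or None if no stage has finished yet."""
    status = None
    for stage in STAGES:
        finished = finish_dates.get(stage)
        if finished and (finished.year, finished.month) <= (year, month):
            status = stage  # a later stage overrides an earlier one
    return status

# Hypothetical project: requirements finish in January, development in March.
project = {"Req": date(2019, 1, 20), "Dev": date(2019, 3, 5)}
feb_status = status_at_month_end(project, 2019, 2)
```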

Determine the lowest bidding vendor(s) for each product in a Pivot Table

Imagine a dataset like this.  This dataset shows vendors that submitted proposals for supplying various parts to a Company.  There is one column for each of the twelve months.

Via a simple Pivot Table, one can determine the lowest bid per product (part) for any chosen month.  However, one may also want to know the names of the vendors who submitted that lowest bid for each product (as seen in column G below).  Notice that Vendor 2 and Vendor 3 submitted the lowest bid for Product 1 and therefore both names should appear in the result.

I have solved this problem using PowerPivot and Power Query a.k.a. Data > Get & Transform in Excel 2016.  You may download my solution workbook from here.
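
The tie-handling logic can be sketched in Python as below.  This is not the PowerPivot / Power Query solution; the function name and prices are hypothetical, with only the Vendor 2 / Vendor 3 tie on Product 1 taken from the description.

```python
def lowest_bidders(bids):
    """bids: list of (product, vendor, price) for one chosen month.
    Returns {product: (lowest price, comma-separated vendor names at that price)}."""
    best = {}
    for product, vendor, price in bids:
        low, vendors = best.get(product, (price, []))
        if price < low:
            low, vendors = price, []  # a new lowest bid discards earlier names
        if price == low:
            vendors = vendors + [vendor]
        best[product] = (low, vendors)
    return {p: (low, ", ".join(sorted(v))) for p, (low, v) in best.items()}

# Hypothetical bids for one month.
bids = [("Product 1", "Vendor 1", 120), ("Product 1", "Vendor 2", 100),
        ("Product 1", "Vendor 3", 100), ("Product 2", "Vendor 1", 80),
        ("Product 2", "Vendor 2", 90)]
result = lowest_bidders(bids)
```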

Show sales only for corresponding months in prior years

Refer to this simple Sales dataset

The objective is to create a simple matrix with months in the row labels, years in the column labels and sales figures in the value area section.  The twist in the question is that for years prior to the current year (2018 in this dataset), sales should only appear till the month for which there is data for the current year.  For example, for 2018, data is only till Month 4 and therefore for prior years as well, data should only appear till Month 4.  As and when Sales data gets added below row 17, data for prior years should also go up to that month.

The expected result is

You may download my PBI file from here. The same solution can be obtained in Excel as well (using Power Query and PowerPivot).
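
The cut-off rule can be sketched in Python.  This is only an illustration and not the measure from the PBI file; the function name and sales figures are hypothetical, with 2018 having data up to Month 4 as described above.

```python
def sales_matrix(sales, current_year):
    """sales: list of (year, month, amount).
    Returns {(month, year): total}, keeping only months up to the latest
    month that has data in current_year."""
    last_month = max(m for y, m, _ in sales if y == current_year)
    matrix = {}
    for year, month, amount in sales:
        if month <= last_month:
            matrix[(month, year)] = matrix.get((month, year), 0) + amount
    return matrix

# Hypothetical figures: a full year for 2017, four months for 2018.
sales = [(2017, m, 100) for m in range(1, 13)] + [(2018, m, 110) for m in range(1, 5)]
matrix = sales_matrix(sales, 2018)
```

Because the cut-off month is derived from the data, adding a Month 5 row for 2018 automatically extends the prior years to Month 5 as well.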

Filtering on 2 date fields within one Table

This table contains a list of all the inspections created and completed within different time periods.

The objective is to create two Tables from this single table - one showing the Accounts created within the chosen time period and another showing those that were closed within the same time period.  Here are screenshots of the expected results.

You may download my PowerBI desktop solution workbook from here.  The same solution can be obtained in Excel as well (using Power Query and PowerPivot).

Determine number of learners who have completed different stages of multiple online courses

Here is a sample dataset of learners who have cleared different stages of multiple courses on offer within an Organisation:

Learner   Stage completed   Course
Bill      Stage 1           Public Speaking
Bill      Stage 2           Public Speaking
Bill      Stage 3           Public Speaking
Susan     Stage 1           Effective Communication
Bob       Stage 1           Public Speaking
Bob       Stage 2           Public Speaking
Sheila    Stage 1           Effective Communication
Sheila    Stage 2           Effective Communication
Sheila    Stage 3           Effective Communication
Frank     Stage 1           Effective Communication
Frank     Stage 2           Effective Communication
Henry     Stage 1           Public Speaking
Henry     Stage 2           Public Speaking
Bill      Stage 1           Effective Communication
Bill      Stage 2           Effective Communication

From this sample dataset, one may want to know how many participants have reached each stage of these multiple courses, with each learner counted only against the highest stage completed.  The expected result is shown below:

Row Labels                Stage 1   Stage 2   Stage 3
Effective Communication   1         2         1
Public Speaking                     2         1
Grand Total               1         3         2

In this workbook, I have shared 2 solutions - one using formulas and the other using Power Query and PowerPivot.
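
The course-wise rows of the expected result can be reproduced with a short Python sketch.  It is not either of the two solutions in the workbook, and it assumes (inferred from the expected result) that each learner is counted only against the highest stage completed in a course.  The function name is hypothetical; the data is the sample dataset above with stages written as numbers.

```python
from collections import Counter

rows = [("Bill", 1, "Public Speaking"), ("Bill", 2, "Public Speaking"),
        ("Bill", 3, "Public Speaking"), ("Susan", 1, "Effective Communication"),
        ("Bob", 1, "Public Speaking"), ("Bob", 2, "Public Speaking"),
        ("Sheila", 1, "Effective Communication"), ("Sheila", 2, "Effective Communication"),
        ("Sheila", 3, "Effective Communication"), ("Frank", 1, "Effective Communication"),
        ("Frank", 2, "Effective Communication"), ("Henry", 1, "Public Speaking"),
        ("Henry", 2, "Public Speaking"), ("Bill", 1, "Effective Communication"),
        ("Bill", 2, "Effective Communication")]

def highest_stage_counts(rows):
    """Count learners per (course, stage), where stage is the highest stage
    that the learner has completed in that course."""
    highest = {}  # (course, learner) -> highest stage number
    for learner, stage, course in rows:
        key = (course, learner)
        highest[key] = max(stage, highest.get(key, 0))
    return Counter((course, stage) for (course, _), stage in highest.items())

counts = highest_stage_counts(rows)
```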