Knowledge Base

Tabulating data from multiple unstructured Excel files

By Ashish Mathur · February 22, 2020 · POWER QUERY

Data downloaded from applications/ERPs is often not in a filter/Pivot-ready format. In such cases, a lot of time must first be invested in getting that data into proper order before one can even begin to analyse it. What makes this situation worse is that data is downloaded every month in that unstructured […]

Calculate rolling sum for the past week by ignoring blank cells

By Ashish Mathur · February 21, 2020 · PowerBI desktop, POWERPIVOT

Assume a simple dataset as shown in the image below: the input data is in columns A and B only, and the desired outcome is in columns C and D. The objective is to calculate the 7-day rolling sum and average (as shown in columns C and D), ignoring blank cells. So in cell C8, […]

Append data from multiple worksheets of multiple workbooks where each worksheet has a different heading

By Ashish Mathur · August 26, 2019 · POWER QUERY

In a folder there are multiple workbooks with an unknown number of worksheets in each workbook. Each worksheet has data for one year and has 13 columns – the first is for the Product and the other 12 are for each month of the year. So sheet1 of Book1 has Product in column1, 1 Jan […]

Summarise data by most recent status

By Ashish Mathur · July 2, 2019 · POWER QUERY + POWERPIVOT, PowerBI desktop

Here’s a simple 3-column dataset showing Date, ID and Status – the status of each ID by Date. So, the narrative for ID A is: it was “New” on Jan 1; it remained “New” until Jan 14; on Jan 15, the status changed to “Open”; it remained “Open” till Jan 31 and the status […]

Segment customers into dynamic buckets

By Ashish Mathur · June 29, 2019 · PowerBI desktop, POWERPIVOT

Consider a 4-column table – Respondent ID, Device ID, App Name and Category. This dataset shows which apps are installed on which device by which user, and which category each app falls into. It is a small dataset with only 4 columns and 2,000 rows. The question on this dataset is […]

Compute hours spent on projects given resource allocation

By Ashish Mathur · May 18, 2019 · CALCULATION, POWER QUERY + POWERPIVOT

In the dataset below, column A has the Employee Name, columns B and C are the assignment start and end dates, column D is the location and columns E to J are the Month-Year columns. So each row represents data for an employee on a particular project. The numbers in range E2:J8 represent how much […]

Customer analysis by Country and time period

By Ashish Mathur · January 26, 2019 · PowerBI desktop

Here is a Sales dataset of 8 columns and 29 rows. It details the revenue earned and cash collected by service type, Customer, Country and Period. For a selected Country and time period, there could be customers using both services or only one of them. There are 2 broad questions that one may […]

Compute Relative Size Factor per vendor

By Ashish Mathur · January 26, 2019 · POWERPIVOT

Relative size factor (RSF) is a test to identify anomalies where the largest amount for subsets in a given key is outside the norm for those subsets. This test compares the top two amounts for each subset and calculates the RSF for each. In order to identify potentially fraudulent activities in invoice payment data, one […]
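
As a rough illustration of the test described above, the Python sketch below takes RSF to be the largest amount divided by the second-largest amount within each vendor. That is the usual definition, but it is an assumption here because the excerpt is cut short. The function name, the (vendor, amount) input shape and the sample invoices are all hypothetical, and this is not the PowerPivot approach used in the post.

```python
from collections import defaultdict


def relative_size_factors(payments):
    """Return {vendor: largest amount / second-largest amount}.

    Assumption: RSF = largest / second largest within each vendor.
    Vendors with fewer than two amounts, or a non-positive
    second-largest amount, are skipped.
    """
    amounts_by_vendor = defaultdict(list)
    for vendor, amount in payments:
        amounts_by_vendor[vendor].append(amount)

    rsf = {}
    for vendor, amounts in amounts_by_vendor.items():
        if len(amounts) < 2:
            continue  # nothing to compare the largest amount against
        largest, second = sorted(amounts, reverse=True)[:2]
        if second > 0:
            rsf[vendor] = largest / second
    return rsf


# Made-up invoice payments, for illustration only.
invoices = [
    ("Vendor A", 100.0),
    ("Vendor A", 120.0),
    ("Vendor A", 1200.0),
    ("Vendor B", 500.0),
    ("Vendor B", 450.0),
    ("Vendor C", 75.0),
]

for vendor, factor in sorted(relative_size_factors(invoices).items()):
    print(vendor, round(factor, 2))
```

In this made-up data, Vendor A's largest payment is ten times its second largest, which is the kind of outlier the test is meant to surface; Vendor C has a single payment and is left out.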

Analyse free flowing text data or user entered remarks from multiple perspectives

By Ashish Mathur · January 19, 2019 · POWER QUERY + POWERPIVOT

Here is a 2-column dataset – UserID in column A and Remarks in column B. This dataset tabulates the remarks/comments shared by different users. Entries in the Remarks column are free-flowing text entries which have the following inconsistencies/nuances: Users reported multiple errors which are separated by comma, Alt+Enter (same line within […]

Determine the top selling location for each product

By Ashish Mathur · January 18, 2019 · CALCULATION, FILTERS, POWER QUERY, POWERPIVOT

Visualise a 3-column dataset as shown below – Location, Product and Sales. Each location can have multiple products (Location A has Banana, Apple and Carrot) and each product can be sold in multiple locations (Banana is sold in locations A, B and F). The objective is to determine the location with the highest sales for […]
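
The objective above can be sketched in a few lines of Python, purely as an illustration and not as the Power Query/PowerPivot solution from the post. The function name and the sales figures are hypothetical; only the Location/Product/Sales layout and the product and location names come from the excerpt. Sales are first totalled per product and location, and the location with the largest total is then kept for each product.

```python
def top_location_per_product(rows):
    """Return {product: (location, total_sales)} for the best-selling location.

    rows is an iterable of (location, product, sales) tuples.
    If two locations tie, the one encountered first is kept.
    """
    # Total the sales for every (product, location) pair.
    totals = {}
    for location, product, sales in rows:
        key = (product, location)
        totals[key] = totals.get(key, 0) + sales

    # Keep the location with the highest total for each product.
    best = {}
    for (product, location), total in totals.items():
        if product not in best or total > best[product][1]:
            best[product] = (location, total)
    return best


# Made-up sales figures, for illustration only.
sales_rows = [
    ("A", "Banana", 120),
    ("A", "Apple", 80),
    ("A", "Carrot", 40),
    ("B", "Banana", 200),
    ("F", "Banana", 150),
    ("B", "Apple", 60),
]

for product, (location, total) in sorted(top_location_per_product(sales_rows).items()):
    print(product, location, total)
```

With these figures, Banana's top location is B, while Apple and Carrot both sell best in A.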
