Tags: SUM

Determine the most recent status after satisfying certain conditions

{0 Comments}

Assume a three column dataset with Patient ID, Smoking Status and Review Date

PatientID SmokingStatus ReviewDate
P1 10-03-2018
P1 9 09-03-2018
P1 1 08-03-2018
P1 4 07-03-2018
P2 9 10-03-2018
P2 9 09-03-2018
P2 9 08-03-2018
P2 9 07-03-2018
P3 2 10-03-2018
P3 09-03-2018
P3 9 08-03-2018
P4 9 10-03-2018
P4 1 09-03-2018
P4 4 08-03-2018

The objective is the create another 3 column dataset with the following conditions:

  1. If the patient's latest smoking status is other than Blank or 9, then consider that as the smoking status of the patient; and
  2. If the patient's latest smoking status is blank or 9, then consider the previous smoking status that is not blank or 9; and
  3. If the patient's smoking status is blank or 9 on all dates, then consider the smoking status as 9

The expected result is:

PatientID Last date when the smoking status was other than 9 or Blank Smoking status on that date
P1 08-Mar-18 1
P2 10-Mar-18 9
P3 10-Mar-18 2
P4 09-Mar-18 1

I have solved this question using 3 methods - PowerPivot, Advanced Filters and formulas.  You may download my solution workbook from here.

Compute the average of values against the 5 most recent dates of each Category

{0 Comments}

Here is a simple 3 column dataset showing Categories, Date and Value

Catagorie Date Value
Fish 08-12-2015 6
Crab 05-12-2015 7
Crab 04-12-2015 6
Bird 27-11-2015 4
Snow 25-11-2015 10
Cat 21-11-2015 7
Dog 12-11-2015 5
Dog 28-10-2015 5
Fish 12-10-2015 3
Bird 11-10-2015 9
Dog 22-09-2015 9
Crab 17-08-2015 8
Cat 11-08-2015 1
Fish 20-07-2015 5
Crab 03-07-2015 2
Fish 02-06-2015 8
Fish 26-05-2015 9
Dog 14-05-2015 4
Snow 07-05-2015 7
Bird 03-05-2015 9
Cat 20-04-2015 10
Cat 15-04-2015 2
Snow 13-04-2015 3
Crab 29-03-2015 5
Crab 23-03-2015 2
Bird 14-03-2015 5
Cat 14-03-2015 1
Dog 26-02-2015 9
Fish 16-02-2015 4
Fish 08-02-2015 6
Bird 18-01-2015 1
Snow 06-01-2015 10

The objective is to compute category wise average of values against the 5 most recent dates.  So the expected result is:

Categorie Category wise average values on 5 most recent dates
Bird 5.60
Cat 4.20
Crab 5.60
Dog 6.40
Fish 6.20
Snow 7.50

So for the Category of Fish (dates are sorted in descending order), the average should be 6.2 (average of 6,3,5,8,9)

Catagorie Date Value
Fish 08-12-2015 6
Fish 12-10-2015 3
Fish 20-07-2015 5
Fish 02-06-2015 8
Fish 26-05-2015 9
Fish 16-02-2015 4
Fish 08-02-2015 6

I have solved this problem using the PowerPivot. You may download my solution workbook from here.

In a Pivot Table, compute highest revenue earned on any day from each customer and the date thereof

{0 Comments}

Here's a simple dataset showing the Date of sale, Customer Name and Sales amount.

Date Customer Name Sales amount
12-03-2017 A 1
12-03-2017 A 2
12-03-2017 A 3
12-03-2017 B 4
12-03-2017 B 5
12-03-2017 B 6
12-03-2017 B 7
12-03-2017 B 8
13-03-2017 A 1
13-03-2017 A 1

The objective is to determine Customer wise:

  1. Highest revenue; and
  2. Date on which that highest revenue was earned

The expected result is

Row Labels Highest revenue Date of highest revenue
A 6 12-Mar-2017
B 30 12-Mar-2017
Grand Total 36

So, for A, revenue earned on March 12, 2017 is 6 and on March 13, 2017 is 2.  Since the higher of the two is 6, that is the result which should appear in the Pivot Table.

I have solved this problem with the PowerPivot.  You may download my solution workbook from here.

Filter a column of a Pivot Table on a certain condition but also show other items from that column

{0 Comments}

The title sounds confusing!!!!.  Please bear with me and read on.  Here's a simple dataset

Client ID Client Name Resource Project ID Billable amount
1 Alpha David 1000 10
1 Alpha Henry 1001 20
1 Alpha Rakesh 1002 30
1 Alpha Alice 1003 40
2 Beta Alice 1000 50
2 Beta Alicia 1002 60
2 Beta Patrick 1003 70
2 Beta Mukesh 1004 80
2 Beta Suresh 1006 90
2 Beta Ajay 1005 100
3 Gamma Rama 1004 110
3 Gamma Sakshi 1006 120
4 Theta Prabhu 1005 130
5 Epsilon Alice 1000 140
5 Epsilon Alicia 1001 150
5 Epsilon Prabhu 1002 160
5 Epsilon Sakshi 1003 170
5 Epsilon Raghav 1008 180
5 Epsilon David 1010 190
5 Epsilon Henry 1012 200

Here's a Pivot Table built from the dataset above.

untitled

The question is "Is there a way to show only those rows of data which have Alice but also show others who worked with Alice".  While the first part of the question can be answered easily by filtering the Resource column on Alice, the second part (italicized for your reference) of the question is the real challenge.  When one filters the Resource column on Alice, the result is as seen below:

untitled1

This view does not show me who else worked with Alice.  The result I am expecting to see is:

untitled2

This problem can be resolved with the help of the Query Editor (Power Query).  The basic idea is to create another column in the original dataset where we create a string of all resources for every row.  So for example, in every row of Client ID1, the sixth column should show David,Henry,Rakesh.Alice and so on.  Once this is done, one can simply take this column to the Report filter section of the Pivot Table (see last image above) and filter on Alice.

You may refer to my solution in this workbook.

Distribute projected revenue annually

{2 Comments}

Here is a dataset showing Project wise forecast of open opportunities.

  1. Topic is the Project Name
  2. Est. Close Date is the date by when the opportunity would be closed i.e. the project would be won from that Client
  3. Duration is the time (in months) for which the project would run
  4. Amount is the total amount that would be billed for that project

Clients are invoiced annually only. So in the example below:

  1. Project ABC is for US$1 million with a duration of 24 months and is expected to be closed in Oct. 2017.  We need to model the data to show the billing every 12 months.  So for ABC US$500K would be billed in Oct-2017 and another US$500K in Oct-2018.
  2. Project GEF is for US$2 million with a duration of 18 months and is expected to be closed in Feb. 2018. We need to model the data to show US$1.3 million in Feb-2018 and another US$666K in Feb-2019.  The monthly billing is US$2 million divided by 18 and then multiplied by 12 - this amounts to US$1.3 million.
Topic Est. Close Date Duration (Months) Amount
ABC 01-10-2017 24 1,000,000
GEF 01-02-2018 18 2,000,000
XYZ 01-03-2018 30 1,000,000

The expected result should look like this:

Row Labels Oct-17 Feb-18 Mar-18 Oct-18 Feb-19 Mar-19 Mar-20 Total
ABC 500,000 500,000 1,000,000
GEF 1,333,333 666,667 2,000,000
XYZ 400,000 400,000 200,000 1,000,000
Grand Total 500,000 1,333,333 400,000 500,000 666,667 400,000 200,000 4,000,000

I have solved this problem using Power Query and PowerPivot. You may download my solution workbook from here.

Compute standard hours spent on weekdays by Tier, Week, Month and Country

{0 Comments}

Imagine a Sales dataset with the following columns - ID, Country, Start date, End date and Tier. Here's a snapshot of the table:

ID Country Start date End date Tier
33948 ES 25-Sep-17 28-Sep-17 3
19820 US 08-Oct-17 17-Oct-17 4
9118 US 27-Oct-17 03-Nov-17 2
1563 ES 02-Sep-17 07-Sep-17 2
11087 US 18-Oct-17 27-Oct-17 1
15057 US 05-Nov-17 13-Nov-17 4
13567 ES 18-Oct-17 26-Oct-17 4
22362 DE 28-Aug-17 30-Aug-17 4
31301 FR 04-Nov-17 10-Nov-17 2
22838 US 05-Sep-17 13-Sep-17 4

There is another 2 column table which lists down the Tier wise standard hours. Here's a snapshot of the table:

Tier Daily load hh
1 0.7
2 0.72
3 0.8
4 1.2

The result which one expects from these two tables is shown below:

Tier
Month Week 1 2 3 4 Grand total   Slicer
9 35 x x x x x US
36 x x x x x UK
37 x x x x x FR
38 x x x x x IT
39 x x x x x ES
Grand total   x x x x x    

Let's take a specific example to show the result expected for the US Region. From the last row of the Sales Table, one can see that there is a ID 22838 in US which spans the date range 05-Sept-17 to 13-Sep-17. These dates span week numbers 36 (from 05-Sept-17 to 08-Sep-17) and 37 (from 11-Sept-17 to 13-Sep-17) [The week numbers have been obtained by using Excel's WEEKNUM() function]. So, is US, for Tier 4, in Week 36, the standard hours worked on weekdays would be 4.8 (4 days * 1.2 hours per day). Likewise, in week 37, the standard hours worked on weekdays would be 3.6 (3 days * 1.2 hours).

I have solved this problem using Power Query a.k.a Get & Transform (Available under Data) in Excel 2016 and PowerPivot.  You may download my solution workbook from here.

Sum the largest 5 of the last 10 numbers in a row ignoring blanks

{2 Comments}

Assume a single row of data with numbers and blanks appearing at random intervals.  The objective is to sum the largest 5 of last 10 numbers in that row.  Solving this problem entails multiple steps:

  1. Identify the last 10 numbers in that row i.e. starting from the right hand side, identify the last 10 numbers
  2. Identify the largest 5 of those 10 numbers
  3. Sum those largest 5 numbers

Here are the steps

  1. Suppose the numbers and blanks are in range A2:V2
  2. Type 10 in cell X1
  3. Enter this array formula (Ctrl+Shift+Enter) in cell X2

=SUM(SMALL(IF((SUBTOTAL(2,OFFSET(V2,,,1,(COLUMN($A2:$V2)-COLUMN(W2))))<=X$1)*($A2:$V2)=0,FALSE,(SUBTOTAL(2,OFFSET(V2,,,1,(COLUMN($A2:$V2)-COLUMN(W2))))<=X$1)*($A2:$V2)),{1,2,3,4,5}))

Compute an average for the same day in the past 3 years

{4 Comments}

Assume a simple two column dataset with dates in column A and numbers in column B. The dates in column A are from January 1, 2013 to December 31, 2016 and numbers in column B are for the period January 1, 2013 to December 31, 2015 (there are no numbers for January 1, 2016 to December 31, 2016).

The objective is to "Compute an average for each day of calendar year 2016. The average should be for the occurrence of that day in the previous 3 years". Here's an example:

1. January 1, 2016 was a Friday (the first Friday of 2016) and is in cell A1097
2. In cell B1097, the average should be computed as: Average of the "First Friday of each of the previous 3 years"
3. January 8, 2016 was a Friday (the second Friday of 2016) and is in cell A1104
4. In cell B1104, the average should be computed as: Average of the "Second Friday of each of the previous 3 years"

I have solved this problem with the help of the PowerPivot. You may refer to my solution in this workbook.

Visualising data flows using Custom Visuals

{0 Comments}

Assume a 4 column dataset (a small sample) as follows:

City of Origin City of destination Mode of Transport Passengers travelled
New Delhi Pune Air 123
New Delhi Mumbai Air 213
New Delhi Kolkata Air 125
Chandigarh Jammu Bus 785
Chandigarh Amritsar Train 567

Given this dataset, one may want answers to the following questions:

1. Of all those passengers who originated their journey (City of Origin) from Chandigarh, how many terminated their journey (City of destination) in New Delhi via different modes of transport; and
2. Of all those passengers who terminated their journey (City of destination) in Jammu, how many arrived in Amritsar (City of Origin) via different modes of transport; and
3. Of all those passengers who travelled by Bus, how many travelled from City A (City of Origin) to City X,Y,Z (City of destination)

While one can analyse/slice and dice this data using Pivot Tables, one cannot visualize this data very clearly (even after creating a Pivot chart).  I have attempted to visualize this data using a software called PowerBI desktop (a free for download and use Business Intelligence software from Microsoft which rolls all of Excel's BI tools into 1 - PowerPivot, Power Query, Power Map and Power View).

You may download the source Excel workbook and the Power BI desktop workbook from this link.

You may also watch a short video here:

 

Compute product wise YTD Revenue from a matrix like/Cross tabular dataset

{0 Comments}

Assume the following data layout

Name Budget April Actual April Budget May Actual May Budget June Actual June
a 2 1 1 4 3
b 4 4 2 1
c 2 3 3 3
d 2 1
e 5 6 4 8 6

As one can observe here, there are two sub columns for each month - Budget and Actual.  From this data layout, we wish to compute the Product wise YTD budget and Actual sales figure.  So for example, if one selects May in a drop down, then the YTD budget for Product B should be 6 and YTD actual for Product B should be 5.

The expected solution should look like this

Month Name May
Products YTD Budget YTD Actual
a 2 2
b 6 5
c 2 3
e 9 14
Grand Total 19 24

You may download my solution workbook from this link.

You may watch a short video of my solution here