# Category: CALCULATION

In the dataset below column A has the Employee Name, column B and C are the assignment start and end dates, Column D is the location and columns E to J are the Month-Year columns.  So each row represents data for an employee on a particular project.  The numbers in range E2:J8 represent how much that particular employee is aligned to the particular project i.e. a value of 1 means that the employee is dedicated solely to that project, 1.4 means that the employee will be spending extra hours on that project and 0.1 indicates that the employee will be working on multiple other projects.

The objective is to create another column (column K in the second screenshot) which will show the number of hours the employee will spend on the project.  The number of hours will be computed as number of working days in a month (treat Saturday and Sunday as weekends) * time allocation to that project (the numbers in range E2:J8) * 8.5 hours per day for an Offshore project and 8 hours per day for other projects.

The raw data sheet looks like this

The expected result is

The figure in cell K3 has been computed as:

• Number of working days between November 11, 2018 and November 30, 2018 are 15.  So 15 * 1 = 15
• Number of working days between December 1, 2018 and December 12, 2018 are 8.  So 8 * 0.5 = 4
• Total effective working days are 15 + 4 = 19
• Since it is an Offshore project, the hours per day would be 8.5.  Therefore total effective hours: 19 * 8.5 = 161.5

I have solved this problem using 3 methods:

1. Excel formulas - Refer worksheet named "Formula output"
2. Power Query and PowerPivot - Refer worksheet named "Power Pivot output"
3. Power Query only - Refer worksheet named "Power Query output"

Visualise a 3 column dataset as shown below - Location, Product and Sales.  Each location can have multiple products (Product A has Banana, Apple and Carrot) and each product can be sold in multiple locations (Banana is sold in locations A, B and F).

The objective is to determine the location with highest sales for each product.  So for Banana, maximum sale value is 25 and location of maximum sales value is B.  Likewise for Orange, maximum sales value is 49 and location of maximum sales value is A.  The expected result is:

I have 4 solutions to this problem:

1. Advanced Filters - This is a static solution.  For any changes in the source data range, one will have to re-enter the 3 inputs in the Advanced Filter window
2. Formulas - This is a semi-dynamic solution.  To make it fully dynamic, one will have to write an array formula to first extract all unique product names in a column.  The array formula to extract product names in a column can be obtained from here.
3. Power Query - This is a dynamic solution.  For any changes in the source data sheet, one just has to go to Data > Refresh All
4. PowerPivot - This is a dynamic solution.  For any changes in the source data sheet, one just has to go to Data > Refresh All

Consider a simple dataset as shown below:

For each tier, the tier rate is incrementally applied to the volume within the tier volume range.  Given the following transaction volumes, one may want to compute the transaction fee

The expected result is shown below

As one can observe, for a transaction value of 400,000, the fee has been computed as 3% on the first 100,000 and 2.5% on the next 300,000. You may download my solution workbook from here. In the file, I have shared 2 solutions - a conventional formula based one and a PowerPivot solution.

I have also solved a similar question here.

Assume a three column dataset with Patient ID, Smoking Status and Review Date

 PatientID SmokingStatus ReviewDate P1 10-03-2018 P1 9 09-03-2018 P1 1 08-03-2018 P1 4 07-03-2018 P2 9 10-03-2018 P2 9 09-03-2018 P2 9 08-03-2018 P2 9 07-03-2018 P3 2 10-03-2018 P3 09-03-2018 P3 9 08-03-2018 P4 9 10-03-2018 P4 1 09-03-2018 P4 4 08-03-2018

The objective is the create another 3 column dataset with the following conditions:

1. If the patient's latest smoking status is other than Blank or 9, then consider that as the smoking status of the patient; and
2. If the patient's latest smoking status is blank or 9, then consider the previous smoking status that is not blank or 9; and
3. If the patient's smoking status is blank or 9 on all dates, then consider the smoking status as 9

The expected result is:

 PatientID Last date when the smoking status was other than 9 or Blank Smoking status on that date P1 08-Mar-18 1 P2 10-Mar-18 9 P3 10-Mar-18 2 P4 09-Mar-18 1

I have solved this question using 3 methods - PowerPivot, Advanced Filters and formulas.  You may download my solution workbook from here.

Here is a simple three column dataset showing Item ID, Catalogue Name and Catalogue Year

 Item ID Catalogue Name Catalogue Year A Spring 2012 A Spring 2016 A Spring 2017 B Fall 2014 B Fall 2014 B Fall 2016

The second and third last rows are clear duplicate rows. The task is to create a fourth column (titled as All Year) in this dataset with concatenated entries from the Catalogue Year column for each Item ID. So the expected result is:

 Item ID Catalogue Name Catalogue Year All Year A Spring 2012 2012, 2016, 2017 A Spring 2016 2012, 2016, 2017 A Spring 2017 2012, 2016, 2017 B Fall 2014 2014, 2016 B Fall 2014 2014, 2016 B Fall 2016 2014, 2016

Notice that for Item ID B, the result in every cell of the fourth column is 2014, 2016 and not 2014,2014,2016.  So while joining (concatenating in technical MS Excel terminology) entries from the "Catalogues Year" column in the "All Year" column, only unique entries for that Item ID have to be considered.

Excel 2016 has a TEXTJOIN() function which gets use close to the final result but does not take into account the unique entries only (it takes all entries into account all entries).  On using the TEXTJOIN() array formula in a spare column, here is the result i got:

 Item ID Catalogue Name Catalogue Year All Year A Spring 2012 2012, 2016, 2017 A Spring 2016 2012, 2016, 2017 A Spring 2017 2012, 2016, 2017 B Fall 2014 2014, 2014, 2016 B Fall 2014 2014, 2014, 2016 B Fall 2016 2014, 2014, 2016

Notice that for Item ID B, the result is 2014, 2014, 2017 and not 2014, 2017.  So the TEXTJOIN() function does not work.

I have solved this problem using Power Query (Get and Transform in Excel 2016).  You may download my solution workbook from here.

Imagine a multi column exam invigilation schedule with the following information

1. S. No.
2. Name of staff
3. Designation of staff member
4. Two columns for each day on which there is an exam - one for Morning and another for Afternoon
5. A * under each column if that particular staff member has to be an invigilator during that time period
 S. No. Name of staff Designation 09/11/2017-Morning 09/11/2017-Afternoon 10/11/2017-Morning 10/11/2017-Afternoon 1 Tom Lecturer * * * 2 Sam Lecturer * *

This dataset stretches into many more columns.  So if the exams last for 15 days, there will be 30 columns.  The objective is to condense the column expanding dataset into a 5 column one - S. No., Name of Staff, Designation, Morning and Evening.  Under the Morning and Evening columns, the different dates have to be separated by commas in that one single cell.  The expected result is:

 Name of staff S. No. Designation Morning Afternoon Tom 1 Lecturer 09/11/2017, 10/11/2017, 11/11/2017, 13/11/2017, 14/11/2017, 16/11/2017, 17/11/2017 10/11/2017, 13/11/2017, 14/11/2017, 15/11/2017, 16/11/2017, 17/11/2017 Sam 2 Lecturer 09/11/2017, 13/11/2017, 15/11/2017, 16/11/2017 09/11/2017

I can think of 3 ways to solve this problem. Here's a brief about each of those methods:

1. Using Power Query and Excel's functions - Using Power Query, one can first unpivot the data and then use the TEXTJOIN() function.  This would be an array formula.
2. Using Power Query only - Using Power Query, one can first unpivot the data and then modify the code to concatenate all dates in a single cell for that invigilator.  One can then Pivot the data in Power Query itself
3. Using Power Query and PowerPivot - Using Power Query, one can first unpivot the data and then use the CONCATENATEX() function of the PowerPivot.

As you can observe, the common thread in all solutions is that one has to first unpivot the dataset.

You may refer to all my 3 solutions in this workbook.

Here is a simple matrix like data layout which shows the features available in every product.  Let's assume that this data is in range A2:E8 (including the header row)

 Functions Product A Product B Product C Product D Function 1 x Function 2 x x Function 3 x x x Function 4 x x Function 5 x x Function 6 x

Here is a user created input table of his/her requirements.  Let's assume that this data is in range A12:B18 (including the header row)

 Functions Input Function 1 Function 2 x Function 3 x Function 4 Function 5 x Function 6

The expected result is the "Product" which meets the user defined function combinations. The result should be Product B. If there are different products which meet the user's requirements, only the first one will be returned as a result. If one would like all products to be returned, the one can use Power Query to resolve this problem.

=IFERROR(INDEX(\$B\$2:\$E\$2,1,MATCH(COUNTA(\$A\$13:\$A\$18),MMULT(1*(TRANSPOSE(B3:E8=B13:B18)),1*(ISNUMBER(ROW(INDIRECT("1:"&COUNTA(\$A\$3:\$A\$8)))))),0)),"No such product")

Please note that this is an array formula so please confirm the formula with Ctrl+Shift+Enter.

Imagine a fixed monthly amount due to an Organisation for services rendered to various customers.  While an invoice is raised every month by this Organisation, not all pay up the dues on time.  For unpaid dues, the Organisation charges its client interest ranging from 3% to 9% per annum.  The objective is to determine cumulative interest payable by various customers to Organisation X.

The base data looks like this

 Client Monthly revenue Int. calculation start date Int. calculation end date Interest rate Client A 33,967 01-Aug-16 25-Jul-17 9.00% Client B 123 12-Sep-16 30-Nov-17 4.00%

Given the dataset above, the total interest payable by Client A is Rs. 16,237.20.  The calculation is shown below:

 From To Days for which interest should be paid Principal Interest 02-Aug-16 31-Aug-16 328.00 33,967.00 2,745.26 01-Sep-16 30-Sep-16 298.00 33,967.00 2,494.17 01-Oct-16 31-Oct-16 267.00 33,967.00 2,234.71 01-Nov-16 30-Nov-16 237.00 33,967.00 1,983.62 01-Dec-16 31-Dec-16 206.00 33,967.00 1,724.16 01-Jan-17 31-Jan-17 175.00 33,967.00 1,464.70 01-Feb-17 28-Feb-17 147.00 33,967.00 1,230.34 01-Mar-17 31-Mar-17 116.00 33,967.00 970.88 01-Apr-17 30-Apr-17 86.00 33,967.00 719.79 01-May-17 31-May-17 55.00 33,967.00 460.33 01-Jun-17 30-Jun-17 25.00 33,967.00 209.24 01-Jul-17 25-Jul-17 - 33,967.00 - Total 16,237.20

You may download my solution workbook with from here. I have solved this problem using normal Excel formulas and the PowerPivot.

Here is a sample dataset of learners who have cleared different stages of multiple courses on offer within an Organisation:

 Learner Stage completed Course Bill Stage 1 Public Speaking Bill Stage 2 Public Speaking Bill Stage 3 Public Speaking Susan Stage 1 Effective Communication Bob Stage 1 Public Speaking Bob Stage 2 Public Speaking Sheila Stage 1 Effective Communication Sheila Stage 2 Effective Communication Sheila Stage 3 Effective Communication Frank Stage 1 Effective Communication Frank Stage 2 Effective Communication Henry Stage 1 Public Speaking Henry Stage 2 Public Speaking Bill Stage 1 Effective Communication Bill Stage 2 Effective Communication

From this sample dataset, one may want to know how many participants have completed each stage of these multiple courses.  The expected result is shown below:

 Row Labels Stage 1 Stage 2 Stage 3 Effective Communication 1 2 1 Public Speaking 2 1 Grand Total 1 3 2

In this workbook, I have shared 2 solutions - one using formulas and the other using the Power Query & PowerPivot.

Refer to a simple 5 column representative inventory dataset of a Glass manufacturer:

 Model Length (MM) Wide (MM) Thk (MM) CAT HX9-G-ARD 1071 273 3.5 A MYP-G-3RD 580 535 3.2 B EPO-G-3RD 580 535 3.2 A MYG-G-3R 966 350 3.2 A MYN-G-3RD 649 530 3.2 A GM SPIN-G-3FD 882 395 3.2 A MY8-G-AR 880 400 3.5 B GM2-G-AR 880 400 3.5 A

From this inventory data, one has to furnish customer orders based on specific dimensions demanded by them.  A typical Customer request would be to supply glass sheets as per the following dimensions

 Length (MM) Wide (MM) Thk (MM) CAT 780 542 3.5 A

The firm may or may not have glass sheets of this specific size.  The objective is to identify glass sheets, from the inventory on hand, which match customer specifications.  If there is no exact match, then one must be able to obtain all inventory items which have the same Thk (MM) and CAT as the customer specified dimensions but the Length and thickness should be more than equal to the customer specified dimensions.  The length and width can then be trimmed to match the exact customer dimensions.  Furthermore, the result returned should:

1. List only the Top 30 glass sheets available in inventory; and
2. List those Top 30 glass sheets in ascending order of wastage (wastage caused when the glass sheet is trimmed to match the customer specified dimensions)

You may refer to my solution in this workbook.  I have shared two solutions - one using Excel formulas and the other using Power Query a.k.a. Get and Transform in Excel 2016.  Please read the Comments in cells F1, J9 and J16 of the "Solutions" worksheet.  The difference between the 2 solutions is:

1. Formula driven solution - This is in range J10:AM14 of the Solutions worksheet.  This is a semi dynamic solution (as compared to the Power Query solution).  To get the models in ascending order of wastage, one will have to create an Area column in the base data and sort that column in ascending order.
2. Power Query solution - This is in range J17:AM21 of the Solutions worksheet.  This is a dynamic solution.  Just change the customer specified dimensions in range G2:J2 of the Data and Query worksheet.  Thereafter just right click on any cell in the range below and select refresh.