# Tags: VALUES

Imagine a dataset like this.  This dataset shows vendors that submitted proposals for supplying various parts to a Company.  There is one column for each of the twelve months.

Via a simple Pivot Table, one can determine the lowest bidding vendor per product (part) for any chosen month.  However, one may also want to know the names of those vendors for each product (as seen in column G below).  Notice, that Vendor 2 and Vendor 3 submitted the lowest bid for Product 1 and therefore both names should appear in the result.

I have solved this problem using PowerPivot and Power Query a.k.a. Data > Get & Transform in Excel 2016.  You may download my solution workbook from here.

Refer to this simple Sales dataset

The objective is to create a simple matrix with months in the row labels, years in the column labels and sales figures in the value area section.  The twist in the question is that for years prior to the current year (2018 in this dataset), sales should only appear till the month for which there is data for the current year.  For e.g., for 2018, data is only till Month 4 and therefore for prior years as well, data should only appear till Month 4.  As and when Sales data gets added below row 17, data for prior years should also go up to that month.

The expected result is

You may download my PBI file from here. The same solution can be obtained in Excel as well (using Power Query and PowerPivot).

This table contains a list of all the inspections created and completed within different time periods.

The objective is to create two Tables from this single table - one showing the Accounts created within the chosen time period and another showing the those that were closed within the same time period.  Here are screenshots of the expected results.

You may download my PowerBI desktop solution workbook from here.  The same solution can be obtained in Excel as well (using Power Query and PowerPivot).

Consider a simple dataset as shown below:

For each tier, the tier rate is incrementally applied to the volume within the tier volume range.  Given the following transaction volumes, one may want to compute the transaction fee

The expected result is shown below

As one can observe, for a transaction value of 400,000, the fee has been computed as 3% on the first 100,000 and 2.5% on the next 300,000. You may download my solution workbook from here. In the file, I have shared 2 solutions - a conventional formula based one and a PowerPivot solution.

I have also solved a similar question here.

Visualise a 5 column dataset as show below.  This is a very small sample of the actual dataset.  It shows the date on which supplies were received for each item from Vendors and whether those supplies had errors in them.  Finally those identified errors have been bucketed into relevant categories.  The Item ID# is a code assigned by the Client Organisation.  The format of the ID# is X [Last two digits of year of receipt] [Four digit Number].  For items received from a certain vendor, the last four digits of any item will always be in ascending order of the Date received.  In other words. for items received from a certain vendor, the last 4 digits of an item received on 13 January 2017 will always be greater than the last 4 digits of an item received on 11 January 2017 from the same vendor.  There will never be repetitions in the Item ID# column.

 Vendor Name Item ID# Date Received Error with Item? Error Category California X170016 1-16-2017 No California X170014 1-13-2017 Yes Labeling Error California X170015 1-13-2017 Yes Packaging Error California X170008 1-9-2017 Yes Quality Issue California X170003 1-2-2017 No California X160645 12-26-2016 Yes Packaging Error California X160646 12-26-2016 No California X160644 12-25-2016 Yes Labeling Error California X160638 12-20-2016 Yes Quality Issue California X160633 12-15-2016 No California X160626 12-8-2016 No California X160625 12-7-2016 Yes Packaging Error California X160624 12-5-2016 Yes Labeling Error California X160618 11-23-2016 Yes Quality Issue California X160613 11-13-2016 No California X160606 10-30-2016 No

The objective is to compute the error rate by vendor and Error category for the 10 most recent transaction dates with that specific vendor.  So, for vendor Name "California" and  Error category as "Packing Error", this ratio should be computed as = Number of packing Errors on 10 most recent dates/10.

Here is a simple snapshot of the Data for California.  I have filtered the dataset where Vendor Name is California and then sorted the Date received column in descending order.  Please note that when i filter the dataset on California, a lot more rows are returned.  I am only showing the Top 10 rows here because that is what is important for solving this question.

 Vendor Name Item ID# Date Received Error with Item? Error Category California X170016 1-16-2017 No California X170014 1-13-2017 Yes Labeling Error California X170015 1-13-2017 Yes Packaging Error California X170008 1-9-2017 Yes Quality Issue California X170003 1-2-2017 No California X160645 12-26-2016 Yes Packaging Error California X160646 12-26-2016 No California X160644 12-25-2016 Yes Labeling Error California X160638 12-20-2016 Yes Quality Issue California X160633 12-15-2016 No

The expected result is:

 Row Labels Labeling Error Packaging Error Quality Issue Factory Error Boise 30.00% California 20.00% 20.00% 20.00% Detroit 70.00% 30.00% Ekalaka 20.00% 20.00%

I have solved this problem using the PowerPivot. You may download my solution workbook from here.

Here is a simple 3 column dataset showing Categories, Date and Value

 Catagorie Date Value Fish 08-12-2015 6 Crab 05-12-2015 7 Crab 04-12-2015 6 Bird 27-11-2015 4 Snow 25-11-2015 10 Cat 21-11-2015 7 Dog 12-11-2015 5 Dog 28-10-2015 5 Fish 12-10-2015 3 Bird 11-10-2015 9 Dog 22-09-2015 9 Crab 17-08-2015 8 Cat 11-08-2015 1 Fish 20-07-2015 5 Crab 03-07-2015 2 Fish 02-06-2015 8 Fish 26-05-2015 9 Dog 14-05-2015 4 Snow 07-05-2015 7 Bird 03-05-2015 9 Cat 20-04-2015 10 Cat 15-04-2015 2 Snow 13-04-2015 3 Crab 29-03-2015 5 Crab 23-03-2015 2 Bird 14-03-2015 5 Cat 14-03-2015 1 Dog 26-02-2015 9 Fish 16-02-2015 4 Fish 08-02-2015 6 Bird 18-01-2015 1 Snow 06-01-2015 10

The objective is to compute category wise average of values against the 5 most recent dates.  So the expected result is:

 Categorie Category wise average values on 5 most recent dates Bird 5.60 Cat 4.20 Crab 5.60 Dog 6.40 Fish 6.20 Snow 7.50

So for the Category of Fish (dates are sorted in descending order), the average should be 6.2 (average of 6,3,5,8,9)

 Catagorie Date Value Fish 08-12-2015 6 Fish 12-10-2015 3 Fish 20-07-2015 5 Fish 02-06-2015 8 Fish 26-05-2015 9 Fish 16-02-2015 4 Fish 08-02-2015 6

I have solved this problem using the PowerPivot. You may download my solution workbook from here.

Here's a simple 3 column table showing Date, Project name (Cat.) and Status of the project.  Each project can have multiple status entries on different dates.  So as you can observe, project "alpha_9383993" was In Progress on Oct 2, 2017, remained so on October 5, 2017 but was completed on October 6, 2017.

 Date Cat. Status 02-Oct-17 alpha_9383993 In Progress 03-Oct-17 Pulse_9387388 In Progress 04-Oct-17 Pulse_9387388 Rework 05-Oct-17 alpha_9383993 In Progress 06-Oct-17 alpha_9383993 Completed 07-Oct-17 Pulse_9387388 Completed 08-Oct-17 Oppo_tes_9383 In Progress 09-Oct-17 Oppo_Max_8977 Rework

The objective is to determine the count of projects by Status as per the most recent status of every project.  So the expected result is:

 Row Labels measure 2 Completed 2 In Progress 1 Rework 1

The result for In Progress should be one because there is only one such project - Oppo_tes_9383.  Project alpha_9383993 should not be counted because it was completed on October 6, 2017.  Likewise the result for Rework should be one because there is only one such project - Oppo_Max_8977.  Project Pulse_9387388 should not be counted because it was completed on October 7,2017.

I have solved this problem with the PowerPivot.  You may download my solution workbook from here.

Here's a simple two column dataset

 Comment Identifier Intervals A 3pm-6pm A 9pm-12pm S 3pm-6pm S 3pm-6pm S 9pm-12pm A 9pm-12pm S 9pm-12pm D 3pm-6pm A 9pm-12pm A 9pm-12pm A 9pm-12pm A 3pm-6pm A 3pm-6pm

For identifiers listed in column A, there are time intervals in column B. Note that for a certain identifier, a time interval can appear multiple times. The objective is two-fold:

1. For each identifier, show the time interval which appears most frequently; and
2. For each identifier, compute the count of the time interval which appears most frequently

The expected result is:

As you can observe, the result for S is two time periods - 3pm-6pm and 9pm-12pm.  This is because each of them appears twice.

Here is a simple two column dataset which shows EmpID in column A and DateWorked in column B.  My objective is to filter the table to show those employees who worked ONLY on August 17 and August 18.  I'd like to exclude employees who:

1. Did not work on both these days; and
2. Worked on both these dates but also worked on other dates

A simple filter on the DatesWorked column with the criteria of August 17 and August 18, would return 7 employees.  This is clearly wrong because this will show employees who:

1. Worked on any of the two chosen dates; and
2. Worked on dates other than the two chosen dates

From Table 1, one can clearly observe that the correct result should be EmpID E.  Only EmpID E worked on August 17 and 18.  While others also worked on both these dates, they worked on other dates as well and should therefore be excluded.

The problem outlined above holds true for selections made via a slicer in a Pivot Table as well.  So this is a generic problem with a filter/slicer - conditions specified therein behave like OR conditions, not like AND conditions.

Base dataset

 EmpID DateWorked A 14-08-2017 A 15-08-2017 A 16-08-2017 A 17-08-2017 A 18-08-2017 B 15-08-2017 B 16-08-2017 B 18-08-2017 C 14-08-2017 C 15-08-2017 C 16-08-2017 C 17-08-2017 C 18-08-2017 D 14-08-2017 D 15-08-2017 D 16-08-2017 D 17-08-2017 D 18-08-2017 E 17-08-2017 E 18-08-2017 F 16-08-2017 F 17-08-2017 G 18-08-2017

Filtered dataset (after specifying the criteria as 17 Aug and 18 Aug)

 EmpID DateWorked A 17-08-2017 A 18-08-2017 B 18-08-2017 C 17-08-2017 C 18-08-2017 D 17-08-2017 D 18-08-2017 E 17-08-2017 E 18-08-2017 F 17-08-2017 G 18-08-2017

So the question is "How does one change the behavior of a filter/slicer from OR to AND?". To make things worse, the data is arranged as follows:

 EmpID DateWorked A 14/8/17;15/8/17;16/8/17;17/8/17;18/8/17 B 15/8/17;16/8/17;18/8/17 C 14/8/17;15/8/17;16/8/17;17/8/17;18/8/17 D 14/8/17;15/8/17;16/8/17;17/8/17;18/8/17 E 17/8/17;18/8/17 F 16/8/17;17/8/17 G 18-08-17

As you can see, there are multiple dates in a single cell separated by a ;.

You may refer to my Power Query and PowerPivot solution here.

Assume a dataset with two columns which lists down the student names in column A and courses opted for in column B.  Since one student can opt for multiple courses and the same course can be taken up by multiple students, there can be repetitions in both columns.  The objective is to create a matrix like data structure (with courses appearing in both row and column labels) with numbers inside the matrix quantifying the "Number of students who opted for course A and C".  So, for all possible course combinations, one may want to know the number of students who opted for those combinations.

The description above can be extended to cases where buying behavior has to be analysed.  A sore manager may want to know "How many people who buy Brand A also buy Brand B."

Here's a snapshot of the source data and expected result

The number 1 in cell H4 (and cell F6) means that there is only one student who opted for courses B and D.  Likewise, 3 in cell H5 (and cell G6) means that 3 students opted for courses C and D.

You may refer to my Power Query and PowerPivot solution in this workbook.  Power Query has been used for generating a dynamic list of Courses and Power Pivot has been used for writing the DAX formula for quantifying within the matrix.