Tags: MATCH

Return the specific product which satisfies the user defined feature combination

{2 Comments}

Here is a simple matrix like data layout which shows the features available in every product.  Let's assume that this data is in range A2:E8 (including the header row)

Functions Product A Product B Product C Product D
Function 1 x
Function 2 x x
Function 3 x x x
Function 4 x x
Function 5 x x
Function 6 x

Here is a user created input table of his/her requirements.  Let's assume that this data is in range A12:B18 (including the header row)

Functions Input
Function 1
Function 2 x
Function 3 x
Function 4
Function 5 x
Function 6

The expected result is the "Product" which meets the user defined function combinations. The result should be Product B. If there are different products which meet the user's requirements, only the first one will be returned as a result. If one would like all products to be returned, the one can use Power Query to resolve this problem.

=IFERROR(INDEX($B$2:$E$2,1,MATCH(COUNTA($A$13:$A$18),MMULT(1*(TRANSPOSE(B3:E8=B13:B18)),1*(ISNUMBER(ROW(INDIRECT("1:"&COUNTA($A$3:$A$8)))))),0)),"No such product")

Please note that this is an array formula so please confirm the formula with Ctrl+Shift+Enter.

Determine number of learners who have completed different stages of multiple online courses

{4 Comments}

Here is a sample dataset of learners who have cleared different stages of multiple courses on offer within an Organisation:

Learner Stage completed Course
Bill Stage 1 Public Speaking
Bill Stage 2 Public Speaking
Bill Stage 3 Public Speaking
Susan Stage 1 Effective Communication
Bob Stage 1 Public Speaking
Bob Stage 2 Public Speaking
Sheila Stage 1 Effective Communication
Sheila Stage 2 Effective Communication
Sheila Stage 3 Effective Communication
Frank Stage 1 Effective Communication
Frank Stage 2 Effective Communication
Henry Stage 1 Public Speaking
Henry Stage 2 Public Speaking
Bill Stage 1 Effective Communication
Bill Stage 2 Effective Communication

From this sample dataset, one may want to know how many participants have completed each stage of these multiple courses.  The expected result is shown below:

Row Labels Stage 1 Stage 2 Stage 3
Effective Communication 1 2 1
Public Speaking 2 1
Grand Total 1 3 2

In this workbook, I have shared 2 solutions - one using formulas and the other using the Power Query & PowerPivot.

Return best possible fit, to manually entered dimensions, with the intent to minimise wastage

{5 Comments}

Refer to a simple 5 column representative inventory dataset of a Glass manufacturer:

Model Length (MM) Wide (MM) Thk (MM) CAT
HX9-G-ARD 1071 273 3.5 A
MYP-G-3RD 580 535 3.2 B
EPO-G-3RD 580 535 3.2 A
MYG-G-3R 966 350 3.2 A
MYN-G-3RD 649 530 3.2 A
GM SPIN-G-3FD 882 395 3.2 A
MY8-G-AR 880 400 3.5 B
GM2-G-AR 880 400 3.5 A

From this inventory data, one has to furnish customer orders based on specific dimensions demanded by them.  A typical Customer request would be to supply glass sheets as per the following dimensions

Length (MM) Wide (MM) Thk (MM) CAT
780 542 3.5 A

The firm may or may not have glass sheets of this specific size.  The objective is to identify glass sheets, from the inventory on hand, which match customer specifications.  If there is no exact match, then one must be able to obtain all inventory items which have the same Thk (MM) and CAT as the customer specified dimensions but the Length and thickness should be more than equal to the customer specified dimensions.  The length and width can then be trimmed to match the exact customer dimensions.  Furthermore, the result returned should:

  1. List only the Top 30 glass sheets available in inventory; and
  2. List those Top 30 glass sheets in ascending order of wastage (wastage caused when the glass sheet is trimmed to match the customer specified dimensions)

You may refer to my solution in this workbook.  I have shared two solutions - one using Excel formulas and the other using Power Query a.k.a. Get and Transform in Excel 2016.  Please read the Comments in cells F1, J9 and J16 of the "Solutions" worksheet.  The difference between the 2 solutions is:

  1. Formula driven solution - This is in range J10:AM14 of the Solutions worksheet.  This is a semi dynamic solution (as compared to the Power Query solution).  To get the models in ascending order of wastage, one will have to create an Area column in the base data and sort that column in ascending order.
  2. Power Query solution - This is in range J17:AM21 of the Solutions worksheet.  This is a dynamic solution.  Just change the customer specified dimensions in range G2:J2 of the Data and Query worksheet.  Thereafter just right click on any cell in the range below and select refresh.

Generate a list of assignees for different projects based on a competency matrix

{0 Comments}

Here is a small sample of a Project matrix which shows tasks to be accomplished for various projects.  There can only be upto 6 tasks per project.

Project Name Task1 Task2 Task3 Task4 Task5 Task6
Project1 Painting Chef Gardener
Project2 Tiling Digging Engineering
Project3 Mechanic Engineering

Here is a competency matrix showing the competencies of employees on different tasks.  1 indicates that the employee is competent to perform that task.

Task Tom Jane Mary Paddy Lynda
Painting 1 1 1 1 1
Tiling 1 1 1 1 1
Plastering 1 1 1 1 1
Digging 1 0 1 1 1
Mechanic 1 1 1 0 1
Detective 1 1 1 1 1
Engineering 1 1 0 1 1
Boxer 1 0 1 1 1
Chef 1 1 1 1 1
Gardener 1 1 0 1 1
Banker 1 1 1 1 0

From these two tables, one may want to generate another table showing which employees can be assigned to which project (only those employees should be assigned to a project who can complete all tasks).  So the ideal solution is to create another column (8th column) in the Project matrix table above which should have a drop down (Data > Data Validation) for every project showing which employees are competent for that project.

Here's an illustration:

Assuming that the Project matrix is in range A1:G4 (headers are in row 1)

  1. In cell H2 (for Project1), the drop down should show Jane, Lynda, Paddy and Tom.  Mary should not appear there because she cannot perform one of the 3 tasks required to complete the project i.e. Gardener.
  2. In cell H3 (for Project2), the drop down should show Lynda, Paddy and Tom.  Jane and Mary should not appear there because they cannot perform the Digging and Engineering tasks respectively.

The solution is dynamic for the following:

  1. Projects added to the Project matrix Table; and
  2. Tasks added (upto 6 only) or edited in the Project matric Table; and
  3. Employees added to the Competency matrix Table; and
  4. Tasks added to the Competency matrix Table

I have solved this problem by using:

  1. Power Query; and
  2. Formulas in Data > Data Validation.

You may download my solution workbook from here or here.

Compute potential Sales of a retail outlet

{8 Comments}

The objective is to assist a Store Manager with computing potential sales across different products and colours.  To start with let's assume two datasets:

1. Customer-Colour dataset - a two column table which lists down the colour preference of each customer; and
2. Colour-Product-Price dataset - a three column table which lists down the multiple products associated with each colour and the associated prices

My objective is to compute the "Potential Sales" if each customer buys one unit of each colour of each product.  While this can be resolved by using formulas (see Formula solution worksheet of the workbook), I'd like to resolve this problem with the help of the Power Query and PowerPivot tools.

The initial challenge will be to establish a relationship between the two tables because one cannot establish Many to Many relationships in a PowerPivot.  The two tables above are a perfect example of Many to Many relationships because each customer likes multiple colours and each colour is associated with multiple products.

You may download my solution workbook from this link.

You may also view a short video of my solution here:

Extract City, State and Pin code from an address string

{40 Comments}

For a lot of Excel users, a common problem is to extract City, State and Pin Code from an address string.  What compounds the issue is that there is no standardisation in an address string.  For e.g., one may end the address with a Pin code, while others may end it with a State and Country.  Some other variations could be:

1. End the address with Contact Numbers
2. There are no spaces between City, State and Pin code

There are of course many other variations which are possible and the sheer number of these variations makes it difficult to list down all of them.

Here is an example of address strings from where the State, City and Pin code need to extracted in three columns:

1. F-45, Pocket 1, Sector 198, Noida - 201303, Uttar Pradesh, India
2. S-45/B, Pocket 1-C, Phase II, Sector 189, Gurgaon, Haryana - 122002
3. RR-45/B, Pocket II-S, Cross 2, Noida - 201303, India
4. T-45, Sector 198, Lucknow - 226001, Uttar Pradesh, India
5. V-45(A), Sector 193C,Allahabad-211002 Uttar Pradesh India
6. V-45(A), Sector 193C,Allahabad-211002Uttar PradeshIndia

You may refer to my solution in this workbook.

Prioritise investment liquidation to minimise Capital Gains

{2 Comments}

Assume a five column dataset with the Fund Name in column A, Market Value in column B, Gain/Loss in column C, Gain/Loss % in column D [column C/column B] and Type of investment in column E (Short term or Long term).  Since multiple investments can be made in one fund, there can be multiple line items for with the same Fund Name (with different market values).  Let's call this dataset the Investment schedule.

The objective is to determine the following:

a. Minimum total gain realised from sale of one/multiple funds; and
b. Short term gain present in a above; and
c. Total Long term gain realised from sale of one/multiple funds; and
d. If Long term investments fall for liquidating the particular fund, then how much is the gain/loss by forcible liquidating the short term investments.

You may refer to my solution in this workbook.

Compute configuration count using Set Theory and Venn Diagrams

{0 Comments}

Assume a table which lists attendees for a Company's Annual day function.  In this Table, data for every attendee is shown on a separate row so if an employee attends the function with his/her spouse and three children, then there will be 5 rows for that employee.

The question is to determine the count of the following family configuration:

1. Employees only (those who attended without spouse and children); and
2. Employees, spouse and children (Family); and
3. Employees and spouse (no children); and
4. Employees and children (no spouse)

You may refer to my solution in this workbook.  I have solved this problem using:

1. MS Excel Formulas based on Set Theory and Venn Diagram; and
2. PowerPivot

Create a Pivot Table from multiple individual ranges without using ancillary columns

{16 Comments}

Assume that there are three separate tables showing the following information:

1. Date of visit data for visitors to a certain recreation facility.  The same visitor may visit the facility multiple times
2. Fee per visit in different cites
3. Region in which each city lies

The question is to analyse the three tables above via a Pivot Table to generate the following:

1. Region wise and visitor wise:
a.  Fee per visit
b.  Frequency of visit
c.  Revenue
2. Revenue collected by month wise and by visitor

In this workbook, I have shared two solutions:

1. PowerPivot solution - This solution answers both questions above without using any calculated/ancillary columns in the base data.  By establishing simple relationships in the PowerPivot window and by writing two calculated Field formulas, both questions above have been answered.  To use this PowerPivot solution, you need to be using the PowerPivot add-in for MS Excel.  This add-in is only available for Excel 2010 and higher versions.

2. Pivot Table solution - This solution answers both questions above by using calculated/ancillary columns in the base data.

Here's another example.  Assume a four column table showing Date of session, Client, Location and Participant Name.  Assume another four column table showing Client, Date of session, Date of invoice and Amount Billed.  The task is to determine the Amount billed per location.  You may refer to my PowerPivot and Pivot Table solution in this workbook.

Perform an iterative sum of Top n values across multiple columns

{4 Comments}

A tournament has 18 participating teams with 25 players in each team.  Each team has to play five rounds of the Tournament and not all players play all rounds.  Scores earned by each player in each round are shown in individual cells.  If a player does not play a round, that cell is left empty.

The task is to sum the highest 18 scoring players for each round.  Only the highest 18 players per team count towards the teams score.  If few of the players have the same score at position 18 then only one of them should be included in the overall score.

One solution is to sort each round of scores for each team in descending order and sum the highest 18 values.  This is obviously a time consuming process.

There could be two others ways to solve this problem

Formula driven solution - This uses a spare column, a lengthy formula and the Data > Table functionality.  Since Data > Table is a series of array formula, this solution makes the workbook very sluggish.  You may refer to my solution in this workbook.

PowerPivot solution - This solution is far better than the formula driven one in as much as no spare columns, lengthy formulas or Data > Tables have been used.  The solution in this workbook adds the scores of the highest 18 scoring players per round (If few of the players have the same score at position 18 then all scores are included in the overall score).  To use this PowerPivot solution, you need to be using the PowerPivot add-in for MS Excel.  This add-in is only available for Excel 2010 and higher versions.