Tags: MATCH

Here is a simple 5 column dataset showing basic employee information.

The objective is to show only those rows of data in which the Score (column D) is greater then 3.  While one can solve this with a simple filter, the solution will not be dynamic.  To get a dynamic solution, one may use the FILTER() dynamic array function that comes with the Microsoft 365 subscription service.

In cell G2, one may simply write this formula

=FILTER(A2:E9,D2:D9>3)

This is a far better solution because it is formula driven and thereby dynamic.  So all good till here.  Now let's make it a little interesting.

The revised objective is to show only those rows of data in which the Score (column D) is greater then 3 and only display 3 columns - Name, Department and Score (columns 1,3 and 4) in the end result.  This can be solved using Data > Advanced filter but the result will be static.  To get a dynamic solution, one may use a nested FILTER() function in cell G2

=FILTER(FILTER(A2:E9,D2:D9>3),{1,0,1,1,0})

This formula will return the same number of rows (3 rows) as the previous FILTER() function returned with only 3 columns - Name, Department and Score.  The 1's and 0's in the formula denote whether one would like to see the particular column in the end result or not.  So once again, all good till here.

Now, let's take it a notch higher.  What is there were 35 columns in the original dataset and one wanted to see data appearing in columns 1,6,22,25,29 and 34.  It will be quite time consuming to enter the 1's and 0's in order within the FILTER() function.  So the question here is how does one save time and effort?  Ideally one should be able to just enter the column headings one wants to see in the end result.

As you can observe in the image above, one has to simply specify the columns to extract in column C and the result populates from column K rightwards and downwards.  Using dynamic array formulas and the FILTER() function, one saves effort in entering 1's and 0's in the FILTER() function (the FILTER() function has been written in cell K2 - download link of the file is shared below).  If one types Address in cell I5, then Address would automatically appear in cell N1 and so will the entries in range N2:N4.  So this does seem like a good solution.  So while it is good, it is not a perfect solution.  In column I, if one changes the order of the headings i.e. one enters Name, Dept, Address and Score (rather than Name, Dept, Score and Address), the result under columns M and N would be incorrect (see image below).

Solving this challenge led me to using Data > Get & Transform.  I made use of the latest feature introduced in Power Query called "Data Types" (I received this feature update on December 4, 2020).

As one can see in the image above, one simply has to select any heading one wants and that appears in the next available column.

You may download my solution workbook from here in which i have shown both formula based and the Power Query solution.

Visualise a 3 column dataset as shown below - Location, Product and Sales.  Each location can have multiple products (Product A has Banana, Apple and Carrot) and each product can be sold in multiple locations (Banana is sold in locations A, B and F).

The objective is to determine the location with highest sales for each product.  So for Banana, maximum sale value is 25 and location of maximum sales value is B.  Likewise for Orange, maximum sales value is 49 and location of maximum sales value is A.  The expected result is:

I have 4 solutions to this problem:

1. Advanced Filters - This is a static solution.  For any changes in the source data range, one will have to re-enter the 3 inputs in the Advanced Filter window
2. Formulas - This is a semi-dynamic solution.  To make it fully dynamic, one will have to write an array formula to first extract all unique product names in a column.  The array formula to extract product names in a column can be obtained from here.
3. Power Query - This is a dynamic solution.  For any changes in the source data sheet, one just has to go to Data > Refresh All
4. PowerPivot - This is a dynamic solution.  For any changes in the source data sheet, one just has to go to Data > Refresh All

You may download my solution workbook from here.

Here is a simple matrix like data layout which shows the features available in every product.  Let's assume that this data is in range A2:E8 (including the header row)

 Functions Product A Product B Product C Product D Function 1 x Function 2 x x Function 3 x x x Function 4 x x Function 5 x x Function 6 x

Here is a user created input table of his/her requirements.  Let's assume that this data is in range A12:B18 (including the header row)

 Functions Input Function 1 Function 2 x Function 3 x Function 4 Function 5 x Function 6

The expected result is the "Product" which meets the user defined function combinations. The result should be Product B. If there are different products which meet the user's requirements, only the first one will be returned as a result. If one would like all products to be returned, the one can use Power Query to resolve this problem.

=IFERROR(INDEX(\$B\$2:\$E\$2,1,MATCH(COUNTA(\$A\$13:\$A\$18),MMULT(1*(TRANSPOSE(B3:E8=B13:B18)),1*(ISNUMBER(ROW(INDIRECT("1:"&COUNTA(\$A\$3:\$A\$8)))))),0)),"No such product")

Please note that this is an array formula so please confirm the formula with Ctrl+Shift+Enter.

Here is a sample dataset of learners who have cleared different stages of multiple courses on offer within an Organisation:

 Learner Stage completed Course Bill Stage 1 Public Speaking Bill Stage 2 Public Speaking Bill Stage 3 Public Speaking Susan Stage 1 Effective Communication Bob Stage 1 Public Speaking Bob Stage 2 Public Speaking Sheila Stage 1 Effective Communication Sheila Stage 2 Effective Communication Sheila Stage 3 Effective Communication Frank Stage 1 Effective Communication Frank Stage 2 Effective Communication Henry Stage 1 Public Speaking Henry Stage 2 Public Speaking Bill Stage 1 Effective Communication Bill Stage 2 Effective Communication

From this sample dataset, one may want to know how many participants have completed each stage of these multiple courses.  The expected result is shown below:

 Row Labels Stage 1 Stage 2 Stage 3 Effective Communication 1 2 1 Public Speaking 2 1 Grand Total 1 3 2

In this workbook, I have shared 2 solutions - one using formulas and the other using the Power Query & PowerPivot.

Refer to a simple 5 column representative inventory dataset of a Glass manufacturer:

 Model Length (MM) Wide (MM) Thk (MM) CAT HX9-G-ARD 1071 273 3.5 A MYP-G-3RD 580 535 3.2 B EPO-G-3RD 580 535 3.2 A MYG-G-3R 966 350 3.2 A MYN-G-3RD 649 530 3.2 A GM SPIN-G-3FD 882 395 3.2 A MY8-G-AR 880 400 3.5 B GM2-G-AR 880 400 3.5 A

From this inventory data, one has to furnish customer orders based on specific dimensions demanded by them.  A typical Customer request would be to supply glass sheets as per the following dimensions

 Length (MM) Wide (MM) Thk (MM) CAT 780 542 3.5 A

The firm may or may not have glass sheets of this specific size.  The objective is to identify glass sheets, from the inventory on hand, which match customer specifications.  If there is no exact match, then one must be able to obtain all inventory items which have the same Thk (MM) and CAT as the customer specified dimensions but the Length and thickness should be more than equal to the customer specified dimensions.  The length and width can then be trimmed to match the exact customer dimensions.  Furthermore, the result returned should:

1. List only the Top 30 glass sheets available in inventory; and
2. List those Top 30 glass sheets in ascending order of wastage (wastage caused when the glass sheet is trimmed to match the customer specified dimensions)

You may refer to my solution in this workbook.  I have shared two solutions - one using Excel formulas and the other using Power Query a.k.a. Get and Transform in Excel 2016.  Please read the Comments in cells F1, J9 and J16 of the "Solutions" worksheet.  The difference between the 2 solutions is:

1. Formula driven solution - This is in range J10:AM14 of the Solutions worksheet.  This is a semi dynamic solution (as compared to the Power Query solution).  To get the models in ascending order of wastage, one will have to create an Area column in the base data and sort that column in ascending order.
2. Power Query solution - This is in range J17:AM21 of the Solutions worksheet.  This is a dynamic solution.  Just change the customer specified dimensions in range G2:J2 of the Data and Query worksheet.  Thereafter just right click on any cell in the range below and select refresh.

Here is a small sample of a Project matrix which shows tasks to be accomplished for various projects.  There can only be upto 6 tasks per project.

 Project Name Task1 Task2 Task3 Task4 Task5 Task6 Project1 Painting Chef Gardener Project2 Tiling Digging Engineering Project3 Mechanic Engineering

Here is a competency matrix showing the competencies of employees on different tasks.  1 indicates that the employee is competent to perform that task.

 Task Tom Jane Mary Paddy Lynda Painting 1 1 1 1 1 Tiling 1 1 1 1 1 Plastering 1 1 1 1 1 Digging 1 0 1 1 1 Mechanic 1 1 1 0 1 Detective 1 1 1 1 1 Engineering 1 1 0 1 1 Boxer 1 0 1 1 1 Chef 1 1 1 1 1 Gardener 1 1 0 1 1 Banker 1 1 1 1 0

From these two tables, one may want to generate another table showing which employees can be assigned to which project (only those employees should be assigned to a project who can complete all tasks).  So the ideal solution is to create another column (8th column) in the Project matrix table above which should have a drop down (Data > Data Validation) for every project showing which employees are competent for that project.

Here's an illustration:

Assuming that the Project matrix is in range A1:G4 (headers are in row 1)

1. In cell H2 (for Project1), the drop down should show Jane, Lynda, Paddy and Tom.  Mary should not appear there because she cannot perform one of the 3 tasks required to complete the project i.e. Gardener.
2. In cell H3 (for Project2), the drop down should show Lynda, Paddy and Tom.  Jane and Mary should not appear there because they cannot perform the Digging and Engineering tasks respectively.

The solution is dynamic for the following:

1. Projects added to the Project matrix Table; and
2. Tasks added (upto 6 only) or edited in the Project matric Table; and
3. Employees added to the Competency matrix Table; and
4. Tasks added to the Competency matrix Table

I have solved this problem by using:

1. Power Query; and
2. Formulas in Data > Data Validation.

You may download my solution workbook from here or here.

The objective is to assist a Store Manager with computing potential sales across different products and colours.  To start with let's assume two datasets:

1. Customer-Colour dataset - a two column table which lists down the colour preference of each customer; and
2. Colour-Product-Price dataset - a three column table which lists down the multiple products associated with each colour and the associated prices

My objective is to compute the "Potential Sales" if each customer buys one unit of each colour of each product.  While this can be resolved by using formulas (see Formula solution worksheet of the workbook), I'd like to resolve this problem with the help of the Power Query and PowerPivot tools.

The initial challenge will be to establish a relationship between the two tables because one cannot establish Many to Many relationships in a PowerPivot.  The two tables above are a perfect example of Many to Many relationships because each customer likes multiple colours and each colour is associated with multiple products.

You may download my solution workbook from this link.

You may also view a short video of my solution here:

For a lot of Excel users, a common problem is to extract City, State and Pin Code from an address string.  What compounds the issue is that there is no standardisation in an address string.  For e.g., one may end the address with a Pin code, while others may end it with a State and Country.  Some other variations could be:

1. End the address with Contact Numbers
2. There are no spaces between City, State and Pin code

There are of course many other variations which are possible and the sheer number of these variations makes it difficult to list down all of them.

Here is an example of address strings from where the State, City and Pin code need to extracted in three columns:

1. F-45, Pocket 1, Sector 198, Noida - 201303, Uttar Pradesh, India
2. S-45/B, Pocket 1-C, Phase II, Sector 189, Gurgaon, Haryana - 122002
3. RR-45/B, Pocket II-S, Cross 2, Noida - 201303, India
4. T-45, Sector 198, Lucknow - 226001, Uttar Pradesh, India
5. V-45(A), Sector 193C,Allahabad-211002 Uttar Pradesh India
6. V-45(A), Sector 193C,Allahabad-211002Uttar PradeshIndia

You may refer to my solution in this workbook.

Assume a five column dataset with the Fund Name in column A, Market Value in column B, Gain/Loss in column C, Gain/Loss % in column D [column C/column B] and Type of investment in column E (Short term or Long term).  Since multiple investments can be made in one fund, there can be multiple line items for with the same Fund Name (with different market values).  Let's call this dataset the Investment schedule.

The objective is to determine the following:

a. Minimum total gain realised from sale of one/multiple funds; and
b. Short term gain present in a above; and
c. Total Long term gain realised from sale of one/multiple funds; and
d. If Long term investments fall for liquidating the particular fund, then how much is the gain/loss by forcible liquidating the short term investments.

You may refer to my solution in this workbook.

Assume a table which lists attendees for a Company's Annual day function.  In this Table, data for every attendee is shown on a separate row so if an employee attends the function with his/her spouse and three children, then there will be 5 rows for that employee.

The question is to determine the count of the following family configuration:

1. Employees only (those who attended without spouse and children); and
2. Employees, spouse and children (Family); and
3. Employees and spouse (no children); and
4. Employees and children (no spouse)

You may refer to my solution in this workbook.  I have solved this problem using:

1. MS Excel Formulas based on Set Theory and Venn Diagram; and
2. PowerPivot