Tags: MATCH

Performing an iterative lookup to return closest match

{5 Comments}

Suppose a worksheet name is Code_details.  The following values lie in range A2:A14 of this sheet:

A12,AS478,QW447,EQ46,RYT9985,VCX147,BNM159,ASQ478,VC325,ASW675,A123,
ASDFG1234567890 and ASDFG123456789

Suppose there is another worksheet named Summary where the following values lie in range B5:B7:

ASQ47809876,ASW675458 and QWERT

Now one may want to lookup ASQ47809876 in A2:A14 of Code_details sheet.  If this exact value is not found there, then the lookup value should be trimmed by one digit/character from the extreme right i.e. the lookup value should now become ASQ4780987.  The search should carry on till an exact match is found.

You may refer to my solution in this workbook.

Return an exact value via the LOOKUP() function

{8 Comments}

Assume a two column database in range A3:B6.  Headings are in A2:B2.  In range A3:A6 are numbers, say 60,30,20 and 10.  In range B3:B6 are some names, say Ashish, Sameer, Gautam and Sanjay.

In cell A7, enter any name, say Sameer.  The task is to extract the corresponding number from the range (A3:B6) into cell B7.  Needless to say that VLOOKUP() will not work here because the lookup_value i.e. Sameer is not in the first column of the table_array i.e. in range A3:A6.  To get the answer, one may write the following INDEX() and MATCH() combination of functions in cell B7

=INDEX($A$3:$A$6,MATCH($A7,$B$3:$B$6,0),1)

So with Sameer in cell A7, the result in cell B7 will be 30.  One may further wrap this arround the IFERROR() function to trap errors.

One may also try the LOOKUP function in cell C7:

=LOOKUP(A7,$B$3:$B$6,$A$3:$A$6)

The answer would be 30 which is correct.  However, if you type Gautam in cell A7, the result will change to 60 in cell C7 (which is wrong!!!).  The answer in cell B7 will be correct i.e. 20.

The reason for the LOOKUP() function anomaly is that range B3:B6 is not sorted in ascending order.  The Help menu on the LOOKUP() function states that the lookup_vector of the LOOKUP() function should be sorted in ascending order.  Therefore, if you now select A2:B6 and sort the name column in ascending order, the result in cell C7 will become 20.

So one difference between the INDEX(), MATCH() combination of funtions and LOOKUP() function is that for the LOOKUP() function to work, the lookup_vector should be sorted in ascending order.  There is no requirement for sorting when using the INDEX() and MATCH() combination of functions.

Now type Samir (not Sameer) in cell A8.  When you copy B7:C7 to cell B8, the result in cell B8 will be #N/A and in C8 will be 30.  The result should ideally be #N/A because there is no Samir in range B3:B8.  Surprisingly, even though the name is misspelt, the LOOKUP returns a result (not an error).  This is happening because if the LOOKUP() function does not find an exact match, it finds for the largest value less than the lookup_value (provided the lookup_vector is sorted in ascending order).  The largest value less than Samir in range B3:B6 is Sameer.  Likewise, if you enter Ashwin in cell A9 and copy B8:C8 to cell B9, the result will be #N/A and 60.

The names entered in range A8:A9 would suggest that the LOOKUP() function can take care of spelling mistakes in the lookup_value.  Well this is not true because if you enter Asheesh as the lookup_value in cell A10, the result will be #N/A in B10:C10.  This is because since it cannot find Asheesh in range B3:B6, it searches for values less than Asheesh which do not Exist - Ashish > Asheesh.

From the discussion above, one may therefore conclude that the LOOKUP() function is not the appropriate function to use for exact matches especially when dealing with text lookups.

However, with a small tweak in the LOOKUP() function, one can coerce an exact search:

=LOOKUP(2,1/($B$3:$B$6=$A7),$A$3:$A$6)

If you try this formula in cell D7 and copy down, the result will be the same as the INDEX() and MATCH combination of functions which you wrote in B7:B10.

Apportion a number over empty cells

{10 Comments}

Assume a matrix like data layout where Site's are mentioned in range C2:C15 and first day of every month from January 2011 till April 2012 are mentioned in range D1:S1.  In range D2:S15, the numbers represent the value of the electricity bill received in that month.  Some cells can be blank representing bills not received in that month.

One may want to apportion the value of bills received equally over months for which bills were not received.  The assumption in doing so is that if a cell in empty i.e. bill not received in the month, then it will be included in the next bill.  The problem also is that there can be a variable number of empty cells between various bills received.

You may refer to the "Result" sheet in the this workbook.

Shade alternate band of rows in a filtered range

{2 Comments}

Assume the following numbers in range C5:C22.  Heading is in cell C4.

Range C5:C7 - 11
Range C8:C11 - 14
Range C12:C13 - 23
Range C14-C19 - 56
Range C20-C22 - 78

One may want to colour cells for every change in number in range C5:C22 with the following two modifications:

1. Shading should be for alternate band of rows i.e. C5:C7 should be coloured, C8:C11 should not be coloured, C12:C13 should be coloured, C14:C19 should not be coloured and C20:C22 should be coloured.

2. Shading should vary with the range being filtered i.e. If range C5:C22 is filtered to exclude 14 and 56, then numbers 11 and 78 should be coloured i.e. range C5:C7 and C20:C22 should be coloured.

To accomplish this, try the following:

a. Select cell C5 and go to Home > Format > Conditional formatting > New Rule > Use a formula to determine which cells to format

b. In the formula box there, enter the following formula

=AND(ISODD(COUNT(1/FREQUENCY(IF(SUBTOTAL(3,OFFSET(C$5,ROW(C$5:C5)-ROW(C$5),)),MATCH(C$5:C5,C$5:C5,0)),ROW(C$5:C5)-ROW(C$5)))),COUNT(1/FREQUENCY(IF(SUBTOTAL(3,OFFSET(C$5,ROW($C$5:$C$22)-ROW(C$5),)),MATCH($C$5:$C$22,$C$5:$C$22,0)),ROW($C$5:$C$22)-ROW(C$5)))>1)

c. Click on Format and select any Fill colour

d. Click on OK > OK > Apply

e. Copy cell C5, select range C6:C22 and right-click > Paste Special > Formats

With no filter in place, colouring will happen as mentioned in point 1 above.  Auto filter the range now on some values and the colour banding should change.

Create charts on different sheets by clicking a button

{6 Comments}

Assume a five column database with the first column containing codes, second one containing a separator, third one containing some alpha values, fourth one is a combination of the first three columns and the last column has response times.

One may want to create as many charts as there are codes in column A.  The charts should get created on different sheets with the click of a button.

You may refer to my solution in this workbook.

Calculate turn around time excluding Sundays and public holidays

{206 Comments}

Assume a two column database showing starting date/time and ending data/time (Data/time stamp appear in a single cell).  Given a list of public holidays in a year and starting and ending work times, one may want to know the turn around time excluding Sundays and public holidays.

You may refer to my solution in this workbook.

Extract n th. most frequently occurring item from a database

{2 Comments}

Assume a database in range D5:D11 with the following values - Black, Black, Purple, Purple, Grey, Grey, Grey.  To extract the most frequently occurring item, one may use the following formula

=INDEX($D$5:$D$11,MODE(MATCH($D$5:$D$11,$D$5:$D$11,0)))

To extract the second most frequently occurring item, try the following:

1. Enter 2 in cell C16:
2. In cell D16, enter the following array formula (Ctrl+Shift+Enter)

=LOOKUP(2,1/(LARGE(FREQUENCY(IF(INDEX(COUNTIF($D$5:$D$10,$D$5:$D$10),0),MATCH($D$5:$D$10,$D$5:$D$10,0)),ROW($D$5:$D$10)-ROW($D$4))+(ROW($D$5:$D$11)/10000),C16)=FREQUENCY(IF(INDEX(COUNTIF($D$5:$D$10,$D$5:$D$10),0),MATCH($D$5:$D$10,$D$5:$D$10,0)),ROW($D$5:$D$10)-ROW($D$4))+(ROW($D$5:$D$11)/10000)),$D$5:$D$10)

To extract the n th most frequently occurring item, enter that n th number in cell C16.

Summarise data from multiple sheets with one condition – PartII

{2 Comments}

Assume a worksheet which has scores (1-10) on four future dates on nine questions.  Each row represents responses for one question.  Likewise there are 20 respondents i.e. 20 worksheets which have the same structure.

One may want to summarise data from all 20 worksheets into one worksheet.  The question and solution are more clearly explained in the attached file.

I have presented two cases here - Example 1 and Example 2 in this workbook.

You may also want to refer to the following related posts:

1. Summarise data from multiple sheets with one condition
2. Summarise data from multiple sheets with multiple conditions
3. Generate a list of all tab names without using VBA

Generate a list of all tabs names without using VBA

{249 Comments}

Assume an MS Excel file has 4 worksheets - Sheet1, Sheet2, Sheet3 and Sheet4.  Insert a sheet before Sheet1 and name that tab as Summary.  On the Summary tab, one may want to generate a list of all sheet names from cell C7 onwards.  Furthermore, the sheet names so generated, should be dynamic for the following changes:

1. Sheets added
2. Sheets deleted
3. Sheets renamed
4. Sheets repositioned

While this can be accomplished by using VBA, you may refer to my formula based solution here.

To generate a list of all Excel files in a specific folder, you may refer to the following post.

Remove duplicates after satisfying additional conditions

{3 Comments}

Assume a two column database of patient ID's and service availed.  One patient may avail the same service multiple times in a year due to which that record may appear as many times as the service is availed.  For e.g., if patient A001 avails the Radiology service twice, then A001 and Radiology will appear in two rows.

Once may want to create the following two reports from this database:

1. A list containing all those records where the patient availed just one service; and
2. A list containing all those records where the patient availed more than one service

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the PowerPivot add-in, then a calculated column formula can resolve this problem.

Solution for all versions of MS Excel

I have shared two solutions here:

1. Array formula and advanced filters; and
2. Only array formulas

For better understanding of the question and to view the final solution, please refer to this workbook.