# Tags: LOOKUP

Consider this simple two column table showing text entries in column A and the corresponding numbers in column.  Assume this data is in range A2:B11 (headings are in A1:B1).

 text Value A 1 B 2 C 3 D 4 E 5 F 6 G 7 H 8 I 9 J 0

The objective is to generate the numeric code for text code of any length entered in a certain cell.  For example, a user will type a certain text code, say ABEJ and the expected result should be 1250.  For JABF, the result should be 0126.  The text entry and text length are both user determined.

With ABEJ, typed in cell D2, enter this array formula in cell E2

=TEXT(SUMPRODUCT((LOOKUP(MID(D2,ROW(INDIRECT("1:"&LEN(D2))),1),\$A\$2:\$A\$11,\$B\$2:\$B\$11))*((10^(LEN(D2)-1-(ROW(INDIRECT("1:"&LEN(D2)))-1))))),REPT("0",LEN(D2)))

This formula can now be copied down for generating the numeric code for all text codes entered in column D.

Hi,

Assume a column of names as follows:

Name
Mohammed Zia-Ul Haque
Steven Thomas -
,-Rohit Sunil Ahir-Chowdhary.-
Anuj -----------
Sameer --
..,Mohit --
Rajeev Nair.
Monalisa . Das
Vijeta ...
--,.Anjana. M.U..,-

Please observe that there are special characters before the name, within the name and after the name.  The task is to remove special characters before and after the name.  The expected result is shown below:

Expected Result
Mohammed Zia-Ul Haque
Steven Thomas
Rohit Sunil Ahir-Chowdhary
Anuj
Sameer
Mohit
Rajeev Nair
Monalisa . Das
Vijeta
Anjana. M.U

The array formula (Ctrl+Shift+Enter) to make this work is

=MID(A2,MIN(SEARCH(CHAR(ROW(\$A\$65:\$A\$90)),A2&CHAR(ROW(\$A\$65:\$A\$90)))),LOOKUP(2,1/((CODE(MID(UPPER(A2),ROW(INDIRECT("1:"&LEN(A2))),1))>=65)*(CODE(MID(UPPER(A2),ROW(INDIRECT("1:"&LEN(A2))),1))<=90)),ROW(INDIRECT("1:"&LEN(A2))))-(MIN(SEARCH(CHAR(ROW(\$A\$65:\$A\$90)),A2&CHAR(ROW(\$A\$65:\$A\$90)))))+1)

I have solved a similar problem at this link as well but that requires the usage of an add-in.  This is so because the special characters and numbers need to be removed from within the string as well.  In other words, everything except letters need to be removed from the alphanumeric string (no matter where the numbers and special characters are - beginning, middle or at the end).

The objective is to assist a Store Manager with computing potential sales across different products and colours.  To start with let's assume two datasets:

1. Customer-Colour dataset - a two column table which lists down the colour preference of each customer; and
2. Colour-Product-Price dataset - a three column table which lists down the multiple products associated with each colour and the associated prices My objective is to compute the "Potential Sales" if each customer buys one unit of each colour of each product.  While this can be resolved by using formulas (see Formula solution worksheet of the workbook), I'd like to resolve this problem with the help of the Power Query and PowerPivot tools.

The initial challenge will be to establish a relationship between the two tables because one cannot establish Many to Many relationships in a PowerPivot.  The two tables above are a perfect example of Many to Many relationships because each customer likes multiple colours and each colour is associated with multiple products.

You may also view a short video of my solution here:

For a lot of Excel users, a common problem is to extract City, State and Pin Code from an address string.  What compounds the issue is that there is no standardisation in an address string.  For e.g., one may end the address with a Pin code, while others may end it with a State and Country.  Some other variations could be:

1. End the address with Contact Numbers
2. There are no spaces between City, State and Pin code

There are of course many other variations which are possible and the sheer number of these variations makes it difficult to list down all of them.

Here is an example of address strings from where the State, City and Pin code need to extracted in three columns:

1. F-45, Pocket 1, Sector 198, Noida - 201303, Uttar Pradesh, India
2. S-45/B, Pocket 1-C, Phase II, Sector 189, Gurgaon, Haryana - 122002
3. RR-45/B, Pocket II-S, Cross 2, Noida - 201303, India
4. T-45, Sector 198, Lucknow - 226001, Uttar Pradesh, India

You may refer to my solution in this workbook.

Assume there is a list of names appearing in range A2:A9.  Before these names, there are abbreviations such as Dr., Mr. Mrs. etc.  To worsen the situation, there may/may not be a space after the abbreviation.  A sample dataset may look like this:

Mr.Ramakrishna
Mr Ramakrishna
Mr. Ramakrishna
MrRamakrishna
PhD.Saurav
Dr (Mrs.) Indu Sharma
Dr. (Mr) Rakesh Singh
Mr. Mrinal Joshi

The task is to extract only the names in another column.

You may refer to my solution in this workbook.

A tournament has 18 participating teams with 25 players in each team.  Each team has to play five rounds of the Tournament and not all players play all rounds.  Scores earned by each player in each round are shown in individual cells.  If a player does not play a round, that cell is left empty.

The task is to sum the highest 18 scoring players for each round.  Only the highest 18 players per team count towards the teams score.  If few of the players have the same score at position 18 then only one of them should be included in the overall score.

One solution is to sort each round of scores for each team in descending order and sum the highest 18 values.  This is obviously a time consuming process.

There could be two others ways to solve this problem

Formula driven solution - This uses a spare column, a lengthy formula and the Data > Table functionality.  Since Data > Table is a series of array formula, this solution makes the workbook very sluggish.  You may refer to my solution in this workbook.

PowerPivot solution - This solution is far better than the formula driven one in as much as no spare columns, lengthy formulas or Data > Tables have been used.  The solution in this workbook adds the scores of the highest 18 scoring players per round (If few of the players have the same score at position 18 then all scores are included in the overall score).  To use this PowerPivot solution, you need to be using the PowerPivot add-in for MS Excel.  This add-in is only available for Excel 2010 and higher versions.

Assume there is data for complaints received and complaints resolved for two years - 2009 and 2010 for some regions regions.  Each Region has its own worksheet.  There are three headings on each sheet - Particulars, 2009 and 2010.  While the Particulars column has the text "Complaints received" and "Complaints resolved", the year columns have number of complaints received and resolved.

The task is to create a Summary sheet which:

1. Show the consolidated figure of complaints received and complaints resolved for both years
2. Show the individual regions which make up the consolidated figure in 1 above.

In this workbook, you may see the data layout and four different ways of resolving the two tasks mentioned above.

You may also refer to the following related posts:

A number is entered in cell A1 of every sheet in a multi sheet workbook.  The task is to sum numbers in cell C2 of the Summary sheet from cell A1 of last x sheets - x is the number of sheets (mentioned in cell B2 of the Summary sheet) from which numbers have to be summed up.  So if the number entered in cell B2 of sheet is 3, then sum numbers from cell A1 of last three sheets.  Worksheets can get added, removed or renamed.

You may refer to my solution in this workbook.

When downloading data from an ERP into Excel, a Remarks column has entries which can be as lengthy as 300 characters per cell.  Before uploading this data into another database, entries in the Remarks column need to be split into multiple cells of upto 50 characters per cell.  Furthermore, when splitting data into multiple cells, it should be ensured that no word gets split/broken.  So if the 50th character is alphabet k of the word Like, then that cell should hold words till the word which appears just before Like so that the length does not exceed 50 characters.

You may refer to my solution in the attached workbook.

Assume a two column database in range A3:B6.  Headings are in A2:B2.  In range A3:A6 are numbers, say 60,30,20 and 10.  In range B3:B6 are some names, say Ashish, Sameer, Gautam and Sanjay.

In cell A7, enter any name, say Sameer.  The task is to extract the corresponding number from the range (A3:B6) into cell B7.  Needless to say that VLOOKUP() will not work here because the lookup_value i.e. Sameer is not in the first column of the table_array i.e. in range A3:A6.  To get the answer, one may write the following INDEX() and MATCH() combination of functions in cell B7

=INDEX(\$A\$3:\$A\$6,MATCH(\$A7,\$B\$3:\$B\$6,0),1)

So with Sameer in cell A7, the result in cell B7 will be 30.  One may further wrap this arround the IFERROR() function to trap errors.

One may also try the LOOKUP function in cell C7:

=LOOKUP(A7,\$B\$3:\$B\$6,\$A\$3:\$A\$6)

The answer would be 30 which is correct.  However, if you type Gautam in cell A7, the result will change to 60 in cell C7 (which is wrong!!!).  The answer in cell B7 will be correct i.e. 20.

The reason for the LOOKUP() function anomaly is that range B3:B6 is not sorted in ascending order.  The Help menu on the LOOKUP() function states that the lookup_vector of the LOOKUP() function should be sorted in ascending order.  Therefore, if you now select A2:B6 and sort the name column in ascending order, the result in cell C7 will become 20.

So one difference between the INDEX(), MATCH() combination of funtions and LOOKUP() function is that for the LOOKUP() function to work, the lookup_vector should be sorted in ascending order.  There is no requirement for sorting when using the INDEX() and MATCH() combination of functions.

Now type Samir (not Sameer) in cell A8.  When you copy B7:C7 to cell B8, the result in cell B8 will be #N/A and in C8 will be 30.  The result should ideally be #N/A because there is no Samir in range B3:B8.  Surprisingly, even though the name is misspelt, the LOOKUP returns a result (not an error).  This is happening because if the LOOKUP() function does not find an exact match, it finds for the largest value less than the lookup_value (provided the lookup_vector is sorted in ascending order).  The largest value less than Samir in range B3:B6 is Sameer.  Likewise, if you enter Ashwin in cell A9 and copy B8:C8 to cell B9, the result will be #N/A and 60.

The names entered in range A8:A9 would suggest that the LOOKUP() function can take care of spelling mistakes in the lookup_value.  Well this is not true because if you enter Asheesh as the lookup_value in cell A10, the result will be #N/A in B10:C10.  This is because since it cannot find Asheesh in range B3:B6, it searches for values less than Asheesh which do not Exist - Ashish > Asheesh.

From the discussion above, one may therefore conclude that the LOOKUP() function is not the appropriate function to use for exact matches especially when dealing with text lookups.

However, with a small tweak in the LOOKUP() function, one can coerce an exact search:

=LOOKUP(2,1/(\$B\$3:\$B\$6=\$A7),\$A\$3:\$A\$6)

If you try this formula in cell D7 and copy down, the result will be the same as the INDEX() and MATCH combination of functions which you wrote in B7:B10.