Perform a lookup with inexact text strings and/or spelling mistakes

{ 31 Comments }

Any lookup function i.e., VLOOKUP(), LOOKUP(), INDEX()/MATCH(), HLOOKUP() requires the text lookup variable (the first input of the VLOOKUP(), HLOOKUP() etc. ) to be spelled exactly the same as the entries in the lookup column.  If the spelling does not match, the result is #NA.  To some extent, inexact searches can be performed by concatenating wildcard characters (? and *) to the lookup variable.

The task is to increase MS Excel’s tolerance towards spelling mistakes and inexact text matches across two tables. For e.g., when performing a lookup, somehow MS Excel should be able to understand the Delhi City is the same as City, Delhi.  In order to perform, inexact searches, you can download and install Microsoft’s Fuzzy lookup add-in from here.

I have run this add-in on two live datasets in this workbook.

Leave a Comment Cancel reply

Your email address will not be published.

*

  • Hi

    i have been trying to find a solution to a Fuzzy Lookup problem for a while…

    I have this file

    https://drive.google.com/file/d/0B657UzYF_VoMMklaU3ZaMHR0S1k/edit?usp=sharing

    What i need is an exact match on column one, and then an approximate match on column two…

    As you can see, it is currently performing a best match on column two, which leaves some instances where column one is not exact.

    This has to be exact.

    Does anyone know of a way to do this?

    Many Thanks

    Euan

  • Thank you.

    The column that has to be an exact match is column 1 (columns ‘M’ and ‘O’) in excel.

    The job title matches (‘N’ and ‘P’) are matched well, but they have to be within column 1 parameters.

    I hope this makes sense?

    I could send you an annotated file if you wish?

    Many Thanks

    Euan

  • Hi,

    so if we take cell M3 (E1951, procurement and logistics officer); The result that the lookup has given is in O3 (E1958 , Procurement and logistics officer).

    The job title is a good match.

    But, I need it to find a match within the E1951 list of job titles. Whether the match is as good or not. The Codes have to match…So column M and O should always match…

    Thanks

    Euan

  • If you look at the raw data.

    the match i would have chosen instead would have been

    E1951, procurement and logistics officer

    becomes

    E1951A1-4_1.4 Procurement Specialist A34692

    • In which cell? Why can’t you elaborate. Sharing scanty information such as this will only make me ignore this problem. Just share two tables with me – the data table and the lookup table. Then for a few cases, please show in the third column of the data table the result which you are expecting from the lookup table.

    • Hi,

      Yes. Your question is clear now – thank you for that effort. I tried a fair bit but could not get this to work – perhaps the add-in is not working as intended for the sample dataset. Fuzzy add-in is a research project by Microsoft Business Intelligence (BI) Labs (See this link).

      Please share your dataset and feedback with them at this e-mail address – dcrt@microsoft.com. May be someone from that team contacts you with a resolution.

        • As an alternate approach, the add-in I created and use (just click on my name) allows you to write a formula to try and accomplish what you need. In particular, the fuzzy version of the vlookup (called pwrVLOOKUP in my add-in) lets you work with fuzzy range lookups. The formula I used is below (and it assumes your column1 is sorted).

          =pwrVLOOKUP(Table20[@Column2],INDIRECT(“B”&MATCH(Table20[@Column1],Table19[Column1],0)+3&”:C”&MATCH(Table20[@Column1],Table19[Column1],1)+3),1,40)

          It uses the MATCH function to get the first/last occurrence of your hard constraint and the INDIRECT function to pass that as a range (you can see the string reference being built in the formula).

          The degree of fuzziness is controlled by the last parameter (closer to 0 is fuzzier, and closer to 100 is more exact).

          I pasted a few rows from my test below. The alignment may be off, but you can get the gist of the results.

          Value being sought Value Matched (if any)
          E1951: ALAT Team Coordinator #N/A
          E1951: Procurement and Logistics Officer A1-4_1.4 Procurement Specialist
          E1951: IFT Advisor Africa and Asia #N/A
          E1951: Head of Procurement and Logistics A1-4_1.4 Procurement Specialist
          E1958: International Finance Manager FV – H2_Training International
          E1958: Procurement and Logistics Officer SI2 – A14_Procurement and Logistics Officer

          Charlie

  • Hi Euan and Ashish,

    The Microsoft Fuzzy Lookup Add-In does not presently allow you to specify exact match constraints. Internally the software has support for it and I was hoping there might be a way to enable it through the advanced configuration options, but there is a minor issue that is preventing it from working. Hopefully it will be fixed in future versions.

    Regards,
    -Kris

  • Ashish,

    I figured out the Tables and wanted to let you know the FuzzyLookup is great! Works great! Thank you!

  • Hi,

    I´ve just installed Fuzzy Lookup, and trying to figure it out.
    When I hit “Go” I only get 10 matches and the following error message:

    ===================================

    Unable to set the NumberFormat property of the Range class (Microsoft Excel)

    ——————————
    Program Location:

    at System.Dynamic.ComRuntimeHelpers.CheckThrowException(Int32 hresult, ExcepInfo& excepInfo, UInt32 argErr, String message)
    at CallSite.Target(Closure , CallSite , ComObject , String )
    at System.Dynamic.UpdateDelegates.UpdateAndExecute2[T0,T1,TRet](CallSite site, T0 arg0, T1 arg1)
    at CallSite.Target(Closure , CallSite , Object , String )
    at System.Dynamic.UpdateDelegates.UpdateAndExecute2[T0,T1,TRet](CallSite site, T0 arg0, T1 arg1)
    at FuzzyLookupAddInForExcel.FuzzyJoinControl.MoveNextRow(Int32 outputBatchSize, Int32 r0, Int32 c0, Int32& r, Int32 similarityColumnIndex, Int32 similarityXmlColumnIndex, Int32 outputColumnCount, Worksheet activeWorksheet, Range& outputRange, Array& outputArray, Int32& outputRowCount)
    at FuzzyLookupAddInForExcel.FuzzyJoinControl.FuzzyJoin(Worksheet worksheet)

    This happens both in the sample file from the download and other files.

    Anyone who knows how to fix this?

    Thanks!

  • Make sure your tables are defined only where there is data – that happened to me the first time because i made my table by highlighting from the top of each column which in turn made a “bottomless” table.

      • The file I´m using is the portfolio file that comes in the zip download folder. So there shouldn´t be any problems with that one. The tables were already defined when I downloaded them.

        I´m using excel 2010 and Windows 7 prof. 64-bits.

        • Hi,

          Let’s try to solve this in parts:

          1. Reconvert the Table into a Range
          2. Open the file on some other computer (preferably a computer with the same software installed as yours)
          3. Reconvert the Range into a Table

          Does it work fine now?

          Download the file from my Blog article and see if it works fine there.