Category: DATA EXTRACTION

Generate a list of all Excel files from a specific folder without using VBA

{54 Comments}

Assume a folder named Test on the desktop.  In this Test folder, there is another folder named Survey reports.  In the Survey reports folder, there are Word documents, PowerPoint files, Excel files, PDF files and all sort of other files types.

In a new workbook, one may want to list down all Excel files from the Survey reports folder (saved on the desktop) without using VBA.  Furthermore, for new Excel files added to the folder, the list should keep updating.

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).  In Power Query, there is a feature to append data from multiple workbooks into one running range.  The detailed process to do so is described below:

1. Go to Power Query > From File > From Folder
2. Choose the Test folder on the desktop and click on OK
3. Filter the extension column on "Contains" .xls
4. Right click on the Name column heading and choose "Remove Other columns"
5. Click on Apply and Close

This should now list down all MS Excel files lying the Test folder.  This is a dynamic list i.e. add or remove Excel files from the folder and just right click on any cell and select Refresh.

Solution for all versions of MS Excel

Try the following procedure to get a list of all Excel files from the folder:

1. In the new workbook, press Ctrl+F3 > New.  In the name box, type FL
2. In the Refer to box, enter =FILES("C:\Users\Ashish\Desktop\Test\Survey Reports\*.xls")&T(NOW()).
3. Click on OK and Close

In cell A3 of sheet1, enter the following formula and copy down

=IF(ISERROR(INDEX(FL,ROW()-2)),"",INDEX(FL,ROW()-2))

If you add another Excel file to the Survey report folder, just press F9 anywhere on sheet1 and the file will appear in the list.

Change C:\Users\Ashish\Desktop\Test\Survey Reports\ to your actual path from where you want to extract the Excel files.

To generate a list of all tab names from in an Excel file, you may refer to the following post.

Generate a list of all tabs names without using VBA

{245 Comments}

Assume an MS Excel file has 4 worksheets - Sheet1, Sheet2, Sheet3 and Sheet4.  Insert a sheet before Sheet1 and name that tab as Summary.  On the Summary tab, one may want to generate a list of all sheet names from cell C7 onwards.  Furthermore, the sheet names so generated, should be dynamic for the following changes:

1. Sheets added
2. Sheets deleted
3. Sheets renamed
4. Sheets repositioned

While this can be accomplished by using VBA, you may refer to my formula based solution here.

To generate a list of all Excel files in a specific folder, you may refer to the following post.

Extract farthest/latest date based on multiple conditions

{0 Comments}

Assume a three column database showing Site ID, Customer, Status and Requested Date.  On the same site ID, the same customer may have different status on different dates.   In such a scenario, one may want to know the farthest/latest requested date and its corresponding status for all unique combinations of Site ID and Customer.

I initially attempted to solve this problem by using a pivot table but the pivot output was incorrect.  The pivot was returning the farthest/latest date for all status' of a particular Site ID and Customer.  Ideally, it should show only the farthest/latest date and its corresponding status for a particular Site ID and Customer.  Therefore, for a particular Site ID and Customer combination, only one row should show up in the final output.  Therefore, the Pivot Table solution did not work.

You may download this workbook for a better description of the problem and my workaround.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table)

Solution for all versions of MS Excel

You may refer my Advanced Filter (along with formula) solution in the workbook.

Dynamically transpose data after ignoring blank cells

{0 Comments}

In range A7:A15, assume the following values A, S , empty cell, empty cell, R, A, W, A, SD.  Starting from cell B17, transpose data to the right (ignoring the blank cells).  Therefore, in range B17:F17, the result should be A, S, R, W SD.

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).

You may watch a short video of my solution here

Solution for all versions of MS Excel

You may refer to my array formula solution.

You may refer to both solutions in this workbook.

Extract number from an alphanumeric string

{72 Comments}

Given an alphanumeric string, one may want to perform the following

Extract phone number from the string

Assume a list of customer addresses with multiple phones numbers mentioned in the address field itself.  These numbers may be mobile numbers and/or mobile numbers.  Furthermore, PIN codes may also be mentioned in the address string.

One may want to extract only the phone numbers to another column.

You may refer to the my solution in this workbook.

Extract one specific 20 digit number from the string

Assume cell descriptions which contains two 20 digit numbers occurring anywhere in the string.  Once may want to extract only that 20 digit number which has the word New before it.

You may refer to my solution in this workbook.

Programmatically transfer data from master sheet to sub sheets with conditions

{43 Comments}

A user inputs raw material purchase data in a master sheet.  This data is entered by date and vendor.  One may want to view individual vendor sheets at a monthly level.  In the attached file, i have presented two solutions - a macro based one and a formula based one.

The data layout and instructions are very clearly mentioned in the this workbook.

You may also refer to the a similar article at the following link.

Create a master list of unique account codes from two data sources

{2 Comments}

Assume a two column database in range C4:D11.  In cell C3, type "Cheque No." (without double quotes) and in cell D3, type "As per Bank Account" (without double quotes).  Entries in range C4:C11 are cheque numbers and in range D4:D11 are Dollar values.  Let's refer to this database as "The Bank Dump".

Assume another two column database in range I4:J9.  In cell I3, type "Cheque No." (without double quotes) and in cell D3, type "As per SAP" (without double quotes).  Entries in range I4:I9 are cheque numbers and in range J4:J9 are Dollar values.  Let's refer to this database as "The SAP Dump".

Please note that cheque numbers in range C4:C11 may or may not be present in range I4:I9.  Likewise cheque numbers in range I4:I9  may or may not be present in range C4:C11

One may want to create a reconciliation statement i.e. a three column database showing "Cheque No." in the first column, "As per Bank Account" in the second column and "As per SAP" in the third column.

Herein the first challenge is to create a master list of Account codes from two ranges, C4:C11 and I4:I9 and then the corresponding amounts from the two databases.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).

You may refer to my solution in this workbook.

Solution for all versions of MS Excel

One may use using the Data > Consolidation feature.  Please try the following steps:

1. Select cell C15 and go to Data > Consolidate
2. Under Function, choose SUM
3. Under reference, select range C3:D11 and the click on Add
4. Under reference, select range I3:J9 and the click on Add
5. Check the boxes for Top row and Left column
6. Click on OK

The result should be the three column database as desired above.

Remove special characters and numbers from an alphanumeric string

{4 Comments}

Hi,

Assume the following alphanumeric strings in range A3:A6

A-2222 Mohan 12-4 #$% Singh$%&***
**** Sanchit #$% Pal$%&***
$^%$^$ Sudhir #$% Si$$$ngh$%&***
#@234 Krishan #$% Oman$%&***

In range B3:B6, one may want to extract only the text values i.e. drop special characters and numbers from the values in range A3:A6.  In range B3:B6, the result should be:

A Mohan Singh
Sanchit Pal
Sudhir Singh
Krishan Oman

For solving this problem, you will have to download and install the morefunc addin from here.  For instructions of installing the addin for Excel 2010, please refer point 4 of the following post.

You may refer to my solution in this workbook.

To remove only special characters appearing at the beginning and end, you may refer to my solution at the following link.

Performing a text to rows operation

{19 Comments}

Given multiple text values in a cell which are separated by a semi-colon, one may want to split data into multiple rows along with existing data in other columns.  This is something similar to Data > Text to columns - the only difference being that data has to be split by rows instead of columns.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).

You may watch a short video of my solution here

Solution for all versions of MS Excel

I have also shown a macro based solution to this problem.

You may refer to my solution in this workbook.

For accomplishing just the reverse i.e. merging data from multiple cells into a single cell (separated by a comma), you may refer to my solution at this link.