Category: DUPLICATES AND UNIQUES

Dynamically extract unique values from a filtered range

{0 Comments}

Data in a two column database (A3:B13) can be Auto filtered on column A with one or many conditions.  Data in column B will be numbers only (positive, negative or 0's) - no text values.

After filtering data in column A, one may want to extract the unique numbers from column B to cell A18 (and then downwards).  Data from cell A18 downwards should keep changing with different filter conditions specified in the Auto filter drop down of column A.

You may refer to my solution in the this workbook.

To extract unique values from an unfiltered range without any conditions, you may refer to the following link

To extract unique values from an unfiltered range with multiple conditions, you may refer to the following link

Display text entries in the data area of a pivot table

{56 Comments}

Assume a two column database of approximately 20,000 rows (say from A1:B20237).  The first column has vendor names and the second column has Part codes (alphanumeric string).  Headings are in A1:B1, say Vendor in cell A1 and Part_code in cell B1.  One vendor supplies multiple parts and therefore there would be repetitions of vendor names in column A.

The task is to reorient this database to show all part numbers in column A and parts supplied by those vendors should appear in different columns of that row.  If one attempts to create a pivot table, the problem would be with the data area of the pivot table.  When one drags the part codes (alphanumeric string) column to the data area, count of part number would appear instead of the actual part number.  This is the very nature of a pivot table.  Since it is a summarisation tool, one can never see text entries in the data area of a pivot table.

You may download the workbook showing both solutions (using MS Query and Power Query) from here.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (minimal formulas) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).  You may refer to Solution2 - Power Query worksheet.

You may watch a short video of my solution here

Solution for all versions of MS Excel

You may refer to Solution1 - MS Query worksheet.  To reorient the data i.e. to view text entries in the data area of a pivot table, try the following approach:

1. In cell C1, type Count.  So the value in cell A1 is Vendor, in cell B1 is Part_code and in cell C1 is Count
2. In cell C2, enter =$B$1&" "&COUNTIF(A$2:A2,A2) and copy down.  Since the number of rows here are quite many, i would suggest copying in batches i.e. first copy down 5,000 rows, then another 5,000 rows and so on.  This is solely to prevent Excel from crashing and/or taking unduly long to process.
3. Select range A1:C20237 and press Ctrl+F3 > New.  In the name box, enter Dummy
4. Select range A1:C20237 and press Ctrl+T to convert to a Table.  This is to take care of data being added beyond row 20237.
5. Save the file and open a new worksheet (in the same workbook)
6. While on any cell in this new worksheet, go to Data > From Other Sources > From Microsoft Query > Excel Files > OK
7. Navigate to the folder where you saved the file in step 5 above, select the file and click on OK
8. With Dummy selected by default, press the > symbol.  This will show you three column headings in the right hand side box
9. Click on Next three times
10. Select the second option button and click on Finish
11. Click on the SQL button and delete all contents that you see there.  Type the following in the blank white space

TRANSFORM first(Part_code)
SELECT Vendor
FROM dummy
GROUP BY Vendor
PIVOT count

12. Click on OK
13. Go to Data > Return Data to MS Excel
14. In the Import Data box, select Table and in Existing sheet, select cell A1
15. Click on Finish

Data should now appear in the desired format.

For any changes to data in range A2:B20237, right-click on any cell in the output range and select Refresh.  Furthermore, since the range has been converted to a Table, you may even add further rows of data and simply right-click to Refresh the output data range.

To improve performance of your workbook, you may want to copy the formulas in column C and paste them as Values (Paste Special > Values).  Once again, do not paste special all values at once - do so in groups of 5,000 rows to prevent Excel from crashing and/or taking unduly long to process.

Shade alternate band of rows in a filtered range

{0 Comments}

Assume the following numbers in range C5:C22.  Heading is in cell C4.

Range C5:C7 - 11
Range C8:C11 - 14
Range C12:C13 - 23
Range C14-C19 - 56
Range C20-C22 - 78

One may want to colour cells for every change in number in range C5:C22 with the following two modifications:

1. Shading should be for alternate band of rows i.e. C5:C7 should be coloured, C8:C11 should not be coloured, C12:C13 should be coloured, C14:C19 should not be coloured and C20:C22 should be coloured.

2. Shading should vary with the range being filtered i.e. If range C5:C22 is filtered to exclude 14 and 56, then numbers 11 and 78 should be coloured i.e. range C5:C7 and C20:C22 should be coloured.

To accomplish this, try the following:

a. Select cell C5 and go to Home > Format > Conditional formatting > New Rule > Use a formula to determine which cells to format

b. In the formula box there, enter the following formula

=AND(ISODD(COUNT(1/FREQUENCY(IF(SUBTOTAL(3,OFFSET(C$5,ROW(C$5:C5)-ROW(C$5),)),MATCH(C$5:C5,C$5:C5,0)),ROW(C$5:C5)-ROW(C$5)))),COUNT(1/FREQUENCY(IF(SUBTOTAL(3,OFFSET(C$5,ROW($C$5:$C$22)-ROW(C$5),)),MATCH($C$5:$C$22,$C$5:$C$22,0)),ROW($C$5:$C$22)-ROW(C$5)))>1)

c. Click on Format and select any Fill colour

d. Click on OK > OK > Apply

e. Copy cell C5, select range C6:C22 and right-click > Paste Special > Formats

With no filter in place, colouring will happen as mentioned in point 1 above.  Auto filter the range now on some values and the colour banding should change.

Remove duplicates after satisfying additional conditions

{3 Comments}

Assume a two column database of patient ID's and service availed.  One patient may avail the same service multiple times in a year due to which that record may appear as many times as the service is availed.  For e.g., if patient A001 avails the Radiology service twice, then A001 and Radiology will appear in two rows.

Once may want to create the following two reports from this database:

1. A list containing all those records where the patient availed just one service; and
2. A list containing all those records where the patient availed more than one service

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the PowerPivot add-in, then a calculated column formula can resolve this problem.

Solution for all versions of MS Excel

I have shared two solutions here:

1. Array formula and advanced filters; and
2. Only array formulas

For better understanding of the question and to view the final solution, please refer to this workbook.

Dynamically transpose data after ignoring blank cells

{0 Comments}

In range A7:A15, assume the following values A, S , empty cell, empty cell, R, A, W, A, SD.  Starting from cell B17, transpose data to the right (ignoring the blank cells).  Therefore, in range B17:F17, the result should be A, S, R, W SD.

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).

You may watch a short video of my solution here

Solution for all versions of MS Excel

You may refer to my array formula solution.

You may refer to both solutions in this workbook.

Create a master list of unique account codes from two data sources

{2 Comments}

Assume a two column database in range C4:D11.  In cell C3, type "Cheque No." (without double quotes) and in cell D3, type "As per Bank Account" (without double quotes).  Entries in range C4:C11 are cheque numbers and in range D4:D11 are Dollar values.  Let's refer to this database as "The Bank Dump".

Assume another two column database in range I4:J9.  In cell I3, type "Cheque No." (without double quotes) and in cell D3, type "As per SAP" (without double quotes).  Entries in range I4:I9 are cheque numbers and in range J4:J9 are Dollar values.  Let's refer to this database as "The SAP Dump".

Please note that cheque numbers in range C4:C11 may or may not be present in range I4:I9.  Likewise cheque numbers in range I4:I9  may or may not be present in range C4:C11

One may want to create a reconciliation statement i.e. a three column database showing "Cheque No." in the first column, "As per Bank Account" in the second column and "As per SAP" in the third column.

Herein the first challenge is to create a master list of Account codes from two ranges, C4:C11 and I4:I9 and then the corresponding amounts from the two databases.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).

You may refer to my solution in this workbook.

Solution for all versions of MS Excel

One may use using the Data > Consolidation feature.  Please try the following steps:

1. Select cell C15 and go to Data > Consolidate
2. Under Function, choose SUM
3. Under reference, select range C3:D11 and the click on Add
4. Under reference, select range I3:J9 and the click on Add
5. Check the boxes for Top row and Left column
6. Click on OK

The result should be the three column database as desired above.

Dynamically extract unique values with multiple conditions

{6 Comments}

Assume there are two columns with data repeated in all both columns.  One may wan to extract the unique text values from a specific column based on conditions specified by end-user.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table)

Solution for all versions of MS Excel

While this solution works for all versions of MS Excel, it uses an array formula (Ctrl+Shift+Enter).  Array formulas, if used extensively in the workbook, adversely effect the system's performance.

You may refer to my solution in this workbook.

Dynamically extract unique values with no conditions

{4 Comments}

Given a single column with duplicates appearing, one may want to dynamically extract unique values to another range.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).

Solution for all versions of MS Excel

While this solution works for all versions of MS Excel, it uses an array formula (Ctrl+Shift+Enter).  Array formulas, if used extensively in the workbook, adversely effect the system's performance.

You may refer to my solution in this workbook.

Count unique values with conditions

{13 Comments}

There is a three column database with month in column A, numbers in column B and names in column C.  I want to count unique numbers in column B for a specific month and name provided by the end user.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the PowerPivot add-in, then a simple DAX formula can solve this problem.

Solution for all versions of MS Excel

While this solution works for all versions of MS Excel, it uses an array formula (Ctrl+Shift+Enter).  Array formulas, if used extensively in the workbook, adversely effect the system's performance.

You may refer to both solutions in this workbook.

To count unique values with conditions on large databases, you may refer to the following link