Tags: FREQUENCY

Compute configuration count using Set Theory and Venn Diagrams

{0 Comments}

Assume a table which lists attendees for a Company's Annual day function.  In this Table, data for every attendee is shown on a separate row so if an employee attends the function with his/her spouse and three children, then there will be 5 rows for that employee.

The question is to determine the count of the following family configuration:

1. Employees only (those who attended without spouse and children); and
2. Employees, spouse and children (Family); and
3. Employees and spouse (no children); and
4. Employees and children (no spouse)

You may refer to my solution in this workbook.  I have solved this problem using:

1. MS Excel Formulas based on Set Theory and Venn Diagram; and
2. PowerPivot

Compute MODE of all numbers split across multiple worksheets

{2 Comments}

Assume numbers are typed in range A1:A2 of multiple worksheets in a workbook.  The task is the compute the MODE of these numbers.  Mode is defined as the value which appears most frequently in a range of cells.  So, if one types 1,3,4,3,5,6 in range A1:A6, then the mode will be 3 - 3 appears maximum number of times in the range.

In MS Excel, there is a built in way to compute the MODE.  The formula for the same is

=MODE(A1:A6)

Unfortunately, MODE() is not a 3D function and therefore, something like this return a #REF error

=MODE(sheet1:sheet3!A1:A6)

This behavior seems somewhat vague because other basic Mathematical and Statistical functions such as SUM(), COUNT(), AVERAGE(), MAX(), MIN(), VAR(), and STDEV() work just fine across multiple worksheets.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem.

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).  In Power Query, there is a feature to append data from multiple worksheets into one running range.  Once this is done, the running range can be transferred to an MS Excel worksheet on which the MODE function can be used.

Solution for all versions of MS Excel

To compute MODE across multiple worksheets, you may refer to my solution in this workbook.

Summarise data from multiple sheets with multiple conditions – Part II

{64 Comments}

Assume date wise revenue data has been shown on two sheets - Sports wear and Fitness equipment (Data worksheets).  On each of these two worksheets, there are two columns - Date and Revenue.  On the Summary worksheet, the user would like to summarise data (Count instances and sum revenue figures) based on multiple conditions.  The conditions for summarising are:

1. User specifies upper and lower revenue figures - Revenue figures from both sheets which satisfy the conditions (two conditions) should get added as well as rows should get counted (to know instances)
2. User specifies upper and lower date limits - Revenue figures from both sheets which satisfy the conditions (two conditions) should get added as well as rows should get counted (to know instances)
3. User specifies upper and lower revenue figures AND User specifies upper and lower date limits - Revenue figures from both sheets lying between these figures and dates should get added as well as rows should get counted (to know instances)

The restrictions are as follows:

1. Do not want to consolidate data from two data worksheets into a single worksheet
2. Do not want to perform any ancillary calculations on the two data worksheets

If these conditions are relaxed, then alternative solution such as Pivot Tables and SUMPRODUCT() function can be used.

You may refer to my solution in this workbook.

I have solved a related problem of summarising data from multiple worksheets here.

Dynamically extract unique values from a filtered range

{0 Comments}

Data in a two column database (A3:B13) can be Auto filtered on column A with one or many conditions.  Data in column B will be numbers only (positive, negative or 0's) - no text values.

After filtering data in column A, one may want to extract the unique numbers from column B to cell A18 (and then downwards).  Data from cell A18 downwards should keep changing with different filter conditions specified in the Auto filter drop down of column A.

You may refer to my solution in the this workbook.

To extract unique values from an unfiltered range without any conditions, you may refer to the following link

To extract unique values from an unfiltered range with multiple conditions, you may refer to the following link

Determine the maximum number of consecutive 1′s appearing in a range

{19 Comments}

Assume a database where customers are listed from cell A6 down.  From cell B5 to the right months are entered from April to March (B5:M5).  In B6:M6 (Customer 1), a user enters 1's and 0's.  A value of 1 respresents "Cheque bounced" and 0 represents "Cheque honoured".  Similar data is entered for other customers in B7:M500.

One may want to know the maximum number of consecutive "Cheque bounce events" for all customers listed in column A without using spare rows and columns.

In cell N6, enter the following array formula (Ctrl+Shift+Enter)

=IF(MAX(FREQUENCY(IF(B6:M6=1,COLUMN(B6:M6)),IF(B6:M6=0,COLUMN(B6:M6))))=1,0,MAX(FREQUENCY(IF(B6:M6=1,COLUMN(B6:M6)),IF(B6:M6=0,COLUMN(B6:M6)))))

Shade alternate band of rows in a filtered range

{0 Comments}

Assume the following numbers in range C5:C22.  Heading is in cell C4.

Range C5:C7 - 11
Range C8:C11 - 14
Range C12:C13 - 23
Range C14-C19 - 56
Range C20-C22 - 78

One may want to colour cells for every change in number in range C5:C22 with the following two modifications:

1. Shading should be for alternate band of rows i.e. C5:C7 should be coloured, C8:C11 should not be coloured, C12:C13 should be coloured, C14:C19 should not be coloured and C20:C22 should be coloured.

2. Shading should vary with the range being filtered i.e. If range C5:C22 is filtered to exclude 14 and 56, then numbers 11 and 78 should be coloured i.e. range C5:C7 and C20:C22 should be coloured.

To accomplish this, try the following:

a. Select cell C5 and go to Home > Format > Conditional formatting > New Rule > Use a formula to determine which cells to format

b. In the formula box there, enter the following formula

=AND(ISODD(COUNT(1/FREQUENCY(IF(SUBTOTAL(3,OFFSET(C$5,ROW(C$5:C5)-ROW(C$5),)),MATCH(C$5:C5,C$5:C5,0)),ROW(C$5:C5)-ROW(C$5)))),COUNT(1/FREQUENCY(IF(SUBTOTAL(3,OFFSET(C$5,ROW($C$5:$C$22)-ROW(C$5),)),MATCH($C$5:$C$22,$C$5:$C$22,0)),ROW($C$5:$C$22)-ROW(C$5)))>1)

c. Click on Format and select any Fill colour

d. Click on OK > OK > Apply

e. Copy cell C5, select range C6:C22 and right-click > Paste Special > Formats

With no filter in place, colouring will happen as mentioned in point 1 above.  Auto filter the range now on some values and the colour banding should change.

Extract n th. most frequently occurring item from a database

{2 Comments}

Assume a database in range D5:D11 with the following values - Black, Black, Purple, Purple, Grey, Grey, Grey.  To extract the most frequently occurring item, one may use the following formula

=INDEX($D$5:$D$11,MODE(MATCH($D$5:$D$11,$D$5:$D$11,0)))

To extract the second most frequently occurring item, try the following:

1. Enter 2 in cell C16:
2. In cell D16, enter the following array formula (Ctrl+Shift+Enter)

=LOOKUP(2,1/(LARGE(FREQUENCY(IF(INDEX(COUNTIF($D$5:$D$10,$D$5:$D$10),0),MATCH($D$5:$D$10,$D$5:$D$10,0)),ROW($D$5:$D$10)-ROW($D$4))+(ROW($D$5:$D$11)/10000),C16)=FREQUENCY(IF(INDEX(COUNTIF($D$5:$D$10,$D$5:$D$10),0),MATCH($D$5:$D$10,$D$5:$D$10,0)),ROW($D$5:$D$10)-ROW($D$4))+(ROW($D$5:$D$11)/10000)),$D$5:$D$10)

To extract the n th most frequently occurring item, enter that n th number in cell C16.

Remove duplicates after satisfying additional conditions

{0 Comments}

Assume a two column database of patient ID's and service availed.  One patient may avail the same service multiple times in a year due to which that record may appear as many times as the service is availed.  For e.g., if patient A001 avails the Radiology service twice, then A001 and Radiology will appear in two rows.

Once may want to create the following two reports from this database:

1. A list containing all those records where the patient availed just one service; and
2. A list containing all those records where the patient availed more than one service

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).

Solution for all versions of MS Excel

While this solution works for all versions of MS Excel, it uses a combination of array formulas (Ctrl+Shift+Enter) and Advanced Filters.  Array formulas, if used extensively in the workbook, adversely effect the system's performance.

For better understanding of the question and to view the final solution, please refer to this workbook.

Dynamically transpose data after ignoring blank cells

{0 Comments}

In range A7:A15, assume the following values A, S , empty cell, empty cell, R, A, W, A, SD.  Starting from cell B17, transpose data to the right (ignoring the blank cells).  Therefore, in range B17:F17, the result should be A, S, R, W SD.

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).

You may watch a short video of my solution here

Solution for all versions of MS Excel

You may refer to my array formula solution.

You may refer to both solutions in this workbook.

List down most frequently appearing names in descending order of frequency

{0 Comments}

Given a list of names in a worksheet range, one may want to pull out, the most frequently appearing names in descending order of frequency, to another range.  Furthermore, one may want to provide an additional date criteria as well.

You may refer to my solution in this workbook.