Calculate a unique count with conditions in a Pivot Table

{ 12 Comments }

Assume a three column table arranged as follows: Circle, Date of Fault and ID.  Dates in the date range span one week – November 26, 2012 to December 2, 2012.  A particular equipment can be only one specific Region and the same equipment an go faulty multiple times within one week.  Data for one week is about 8,400 rows.

There are three questions to be answered from this data:

1. The Circle wise, count of ID’s which went faulty more than twice between November 26, 2012 and December 2, 2012; and
2. The Circle wise, count of faulty instances more than twice between November 26, 2012 and December 2, 2012; and
3. Determine individual sites for 1 and 2 above

The difference between 1 and 2 above is “If a certain ID goes down 4 times, then for question1, the answer should be 1.  For question2, the answer should be 4.”

The first question basically boils down to “Count of unique ID’s by Circle which went faulty more than twice.”

There are two ways one can go about answering the questions above:

Solution A – For Excel 2010 and higher versions – This solution is for those using the PowerPivot MS Excel add-in for Excel 2010 and higher versions.

Solution B – For all versions of MS Excel – This solution will work in all versions of MS Excel but for those using Excel 2010 and higher versions, the PowerPivot solution would be far more efficient.

The steps for creating a pivot table under Solution B for answering both questions above are:

1.  Count of downtime sites.xlsx is saved in a folder on the desktop;
2. Open the workbook, select the data on the Base_Data sheet (including the first row as the header row – it will be range A1:C8741.  Ensure that the header row has some distinctive formatting such as Bold or some colour) and press Ctrl+F3 > New.  In the Name box, type Dummy and click on OK > Close.
3. To cross check that the name assigned above has indeed been assigned correctly, select the data range once again and in the Name box (left of the formula bar), Dummy should appear.
4. Select range A1:C8741 of the Base_Data sheet again and press Ctrl+T to convert this range into a Table.  Ensure that the “My Table has headers” box is checked.  Save the workbook.
5. Open a new worksheet and go to Data > From Other Sources > From Microsoft Query
6. Under Databases, select Excel files > OK
7. In the Directories dialog box, navigate to the folder on the desktop where the workbook file is saved.  So for me, it is saved under C:\Users\Ashish\Desktop\ and double click on the folder where the workbook is saved.
8. In the left hand side window, select the Count of downtime sites.xlsx file and click on OK
9. With Dummy selected, click on the > symbol to bring over all columns of this named range to the right hand side box 10. Click on Next three times
11. Select the option of View Data or Edit Query in Microsoft Query
12. Click on the SQL button, delete the contents in the white space there and paste the following SQL Query

SELECT ucase(dummy.Circle) AS ‘Circle’, ucase(dummy.Indus_Site_ID) AS ‘Indus_site_ID’, Count(dummy.Indus_Site_ID) AS ‘fault_frequency’
FROM `C:\Users\Ashish\Desktop\Count of downtime sites.xlsx`.dummy dummy
GROUP BY ucase(dummy.Circle), ucase(dummy.Indus_Site_ID)
HAVING (Count(dummy.Indus_Site_ID)>2)

13. Click on OK and on the message box which appears, click on OK
14. Under File, select the last option – Return Data to Microsoft Excel
15. At this stage, if you wish to get data in a tabular form, then select Table.  If you directly want a pivot table, select the second option button – Pivot Table.  For this example, select Pivot Table and in the cell reference box, select any cell where you would like to the result to appear, say cell A1.  Click on OK
16. A counter will run at the bottom left hand side with the title of Reading Data
17. Drag Circle and ID to the to the Row Labels
18. Drag Fault Frequency to the Value Area twice
19. Right click on any one number in the fault frequency column and under Summarise Value by, select Count
20. Right click on any value in the ID column and under Expand/Collapse, select Collapse Entire Field.

Leave a Comment

Your email address will not be published.

*

  • I’m stuck on a pivot table project. Data is simple timeclock records Date, employee, in outbreak, inbreak, out hours I created the pivot table to report hours per day no problem. Subtotal will report the hours per week, but of course doesn’t break out OT for hours>40. I have played with “calculated items, and fields” and I can get the breakout per week, but run into trouble for the 2 week sum. I need my report to show a two week frame with hours<40 as regular time for each week, hours>40 as OT for each week and then the sum of regular hours and OT hours for the two week period. The calculated items and fields apply the same formulas to the totals if the pivot table is filtered for the time period. I know I can add a column to the data with a sumif(week#…. type formula, but it seems there is a more elegant solution from within the pivot table. Any ideas? Goal employee/mon,tue….hours/ot/mon,tue,…hours/ot/total hours/total ot thanks!

    • Hi,

      Please be specific in asking your question. In which specific case is the result incorrect. Highlight those cells in yellow and also share your expected result.

      • Thanks for the fast reply. My issue is to how to calculate market share for various brands of various segments.
        For that I have created a measure %MS in power pivot. It is working fine for by segment by brands in pivot table. (You can see the pivot table sheet). When I remove the segment field from the pivot table it is showing % of total sale by brand which is wrong.
        And I required it in more comprehensive format which is given in require format sheet. This is showing % market share for selected brands by district. When I put the market share measure in this pivot table, it is showing the wrong market share. I think it showing % of selected brand volume / % of total sale volume.
        For example I have given three district results in the right hand side of this pivot table.

        Please help….

        • Hi,

          How did you come up with 53% in cell K7? This is what I did

          1. Filter the Database on 100 Pipers BII (column D). The total is 5250.725
          2. Filter the Database on 100 Pipers BII (column D) and Aurangabad (column C). The total is 26.2499
          3. The market share should be 26.2499/5250.725 which is 0.50%

          Please clarify.

          • As I told you I am calculating market share based on Segment sale. For example, you can see the Pivot table sheet.
            In this pivot table I took Product Category, Segment and brand in row label. Then took District in column label and Sales volume and %MS (market share) in value field. In cell reference C8 you can get the 100 Pipers BII market share is coming 53% for Aurangabad district which calculation is as following :-
            (100 Pipers BII volume of Aurangabad district)/ (Scotch Blend Prem BII (100 piper BII segment) volume of Aurangabad district). 26/49 =53%

  • Dear Ashish, I Know this solution, but this is not for giving me the result as per my requirement.
    It is giving only selected segment brands market share. But when you see the my required sheet, brand are coming from different segments in one place. Those are only my company brands.

    Please check once again.

    • Hi,

      You may refer to my solution in this workbook. I think some of your answers are wrong. You report the MS(%) of Blenders Pride in Bhagalpur as 2.2%. I think that figure should be 64.1%. Please check the figures on “Ashish solution” worksheet.

      Hope this helps.

  • Thanks Ashish, it is working.

    I saw your DAX formula in power pivot. By creating an helping column, you are calculating market share.

    Currently it is small database, but if I run the same on a big database which has 70 million records. Can it affect on the file size or slow it down?

    Can there is any possibility to create this measure without creating a helping column.

    • Hi,

      You are welcome. You will have to try out the solution on a large dataset. Using the EARLIER() function in a calculated field (measure) causes problems as mentioned at this link.

      If you can share your large file with me, I can try out my solution.