# Tags: SQL QUERY

Assume a three column table arranged as follows: Circle, Date of Fault and ID.  The dates span one week, from November 26, 2012 to December 2, 2012.  A particular piece of equipment can be in only one specific Region, and the same equipment can go faulty multiple times within one week.  Data for one week is about 8,400 rows.

There are three questions to be answered from this data:

1. The Circle-wise count of IDs which went faulty more than twice between November 26, 2012 and December 2, 2012; and
2. The Circle-wise count of fault instances for IDs which went faulty more than twice between November 26, 2012 and December 2, 2012; and
3. Determine the individual sites for 1 and 2 above

The difference between 1 and 2 above is "If a certain ID goes down 4 times, then for question 1, the answer should be 1.  For question 2, the answer should be 4."

The first question basically boils down to "Count of unique IDs by Circle which went faulty more than twice."
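To make the distinction between questions 1 and 2 concrete, here is a minimal Python sketch.  The sample rows, circle names and site IDs are hypothetical (they are not taken from the workbook), and the function name `summarise` is my own.  It only illustrates the counting logic, not the Excel solution itself.

```python
from collections import Counter

# Hypothetical sample rows: (Circle, Date of Fault, ID). Not from the workbook.
rows = [
    ("North", "2012-11-26", "S1"),
    ("North", "2012-11-27", "S1"),
    ("North", "2012-11-29", "S1"),
    ("North", "2012-12-01", "S1"),
    ("North", "2012-11-28", "S2"),
    ("South", "2012-11-26", "S3"),
    ("South", "2012-11-30", "S3"),
    ("South", "2012-12-02", "S3"),
]


def summarise(rows, threshold=2):
    """Return {circle: (ids_over_threshold, fault_instances_for_those_ids)}."""
    # Number of fault rows per (Circle, ID) pair.
    faults_per_id = Counter((circle, site_id) for circle, _date, site_id in rows)
    summary = {}
    for (circle, _site_id), n in faults_per_id.items():
        if n > threshold:  # "more than twice" when threshold is 2
            ids, instances = summary.get(circle, (0, 0))
            summary[circle] = (ids + 1, instances + n)
    return summary


result = summarise(rows)
print(result)
```

In this sample, S1 goes faulty four times, so it contributes 1 to question 1 and 4 to question 2 for its circle, while S2 (one fault) is excluded from both.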

There are two ways one can go about answering the questions above:

Solution A - For Excel 2010 and higher versions - This solution is for those using the PowerPivot MS Excel add-in for Excel 2010 and higher versions.

Solution B - For all versions of MS Excel - This solution will work in all versions of MS Excel but for those using Excel 2010 and higher versions, the PowerPivot solution would be far more efficient.

The steps for creating a pivot table under Solution B for answering both questions above are:

1. Save Count of downtime sites.xlsx in a folder on the desktop.
2. Open the workbook and select the data on the Base_Data sheet, including the first row as the header row (range A1:C8741).  Ensure that the header row has some distinctive formatting such as bold or a colour.  Press Ctrl+F3 > New.  In the Name box, type Dummy and click on OK > Close.
3. To cross check that the name assigned above has indeed been assigned correctly, select the data range once again and in the Name box (left of the formula bar), Dummy should appear.
4. Select range A1:C8741 of the Base_Data sheet again and press Ctrl+T to convert this range into a Table.  Ensure that the “My Table has headers” box is checked.  Save the workbook.
5. Open a new worksheet and go to Data > From Other Sources > From Microsoft Query
6. Under Databases, select Excel files > OK
7. In the Directories dialog box, navigate to the folder on the desktop where the workbook file is saved.  In my case, it is saved under C:\Users\Ashish\Desktop\.  Double click on the folder where the workbook is saved.
8. In the left hand side window, select the Count of downtime sites.xlsx file and click on OK
9. With Dummy selected, click on the > symbol to bring over all columns of this named range to the right hand side box
10. Click on Next three times
11. Select the option of View Data or Edit Query in Microsoft Query
12. Click on the SQL button, delete the contents in the white space there and paste the following SQL Query

SELECT ucase(dummy.Circle) AS 'Circle', ucase(dummy.Indus_Site_ID) AS 'Indus_site_ID', Count(dummy.Indus_Site_ID) AS 'fault_frequency'
FROM `C:\Users\Ashish\Desktop\Count of downtime sites.xlsx`.dummy dummy
GROUP BY ucase(dummy.Circle), ucase(dummy.Indus_Site_ID)
HAVING (Count(dummy.Indus_Site_ID)>2)

13. Click on OK and on the message box which appears, click on OK
14. Under File, select the last option – Return Data to Microsoft Excel
15. At this stage, if you wish to get data in a tabular form, then select Table.  If you directly want a pivot table, select the second option button – Pivot Table.  For this example, select Pivot Table and in the cell reference box, select any cell where you would like the result to appear, say cell A1.  Click on OK
16. A counter will run at the bottom left hand side with the title of Reading Data
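17. Drag Circle and ID to the Row Labels
18. Drag Fault Frequency to the Value Area twice
19. Right click on any number in one of the two fault frequency columns and under Summarize Values By, select Count.  Since the query returns one row per ID, the Count column answers question 1 (number of IDs) and the column left as Sum answers question 2 (number of fault instances).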
20. Right click on any value in the ID column and under Expand/Collapse, select Collapse Entire Field.
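The query and the pivot steps above can be sketched outside Excel as follows.  This is only an analogue, under stated assumptions: it uses Python's built-in `sqlite3` module rather than MS Query, SQLite's `upper()` stands in for `ucase()`, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data; the column names follow the query above.
rows = [
    ("north", "s1"), ("North", "S1"), ("North", "S1"), ("North", "S1"),
    ("North", "S2"),
    ("South", "S3"), ("South", "S3"), ("South", "S3"),
    ("South", "S4"), ("South", "S4"), ("South", "S4"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dummy (Circle TEXT, Indus_Site_ID TEXT)")
conn.executemany("INSERT INTO dummy VALUES (?, ?)", rows)

# SQLite analogue of the MS Query statement: one row per site with > 2 faults.
per_site = conn.execute(
    """
    SELECT upper(Circle) AS Circle,
           upper(Indus_Site_ID) AS Indus_Site_ID,
           COUNT(Indus_Site_ID) AS fault_frequency
    FROM dummy
    GROUP BY upper(Circle), upper(Indus_Site_ID)
    HAVING COUNT(Indus_Site_ID) > 2
    ORDER BY 1, 2
    """
).fetchall()
conn.close()

# Pivot-table step: per Circle, Count of rows and Sum of fault_frequency.
per_circle = {}
for circle, _site, freq in per_site:
    count, total = per_circle.get(circle, (0, 0))
    per_circle[circle] = (count + 1, total + freq)

print(per_site)
print(per_circle)
```

Each row of `per_site` is an individual site (question 3).  In `per_circle`, the first number of each pair corresponds to the Count column of the pivot table (question 1) and the second to the Sum column (question 2).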

Assume a four column input data range (refer sheet named "Input" of this workbook) arranged as follows:

1. First two columns are Group and Type which have text values
2. Third column is a month column with entries such as Dec_11 and Sep_12.  These denote 12 months ended December 2011 and 9 months ended September 2012
3. Fourth column is Revenue which has numbers

The objective is to compute the pro rata growth rate of Sep_12 revenue over Dec_11 revenue within a Pivot Table.  In the workbook (see link above), I tried to use a calculated item formula within a Pivot Table to compute this but encountered three difficulties in doing so (refer to the Notes section of the sheet named "Result of Calc item formula").

There are two ways one can go about answering the questions above:

Solution A – For Excel 2010 and higher versions – This solution is for those using the PowerPivot MS Excel add-in for Excel 2010 and higher versions.

Solution B – For all versions of MS Excel – This solution will work in all versions of MS Excel but for those using Excel 2010 and higher versions, the PowerPivot solution would be far more efficient.

The steps for creating a pivot table under Solution B are:

The objective is to change the orientation of the base data so that Dec_11 and Sep_12 become separate columns.  A calculated field formula can then be written within a Pivot Table instead of a calculated item formula.  The process to change the orientation of the base data is described below:

1. Select range A2:D14 of Input sheet and press Ctrl+F3 > New.  In the name box, type Dummy
2. Select range A2:D14 of Input sheet and press Ctrl+T to convert to a Table
3. Open a third worksheet in the workbook (downloaded from the link above) and save the workbook, say in any folder on the desktop
4. While the active cell is any cell of the new worksheet, go to Data > From Other Sources > From Microsoft Query
5. Under Databases, select Excel files > OK
6. In the Directories dialog box, navigate to the folder on the desktop where the workbook is saved.  In my case, it is saved under C:\Users\Ashish\Desktop\.  Click on the folder where the workbook is saved
7. In the left hand side window, select the workbook and click on OK
8. With Dummy selected, click on the > symbol to bring over all columns of this named range to the right hand side box
9. Click on Next three times
10. Select the option of View Data or Edit Query in Microsoft Query
11. Click on the SQL button and delete whatever you see in the box there
12. Enter the following SQL statement in the white box

Transform sum(rev)
SELECT Dummy.`GROUP`, Dummy.TYPE
FROM Dummy
Group by Dummy.`GROUP`, Dummy.TYPE
Pivot Mth

13. When you click on OK, you will see a four column database (refer sheet named "Result of MS Query")
14. Under File, select the last option – Return Data to Microsoft Excel
15. At this stage, if you wish to get data as you saw in MS Query then select Table.  If you directly want a pivot table, select the second option button – Pivot Table.  Select any cell where you would like the result to appear, say cell A1.
16. Click on OK.  A counter will run at the bottom left hand side with the title of Reading Data
17. If you had selected pivot table in step 15 above, then the pivot table grid/layout will appear
18. You may now drag fields to create a pivot table
19. You may now write a calculated field formula within the Pivot Table (refer sheet named "Output")

As can be observed in the "Output" sheet, all the shortcomings mentioned in the Notes section of the "Result of Calc item formula" sheet have been overcome.
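The reorientation performed by the TRANSFORM ... PIVOT statement can be sketched in Python as follows.  The sample rows are hypothetical, and the function names are my own.  The growth formula is an assumption too: the post does not spell out the calculated field, so this sketch reads "pro rata growth" as annualising the nine-month Sep_12 figure (multiplying by 12/9) before comparing it with Dec_11.

```python
from collections import defaultdict

# Hypothetical long-format rows: (Group, Type, Mth, rev). Not from the workbook.
rows = [
    ("A", "X", "Dec_11", 1200.0),
    ("A", "X", "Sep_12", 600.0),
    ("A", "X", "Sep_12", 390.0),
    ("B", "Y", "Dec_11", 800.0),
    ("B", "Y", "Sep_12", 540.0),
]


def crosstab_sum(rows):
    """Mimic TRANSFORM Sum(rev) ... GROUP BY Group, Type PIVOT Mth."""
    wide = defaultdict(lambda: defaultdict(float))
    for group, type_, month, rev in rows:
        wide[(group, type_)][month] += rev  # one column per distinct month
    return {key: dict(months) for key, months in wide.items()}


def pro_rata_growth(dec_11, sep_12, months=9):
    """Assumed definition: annualise the 9-month figure, then compare."""
    return (sep_12 * 12 / months) / dec_11 - 1


wide = crosstab_sum(rows)
growth = {
    key: round(pro_rata_growth(m["Dec_11"], m["Sep_12"]), 4)
    for key, m in wide.items()
}
print(wide)
print(growth)
```

Once Dec_11 and Sep_12 sit side by side as columns of the same row, a single formula can refer to both, which is what makes a calculated field possible.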

Assume a two column database of approximately 20,000 rows (say from A1:B20237).  The first column has vendor names and the second column has Part codes (alphanumeric string).  Headings are in A1:B1, say Vendor in cell A1 and Part_code in cell B1.  One vendor supplies multiple parts and therefore there would be repetitions of vendor names in column A.

The task is to reorient this database to show each vendor name once in column A, with the parts supplied by that vendor appearing in different columns of that row.  If one attempts to create a pivot table, the problem would be with the data area of the pivot table.  When one drags the part codes (alphanumeric string) column to the data area, the count of part codes would appear instead of the actual part codes.  This is the very nature of a pivot table.  Since it is a summarisation tool, one can never see text entries in the data area of a pivot table.

You may download the workbook showing both solutions (using MS Query and Power Query) from here.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem:

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (minimal formulas) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).  You may refer to Solution2 - Power Query worksheet.

You may watch a short video of my solution here

Solution for all versions of MS Excel

You may refer to Solution1 - MS Query worksheet.  To reorient the data i.e. to view text entries in the data area of a pivot table, try the following approach:

1. In cell C1, type Count.  So the value in cell A1 is Vendor, in cell B1 is Part_code and in cell C1 is Count
2. In cell C2, enter =$B$1&" "&COUNTIF(A$2:A2,A2) and copy down.  This returns Part_code 1 for a vendor's first row, Part_code 2 for its second row and so on.  Since there are quite a few rows here, I would suggest copying in batches i.e. first copy down 5,000 rows, then another 5,000 rows and so on.  This is solely to prevent Excel from crashing and/or taking unduly long to process.
3. Select range A1:C20237 and press Ctrl+F3 > New.  In the name box, enter Dummy
4. Select range A1:C20237 and press Ctrl+T to convert to a Table.  This is to take care of data being added beyond row 20237.
5. Save the file and open a new worksheet (in the same workbook)
6. While on any cell in this new worksheet, go to Data > From Other Sources > From Microsoft Query > Excel Files > OK
7. Navigate to the folder where you saved the file in step 5 above, select the file and click on OK
8. With Dummy selected by default, press the > symbol.  This will show you three column headings in the right hand side box
9. Click on Next three times
10. Select the second option button and click on Finish
11. Click on the SQL button and delete all contents that you see there.  Type the following in the blank white space

TRANSFORM first(Part_code)
SELECT Vendor
FROM dummy
GROUP BY Vendor
PIVOT count

12. Click on OK
13. Go to File > Return Data to Microsoft Excel
14. In the Import Data box, select Table and in Existing sheet, select cell A1
15. Click on Finish

Data should now appear in the desired format.
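The two ideas in the steps above (the running COUNTIF label in column C and the TRANSFORM first(Part_code) crosstab) can be sketched in Python as follows.  The vendor names and part codes are hypothetical, and the function names are my own.  This only illustrates the logic; it is not a replacement for the MS Query steps.

```python
from collections import defaultdict

# Hypothetical (Vendor, Part_code) rows. Not from the workbook.
rows = [
    ("Acme", "P-100"),
    ("Bolt", "Q-7"),
    ("Acme", "P-200"),
    ("Acme", "P-300"),
    ("Bolt", "Q-9"),
]


def add_running_label(rows, header="Part_code"):
    """Mimic the column C formula: header text plus a running count per vendor."""
    seen = defaultdict(int)
    labelled = []
    for vendor, part in rows:
        seen[vendor] += 1  # same role as COUNTIF(A$2:A2,A2)
        labelled.append((vendor, part, f"{header} {seen[vendor]}"))
    return labelled


def pivot_first(labelled):
    """Mimic TRANSFORM first(Part_code) ... GROUP BY Vendor PIVOT count."""
    wide = defaultdict(dict)
    for vendor, part, label in labelled:
        # Keep the first part code seen for each (vendor, label) pair.
        wide[vendor].setdefault(label, part)
    return dict(wide)


wide = pivot_first(add_running_label(rows))
print(wide)
```

Because every (Vendor, label) pair is unique, first() never has to choose between several part codes.  That is why the running count in column C lets text values show up where a pivot table would only show a count.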

For any changes to data in range A2:B20237, right-click on any cell in the output range and select Refresh.  Furthermore, since the range has been converted to a Table, you may even add further rows of data and simply right-click to Refresh the output data range.

To improve performance of your workbook, you may want to copy the formulas in column C and paste them as Values (Paste Special > Values).  Once again, do not convert all the formulas at once; do so in groups of 5,000 rows to prevent Excel from crashing and/or taking unduly long to process.