Performing calculations or writing formulas on large databases pose the following major problems:
1. Time taken to process is unduly long; and
2. File size increases disproportionately
A typical problem is fetching data from another worksheet or workbook via the VLOOKUP() function. Quite often, when one confirms the VLOOKUP() formula in a single cell with the Enter key, then at the bottom right of your MS Excel screen (the task bar), a white progress bar appears which reads "Calculating (4 processors) 1%". Once has to wait for the processing to complete before copying the formula down. The process of copying is obviously just as time consuming and resource intensive.
In these situations, a workaround is the JOIN the tables in MS Access and then get the data back in MS Excel. For accomplishing this, one must have at least some working knowledge of "SQL Query" writing.
If you are using the Power Query add-in for MS Excel 2010 and higher versions, then one can fetch data from another worksheet or workbook without writing a single formula. Furthermore, there is absolutely no lag in generating the result.
You may refer to my Power Query solution in this workbook (This is a 20 MB file)
You may watch a short video of my solution here
Assume that there are three separate tables showing the following information:
1. Date of visit data for visitors to a certain recreation facility. The same visitor may visit the facility multiple times
2. Fee per visit in different cites
3. Region in which each city lies
The question is to analyse the three tables above via a Pivot Table to generate the following:
1. Region wise and visitor wise:
a. Fee per visit
b. Frequency of visit
2. Revenue collected by month wise and by visitor
In this workbook, I have shared two solutions:
1. PowerPivot solution - This solution answers both questions above without using any calculated/ancillary columns in the base data. By establishing simple relationships in the PowerPivot window and by writing two calculated Field formulas, both questions above have been answered. To use this PowerPivot solution, you need to be using the PowerPivot add-in for MS Excel. This add-in is only available for Excel 2010 and higher versions.
2. Pivot Table solution - This solution answers both questions above by using calculated/ancillary columns in the base data.
Here's another example. Assume a four column table showing Date of session, Client, Location and Participant Name. Assume another four column table showing Client, Date of session, Date of invoice and Amount Billed. The task is to determine the Amount billed per location. You may refer to my PowerPivot and Pivot Table solution in this workbook.
Assume a two column database of patient ID's and service availed. One patient may avail the same service multiple times in a year due to which that record may appear as many times as the service is availed. For e.g., if patient A001 avails the Radiology service twice, then A001 and Radiology will appear in two rows.
Once may want to create the following two reports from this database:
1. A list containing all those records where the patient availed just one service; and
2. A list containing all those records where the patient availed more than one service
Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem
Solution for MS Excel 2010 and higher versions
If you are using the PowerPivot add-in, then a calculated column formula can resolve this problem.
Solution for all versions of MS Excel
I have shared two solutions here:
1. Array formula and advanced filters; and
2. Only array formulas
For better understanding of the question and to view the final solution, please refer to this workbook.
The VLOOKUP() function returns data from a lookup_array irrespective of the filter setting of the lookup_array. To make the lookup_array of the VLOOKUP() function work only on the visible cells of a filtered range, refer to this workbook.
One may want to create interactive graphs by allowing users to select indicators and years to be plotted (via check boxes and scroll bars). I have presented three cases in this workbook.
Validate a range of cells based on multiple conditions - the exact question can be read from here. You may refer to my solution in the this file.
The VLOOKUP() function accepts a definite lookup array (second input in the VLOOKUP()) specified by the end-user. In a situation where the lookup array is unknown, a function may have to cycle through various lookup ranges to cull out the required value.
You may refer to my solution in this workbook.
Assume multiple worksheets in a workbook (all having the same structure). In a summary sheet, one may want to tabulate data from these multiple worksheets. Essentially, a way to link to the same specific cells in multiple worksheets to the Summary sheet. The formula in the Summary sheet should just be copied and pasted to multiple rows to bring over data from the multiple worksheets.
You may refer to my solution in this workbook.