Tags: MIN

Here's a simple Sales data of a retail Store which sells Apple Products.  Since a customer can transact multiple times, there will be repetitions in the Cust ID column.  While Cust ID 123 and 782 purchased multiple products from the same Store in one transaction, Cust ID 53 purchased multiple products from different stores (Store ID 165 and 45) [and therefore the Order ID's are also different (Order ID 2 and 6)].

On this small sample, one may want to identify "Cross and up selling opportunities" i.e. one may want to know which are the Customers that can be approached for selling more products to.  So for e.g. one may want to know which Customers have bought only one product so far.  A case in point being the Apple Watch - Customer ID 2442 and 428 bought only this product.  The other Customers who bought the Apple Watch also bought atleast one more product.  Therefore, Cust ID 2442 and 428 could be approached for buying other products as well.

Solving this via conventional Excel formulas and Pivot Tables would prove to be a formidable challenge.  I have solved this problem using a Data visualisation software from Microsoft called PowerBI desktop (it can be downloaded free from the Microsoft website).  This problem can also be solved in MS Excel using Power Query and Power Pivot.

There are 3 sections in the image below - Table at the top (First Table), slicers at the right and another Table at the bottom (Second Table).

Interpretation of First Table

1. 1 appearing at the intersection of APPLE TV (row labels) and APPLE TV (column labels) represents that there is 1 customer who bought the APPLE TV
1. 1 appeaing at the intersection of APPLE TV (row labels) and MACBOOK AIR (column labels) represents that the 1 customer who bought the APPLE TV also bought the MACBOOK AIR
2. 4 appearing at the intersection of APPLE WATCH (row labels) and APPLE WATCH (column labels) represents that there are 4 customers who bought the APPLE WATCH
1. 1 appearing in other columns of the same row represents other products which those customers bought
2. When one right click's on APPLE WATCH and selects "Drill down", one will be able to see the Customers who bought the other products as well.
1. Customer 53 bought the APPLE WATCH, AIRPORT and IPHONE 8S.  Customer 123 bought APPLE WATCH, IPHONE X and IPOD
2. Customers 2442 and 428 did not buy any other product

Interpretation of Second Table

This table shows a list of Customers (and their transaction details) who bought only and only that one product selected by the user in the filter section (see the red oval selection in the image).  So these two customers could be approached for selling more products to.

Refer to this simple Sales dataset

The objective is to create a simple matrix with months in the row labels, years in the column labels and sales figures in the value area section.  The twist in the question is that for years prior to the current year (2018 in this dataset), sales should only appear till the month for which there is data for the current year.  For e.g., for 2018, data is only till Month 4 and therefore for prior years as well, data should only appear till Month 4.  As and when Sales data gets added below row 17, data for prior years should also go up to that month.

The expected result is

You may download my PBI file from here. The same solution can be obtained in Excel as well (using Power Query and PowerPivot).

This table contains a list of all the inspections created and completed within different time periods.

The objective is to create two Tables from this single table - one showing the Accounts created within the chosen time period and another showing the those that were closed within the same time period.  Here are screenshots of the expected results.

You may download my PowerBI desktop solution workbook from here.  The same solution can be obtained in Excel as well (using Power Query and PowerPivot).

Consider a simple dataset as shown below:

For each tier, the tier rate is incrementally applied to the volume within the tier volume range.  Given the following transaction volumes, one may want to compute the transaction fee

The expected result is shown below

As one can observe, for a transaction value of 400,000, the fee has been computed as 3% on the first 100,000 and 2.5% on the next 300,000. You may download my solution workbook from here. In the file, I have shared 2 solutions - a conventional formula based one and a PowerPivot solution.

I have also solved a similar question here.

Visualise a 5 column dataset as show below.  This is a very small sample of the actual dataset.  It shows the date on which supplies were received for each item from Vendors and whether those supplies had errors in them.  Finally those identified errors have been bucketed into relevant categories.  The Item ID# is a code assigned by the Client Organisation.  The format of the ID# is X [Last two digits of year of receipt] [Four digit Number].  For items received from a certain vendor, the last four digits of any item will always be in ascending order of the Date received.  In other words. for items received from a certain vendor, the last 4 digits of an item received on 13 January 2017 will always be greater than the last 4 digits of an item received on 11 January 2017 from the same vendor.  There will never be repetitions in the Item ID# column.

 Vendor Name Item ID# Date Received Error with Item? Error Category California X170016 1-16-2017 No California X170014 1-13-2017 Yes Labeling Error California X170015 1-13-2017 Yes Packaging Error California X170008 1-9-2017 Yes Quality Issue California X170003 1-2-2017 No California X160645 12-26-2016 Yes Packaging Error California X160646 12-26-2016 No California X160644 12-25-2016 Yes Labeling Error California X160638 12-20-2016 Yes Quality Issue California X160633 12-15-2016 No California X160626 12-8-2016 No California X160625 12-7-2016 Yes Packaging Error California X160624 12-5-2016 Yes Labeling Error California X160618 11-23-2016 Yes Quality Issue California X160613 11-13-2016 No California X160606 10-30-2016 No

The objective is to compute the error rate by vendor and Error category for the 10 most recent transaction dates with that specific vendor.  So, for vendor Name "California" and  Error category as "Packing Error", this ratio should be computed as = Number of packing Errors on 10 most recent dates/10.

Here is a simple snapshot of the Data for California.  I have filtered the dataset where Vendor Name is California and then sorted the Date received column in descending order.  Please note that when i filter the dataset on California, a lot more rows are returned.  I am only showing the Top 10 rows here because that is what is important for solving this question.

 Vendor Name Item ID# Date Received Error with Item? Error Category California X170016 1-16-2017 No California X170014 1-13-2017 Yes Labeling Error California X170015 1-13-2017 Yes Packaging Error California X170008 1-9-2017 Yes Quality Issue California X170003 1-2-2017 No California X160645 12-26-2016 Yes Packaging Error California X160646 12-26-2016 No California X160644 12-25-2016 Yes Labeling Error California X160638 12-20-2016 Yes Quality Issue California X160633 12-15-2016 No

The expected result is:

 Row Labels Labeling Error Packaging Error Quality Issue Factory Error Boise 30.00% California 20.00% 20.00% 20.00% Detroit 70.00% 30.00% Ekalaka 20.00% 20.00%

I have solved this problem using the PowerPivot. You may download my solution workbook from here.

Here is a dataset showing Project wise forecast of open opportunities.

1. Topic is the Project Name
2. Est. Close Date is the date by when the opportunity would be closed i.e. the project would be won from that Client
3. Duration is the time (in months) for which the project would run
4. Amount is the total amount that would be billed for that project

Clients are invoiced annually only. So in the example below:

1. Project ABC is for US\$1 million with a duration of 24 months and is expected to be closed in Oct. 2017.  We need to model the data to show the billing every 12 months.  So for ABC US\$500K would be billed in Oct-2017 and another US\$500K in Oct-2018.
2. Project GEF is for US\$2 million with a duration of 18 months and is expected to be closed in Feb. 2018. We need to model the data to show US\$1.3 million in Feb-2018 and another US\$666K in Feb-2019.  The monthly billing is US\$2 million divided by 18 and then multiplied by 12 - this amounts to US\$1.3 million.
 Topic Est. Close Date Duration (Months) Amount ABC 01-10-2017 24 1,000,000 GEF 01-02-2018 18 2,000,000 XYZ 01-03-2018 30 1,000,000

The expected result should look like this:

 Row Labels Oct-17 Feb-18 Mar-18 Oct-18 Feb-19 Mar-19 Mar-20 Total ABC 500,000 500,000 1,000,000 GEF 1,333,333 666,667 2,000,000 XYZ 400,000 400,000 200,000 1,000,000 Grand Total 500,000 1,333,333 400,000 500,000 666,667 400,000 200,000 4,000,000

I have solved this problem using Power Query and PowerPivot. You may download my solution workbook from here.

Imagine a fixed monthly amount due to an Organisation for services rendered to various customers.  While an invoice is raised every month by this Organisation, not all pay up the dues on time.  For unpaid dues, the Organisation charges its client interest ranging from 3% to 9% per annum.  The objective is to determine cumulative interest payable by various customers to Organisation X.

The base data looks like this

 Client Monthly revenue Int. calculation start date Int. calculation end date Interest rate Client A 33,967 01-Aug-16 25-Jul-17 9.00% Client B 123 12-Sep-16 30-Nov-17 4.00%

Given the dataset above, the total interest payable by Client A is Rs. 16,237.20.  The calculation is shown below:

 From To Days for which interest should be paid Principal Interest 02-Aug-16 31-Aug-16 328.00 33,967.00 2,745.26 01-Sep-16 30-Sep-16 298.00 33,967.00 2,494.17 01-Oct-16 31-Oct-16 267.00 33,967.00 2,234.71 01-Nov-16 30-Nov-16 237.00 33,967.00 1,983.62 01-Dec-16 31-Dec-16 206.00 33,967.00 1,724.16 01-Jan-17 31-Jan-17 175.00 33,967.00 1,464.70 01-Feb-17 28-Feb-17 147.00 33,967.00 1,230.34 01-Mar-17 31-Mar-17 116.00 33,967.00 970.88 01-Apr-17 30-Apr-17 86.00 33,967.00 719.79 01-May-17 31-May-17 55.00 33,967.00 460.33 01-Jun-17 30-Jun-17 25.00 33,967.00 209.24 01-Jul-17 25-Jul-17 - 33,967.00 - Total 16,237.20

You may download my solution workbook with from here. I have solved this problem using normal Excel formulas and the PowerPivot.

Here is a sample dataset of learners who have cleared different stages of multiple courses on offer within an Organisation:

 Learner Stage completed Course Bill Stage 1 Public Speaking Bill Stage 2 Public Speaking Bill Stage 3 Public Speaking Susan Stage 1 Effective Communication Bob Stage 1 Public Speaking Bob Stage 2 Public Speaking Sheila Stage 1 Effective Communication Sheila Stage 2 Effective Communication Sheila Stage 3 Effective Communication Frank Stage 1 Effective Communication Frank Stage 2 Effective Communication Henry Stage 1 Public Speaking Henry Stage 2 Public Speaking Bill Stage 1 Effective Communication Bill Stage 2 Effective Communication

From this sample dataset, one may want to know how many participants have completed each stage of these multiple courses.  The expected result is shown below:

 Row Labels Stage 1 Stage 2 Stage 3 Effective Communication 1 2 1 Public Speaking 2 1 Grand Total 1 3 2

In this workbook, I have shared 2 solutions - one using formulas and the other using the Power Query & PowerPivot.

Hi,

Assume a column of names as follows:

Name
Mohammed Zia-Ul Haque
Steven Thomas -
,-Rohit Sunil Ahir-Chowdhary.-
Anuj -----------
Sameer --
..,Mohit --
Rajeev Nair.
Monalisa . Das
Vijeta ...
--,.Anjana. M.U..,-

Please observe that there are special characters before the name, within the name and after the name.  The task is to remove special characters before and after the name.  The expected result is shown below:

Expected Result
Mohammed Zia-Ul Haque
Steven Thomas
Rohit Sunil Ahir-Chowdhary
Anuj
Sameer
Mohit
Rajeev Nair
Monalisa . Das
Vijeta
Anjana. M.U

The array formula (Ctrl+Shift+Enter) to make this work is

=MID(A2,MIN(SEARCH(CHAR(ROW(\$A\$65:\$A\$90)),A2&CHAR(ROW(\$A\$65:\$A\$90)))),LOOKUP(2,1/((CODE(MID(UPPER(A2),ROW(INDIRECT("1:"&LEN(A2))),1))>=65)*(CODE(MID(UPPER(A2),ROW(INDIRECT("1:"&LEN(A2))),1))<=90)),ROW(INDIRECT("1:"&LEN(A2))))-(MIN(SEARCH(CHAR(ROW(\$A\$65:\$A\$90)),A2&CHAR(ROW(\$A\$65:\$A\$90)))))+1)

I have solved a similar problem at this link as well but that requires the usage of an add-in.  This is so because the special characters and numbers need to be removed from within the string as well.  In other words, everything except letters need to be removed from the alphanumeric string (no matter where the numbers and special characters are - beginning, middle or at the end).

Scenario 1

In a Stock Broking business (Firm), Customers deposit margin money through cheques. Although cheques are received from customers at branches, clearing (realization) of the same normally takes 3-4 days.  On one hand, the Firm wishes to allow trading to the customers against the deposited cheques from the day it receives the cheque, on other hand there is a risk associated with this  i.e cheque may get dishonored and the Firm may incur huge financial losses because of allowing customers to trade in anticipation of the cheque realization. Hence, there is a trade off between customer centricity & Financial Risk management.

The Firm may decide to take a calculated risk in order to retain its customer centric approach (it is a delightful experience for customers if they get trading benefit against the deposited cheque instantly before realization). The Firm may decide to give the benefit of cheque which are below a set threshold amount (say 50,000) instantly and for cheques which are above threshold amount the credit will be given only after realization. Sometimes there are  multiple unrealized cheques deposited by one client but the Firm needs to cap the total benefit of unrealized cheques to 50,000 (partial benefit against any cheque is not possible i.e. either a cheque will be considered in full or not considered) E.g. if a customer deposits 3 cheques of 40,000 each then also the total instant benefit against all unrealized cheques shall be capped at 50,000 - hence only one cheques of 40,000 will be credited to customers ledger before realization.

Now the question here is with respect to selection of cheques to give best results i.e. maximum total below 50,000.  Consider the following example:

Cheque1: 55,000
Cheque2: 44,000
Cheque3: 28,000
Cheque4: 7,000
Cheque5: 6,500
Cheque6: 4,000
Cheque7: 3,000
Cheque8: 500

The Firm wants to identify the "Combination of cheques" which gives the maximum benefit to the Customer subject to a limit of 50,000.  The result should be cheques 3-8.  They add up to 49,000.

Here's another example:

Cheque1: 55,000
Cheque2: 45,000
Cheque3: 25,000
Cheque4: 5,000

The result here should be cheque 2 and 4.  They add up to 50,000.

Here are my thoughts:

1. I have assumed that the total number of cheques received from customer will be capped to 8.  This limit can be relaxed but processing will then take time.

2. Create all possible combinations of cheques which can be credited.  So, since no partial credit is allowed, the cheque can either be accepted in full or be rejected.  Hence the possible values can be 2 i.e. 1 (Accepted) or 0 (Rejected).

3. So if there are 8 cheques received from a certain customer, then there will be 2^8 i.e. 256 possible combinations.  This can also be computed by using the =PERMUTATIONA(2,8) formula.  I'd like to than Saurabh Gupta for sharing the formula to generate all these combinations.

4. After creating all possible combinations, add the cheque amount for each combination.

5. Scan this total column and highlight the one which is the largest value <= the threshold value i.e. 50,000.

You may refer to my solution in the Solution worksheet of this workbook.

Scenario 2

Here's another example.  Say, one has a list of numbers as shown below:

1072.94
1643.02
1248.09
2167.34
276.85
322.05
910.22
1146.39
56.50

Now one may want to know which combination of numbers adds up to or comes closest to 5115.52 (our lookup value).  If none of the combinations adds up exactly to the lookup value, then return the combination which adds to the largest value less then equal to the lookup value.  In the example above, the second, third and fourth add up to 5114.95.

You may find my solution in this workbook.