Home > Enterprise >  Rank order data
Rank order data

Time:01-07

I have the loan dataset below -

Sector Total Units Bad units Bad Rate
Retail Trade 16 5 31%
Construction 500 1100 20%
Healthcare 165 55 33%
Mining 3 2 67%
Utilities 56 19 34%
Other 300 44 15%

How can I create a ranking function to sort this data based on the bad_rate while also accounting for the number of units ?

e.g This is the result when I sort in descending order based on bad_rate

Sector Total Units Bad units Bad Rate
Mining 3 2 67%
Utilities 56 19 34%
Healthcare 165 55 33%
Retail Trade 16 5 31%
Construction 500 1100 20%
Other 300 44 15%

Here, Mining shows up first but I don't really care about this sector as it only has a total of 3 units. I would like construction, other and healthcare to show up on the top as they have more # of total as well as bad units

CodePudding user response:

STEP 1) is easy... Use SORT("Range","ByColNumber","Order") Just put it in the top left cell of where you want your sorted data.

=SORT(B3:E8,4,-1):

enter image description here enter image description here


STEP 2)

Here's the tricky part... you need to decide how to weight the outage.
Here, I found multiplying the Rate% by the Total Unit Rank:

enter image description here

I think this approach gives pretty good results... you just need to play with the formula!
Please let me know what formula you eventually use!

CodePudding user response:

You would need to define sorting criteria, since you don't have a priority based on column, but instead a combination. I would suggest defining a function that weighs both columns: Total Units and Bad Rate. Using a weight function would be a good idea, but first, we would need to normalize both columns. For example put the data in a range 0-100, so we can weigh each column having similar values. Once you have the data normalized then you can use criteria like this:

w_1 * x   w_2 * y

This is the main idea. Now to put this information in Excel. We create an additional temporary variable with the previous calculation and name it crit. We Define a user LAMBDA function SORT_BY for calculating crit as follows:

LAMBDA(a,b, wu*a   wbr*b)

and we use MAP to calculate it with the normalized data. For convenience we define another user LAMBDA function to normalize the data: NORM as follows:

 LAMBDA(x, 100*(x-MIN(x))/(MAX(x) - MIN(x)))

Here is the formula:

=LET(wu, 0.6, wbr, 0.8, u, B2:B7, br, D2:D7, SORT_BY, LAMBDA(a,b, wu*a   wbr*b),
 NORM, LAMBDA(x, 100*(x-MIN(x))/(MAX(x) - MIN(x))),
 crit, MAP(NORM(u), NORM(br), LAMBDA(a,b, SORT_BY(a,b))),
 DROP(SORT(HSTACK(A2:D7, crit),5,-1),,-1))

You can customize how to weight each column (via wu for Total Units and wbr for Bad Rates columns). Finally, we present the result removing the sorting criteria (crit) via the DROP function. If you want to show it, then remove this step.

If you put the formula in F2 this would be the output: excel output

  • Related