Home > front end >  ArrayFormula which returns the second last matching
ArrayFormula which returns the second last matching

Time:10-22

I have a table which collects daily readings of a total score from many different players. Since it's manual collection via form it may be that some players will add their reading more than once a day, and also can be a day or more without any reading at all. The structure is very basic 3 columns (Date, Player, Total).

I'm looking for an ArrayFormula that will automatically filling in a 4th column with the daily score of the specific player. This can achieve by a formula that finds the second-last reading of the specific player and subtract it from its last/current reading.

Date Player Total Daily
17/10/2021 Player 001 1500 1500
17/10/2021 Player 007 700 700
19/10/2021 Player 003 700 700
19/10/2021 Player 005 100 100
19/10/2021 Player 004 1100 1100
19/10/2021 Player 006 300 300
19/10/2021 Player 002 900 900
20/10/2021 Player 006 900 600
20/10/2021 Player 006 1600 700
20/10/2021 Player 002 1100 200
20/10/2021 Player 005 600 500
20/10/2021 Player 009 200 200
21/10/2021 Player 001 1600 100
21/10/2021 Player 003 1000 300

I found a very interesting solution, but since it's based on INDIRECT it can't work with ArrayFormula: enter image description here

CodePudding user response:

I'll offer a tentative solution, with the understanding that it's always difficult to write such a formula without the ability to see some actual data and the expected result.

Let's say your data is in A2:C (with headers in A1:C1). Try the following formula in D2 of an otherwise empty Col D:

=ArrayFormula(IF(A2:A="",,C2:C - (VLOOKUP(B2:B&(A2:A-1), SORT({ {"", 0}; {B2:B&A2:A, C2:C} }), 2, TRUE) * (VLOOKUP(B2:B&(A2:A-1), SORT({ {"", 0}; {B2:B&A2:A, B2:B} }), 2, TRUE) = B2:B))))

To find the second-to-last score per player, VLOOKUP looks up a concatenation of each row's player-and-"yesterday" within a SORTed virtual range containing A.) {null, 0} on top of B.) {a concatenation of each row's player-and-date, score}.

Because of the SORT, a final parameter of TRUE can be used, which means that if an exact match for player-and-"yesterday" is not found, the closest previous match will be returned. The * VLOOKUP(...) is there to make sure the previous match is for the same person (because the alphabetical entry prior to each person's earliest date will be someone else's last date, except for the first person alphabetically, who will bounce back to the {null, 0}).

However, if your sheet will always have at least one blank row below your data, you can simplify a bit:

=ArrayFormula(IF(A2:A="",,C2:C - (VLOOKUP(B2:B&(A2:A-1), SORT({B2:B&A2:A, C2:C}), 2, TRUE) * (VLOOKUP(B2:B&(A2:A-1), SORT({B2:B&A2:A, B2:B}), 2, TRUE) = B2:B))))

This is because the bounce-back for the first alphabetical person's first date will find {null, null} for all blank rows, which is equivalent to {null, 0}, all of which will be SORTed earlier than all of your data. So we don't need to include it in the virtual array setup.

If the result is not as expected, please share a minimal set of realistic data with the expected results.

ADDENDUM (per additional comment from OP):

If a player may enter more than one score per day, you can use the formula versions below.

If you're not sure you'll always have at least one blank row below your data:

=ArrayFormula(IF(A2:A="",,C2:C - (VLOOKUP(B2:B&TEXT(ROW(B2:B)-1,"0000"), SORT({ {"", 0}; {B2:B&TEXT(ROW(B2:B),"0000"), C2:C} }), 2, TRUE) * (VLOOKUP(B2:B&TEXT(ROW(B2:B)-1,"0000"), SORT({ {"", 0}; {B2:B&TEXT(ROW(B2:B),"0000"), B2:B} }), 2, TRUE) = B2:B))))

If you are sure you will always have at least one blank row below your data:

=ArrayFormula(IF(A2:A="",,C2:C - (VLOOKUP(B2:B&TEXT(ROW(B2:B)-1,"0000"), SORT( {B2:B&TEXT(ROW(B2:B),"0000"), C2:C} ), 2, TRUE) * (VLOOKUP(B2:B&TEXT(ROW(B2:B)-1,"0000"), SORT( {B2:B&TEXT(ROW(B2:B),"0000"), B2:B} ), 2, TRUE) = B2:B))))

Both of the above substitute row number for date. They assume, then, that your data will always be entered in the order they occurred in real time, not randomly (i.e., that you will not enter an earlier date's score after a later date's score). If you will potentially enter things out of order, this can also be controlled for; but I haven't done so here.

  • Related