Home > Mobile >  Extract maximum (or last) values from a column based on the date in another column in Excel
Extract maximum (or last) values from a column based on the date in another column in Excel

Time:11-03

I have a huge excel file that I'm trying to sort by extracting the last value (column D) collected at the end of each day (column B). These data are coming from a rain gauge. Values in column D always increases during the day (but can also stay at 0) and goes back to 0 after midnight. Data look like this:

nb date mm avg
4669 23:39 23/9/2020 0.0 2.3
4670 23:44 23/9/2020 0.8 2.6
4671 23:49 23/9/2020 1.6 2.9
4672 23:54 23/9/2020 0.8 2.9
4673 23:59 23/9/2020 0.0 2.9
4674 0:04 24/9/2020 0.0 0.0
4675 0:09 24/9/2020 0.0 0.0
4769 23:39 24/9/2020 0.0 12.3
4770 23:44 24/9/2020 1.8 12.6
4771 23:49 24/9/2020 3.6 12.9
4772 23:54 24/9/2020 1.8 12.9
4773 23:59 24/9/2020 0.0 12.9
4774 0:04 25/9/2020 0.0 0.0
4775 0:09 25/9/2020 0.0 0.0

How can I do to simplify my table with only 1 value per day (last or maximum)?

Thank you

CodePudding user response:

I would put this through PowerQuery.

  1. Format as table
  2. Data > From Table/Range
  3. Change formatting on Column B from Date/Time to Date assuming this is recognised as a Date/Time (Home > Data type: Date)
  4. Order Column A, highest to lowest (Home > Sort > Z to A)
  5. Add an index column (Add Column > Index Column)
  6. Remove duplicates on Column B Date (Home > Remove Rows > Remove Duplicates)
  7. Delete the index column (Home > Remove Columns > Remove Column)

You now have one value, the latest update, for each date.

The reason for the index column is because if you don't do this, the data will not remove duplicates based on the order. This forces the cache to update and store the new order of highest to lowest in step 4.

e: data posted instead of picture so here is the actual PowerQuery of it;

let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"nb", Int64.Type}, {"date", type date}, {"mm", type number}, {"avg", type number}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"nb", Order.Descending}}),
    #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
    #"Removed Duplicates" = Table.Distinct(#"Added Index", {"date"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",{"Index"})
in
    #"Removed Columns"

To use this;

  1. Format as table
  2. Data > From Table/Range
  3. Home > Advanced Editor and paste in my code however,
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],

this line will change depending on what table this data is. So effectively just keep your line of this, not mine.

If this does not work due to the date not being a date, try the below code as it splits by space as a delimiter and then assigns the column as a date format;

let
    Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    #"Split Column by Delimiter" = Table.SplitColumn(Table.TransformColumnTypes(Source, {{"date", type text}}, "en-GB"), "date", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"date.1", "date.2"}),
    #"Renamed Columns" = Table.RenameColumns(#"Split Column by Delimiter",{{"date.1", "date"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"date", type date}, {"nb", Int64.Type}, {"mm", type number}, {"avg", type number}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"nb", Order.Descending}}),
    #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
    #"Removed Duplicates" = Table.Distinct(#"Added Index", {"date"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",{"Index", "date.2"})
in
    #"Removed Columns"

CodePudding user response:

I concur that PowerQuery could be a very nice go-to solution for you. But it's also possible with formulas, for example:

enter image description here

Formula in F2:

=FILTER(A2:D15,B2:B15=XLOOKUP(INT(B2:B15),INT(B2:B15),B2:B15,,0,-1))
  • Related