How to find the greatest value per ID in Microsoft Excel Power Query?-CodePudding

JOB_ID	DAY_ID
430	120
430	720
530	120
530	620
630	120
630	120
630	620
630	620

Should become:

JOB_ID	DAY_ID
430	720
530	620
630	620
630	620

Where only the rows where day_id is the maximum value is kept.

I'm currently using Excel's Power Query, so Python and R scripts are not available.

CodePudding user response：

To keep the duplicates at the end, you have to fiddle around with the M. Here you go.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjE2UFDSUTI0MlCK1YFzzaFcU1RZKNcMyjVDlcXONcPCjQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"JOB_ID " = _t, DAY_ID = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"JOB_ID ", Int64.Type}, {"DAY_ID", Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"JOB_ID "}, {{"All", each 

    let a = List.Max(_[DAY_ID]),
    b =  List.Select(_[DAY_ID], each _ =   a)
    in b

    , type list}}),
    #"Expanded All" = Table.ExpandListColumn(#"Grouped Rows", "All")
in
    #"Expanded All"

CodePudding user response：

You may want to group by Job_ID and choose to aggregate max Day_ID

edit: (see below comment) matching against original list allows to keep duplicates of Job_ID with same max Day_ID