Home > Net >  Keep only row with lowest value for two column for each ID
Keep only row with lowest value for two column for each ID

Time:01-06

For each ID find the lowest value in column C and D and delete other rows.

I want solution only because I want to build-up on this code to add more criteria.



Criteria is:

Each ID find the lowest value1 and keep that entire row also for each ID find lowest value2 and keep that entire row.

if both lowest value for value1 and value2 are in the same row then delete other rows.

enter image description here

I have tried modifying the code below but it only works for 1 column duplicate.

One solution I was thinking is to highlight entire row with lowest value for each ID for Value1 and Value2 then delete the unhighlighted rows.

Sub test()
ActiveSheet.Range("A:D").RemoveDuplicates Columns:=Array(3, 4), Header:=xlYes
End Sub


CodePudding user response:

Okay So I HAVE found a solution... but it might be more complicated than doing it manually. That depends on your data size and how frequently you need to do this.

0) Remove Full Line Duplicates:
Filter for All enter image description here

1) Sum your Value 1 and value 2:
enter image description here

Then, Sort by V1 V2, then re-sort by ID number.
This will make sure we only return 1, of there are two single minimums.
(Example at end of answer)

2) Check if Value 1 and Value 2 are the lowest value for ID number:
=C2=MIN(FILTER($C$2:$C$24,$A$2:$A$24=A2))
=D2=MIN(FILTER($D$2:$D$24,$A$2:$A$24=A2))
enter image description here
enter image description here

3) Ran AND() operation to see if both are min. enter image description here

4) Run OR() operation to see if either are min.
=IF(SUM(FILTER($H$2:$H$24,$A$2:$A$24=A2))>0,0,IF(SUM(FILTER($I$1:I1,$A$2:A2=A2))<2,IF(OR(F2:G2),1,0),0))
enter image description here

5) Finally, filter list for items that have either AND or and OR :
=FILTER(A2:D24,(H2:H24=1) (I2:I24=1))
enter image description here


To note, if you do not sort, it may pull the wrong ones. This is sorted:
enter image description here
This is unsorted:
enter image description here

CodePudding user response:

Again, this is something that can also be solved in Power Query.

To use Power Query

  • Select some cell in your Data Table
  • Data => Get&Transform => from Table/Range
  • When the PQ Editor opens: Home => Advanced Editor
  • Make note of the Table Name in Line 2
  • Paste the M Code below in place of what you see
  • Change the Table name in line 2 back to what was generated originally.
  • Read the comments and explore the Applied Steps to understand the algorithm

M Code

let

//change next line to reflect actual data source
    Source = Excel.CurrentWorkbook(){[Name="Table33"]}[Content],

//set data types
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Name", type text}, {"Value1", type number}}),

//Group by ID and Name
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID", "Name"}, {
        {"Values", (t)=>  
            let
                #"Min V1 Rows" = List.PositionOf(t[Value1], List.Min(t[Value1]), Occurrence.All),
                #"Min V2 Rows" = List.PositionOf(t[Value2], List.Min(t[Value2]), Occurrence.All),

/*If Rows are the same for both Min Value1 and Min Value2, then #"Same Row" will be non null
     so we can just return a single row
  If Intersect List is null, then we need to return separate table rows for Value1 and Value2*/
                #"Same Row" = List.Intersect({#"Min V1 Rows", #"Min V2 Rows"})            
            in 
                if List.Count(#"Same Row") > 0
                    then Table.FromRecords({t{#"Same Row"{0}}}, {"Value1","Value2"})
                else Table.FromRecords({t{#"Min V1 Rows"{0}}, t{#"Min V2 Rows"{0}}}, {"Value1","Value2"})
                ,type table[Value1=number, Value2=number]}}),
    #"Expanded Values" = Table.ExpandTableColumn(#"Grouped Rows", "Values", {"Value1", "Value2"})
in
    #"Expanded Values"

enter image description here

  • Related