Get unique rows by one column that are chosen by condition in other column in R

Time:11-24

I have a data frame with several columns. I need one row per distinct value of one column (call it col1), and for each value I need the row with the smallest value in another column (e.g., col2). I know how to get distinct rows by a column (dplyr's distinct(col1)), but I'm not sure which row it keeps when several rows share the same col1 value, and I don't know how to control that. For example, given this data frame:

col1   col2
a      10  
b      12  
a      8
b      14
a      15
c      6
a      3

return the unique lines by col1 that have the least value in col2, i.e.:

col1   col2
a      3
b      12
c      6

CodePudding user response:

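You can use base R's aggregate(), which computes the minimum of col2 within each col1 group. Note that with more than two columns the formula . ~ col1 takes the minimum of every other column independently, so the result is not necessarily a row that exists in the original data.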

> aggregate(. ~ col1, df, min)
  col1 col2
1    a    3
2    b   12
3    c    6
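
If the data frame has further columns and the whole original row is needed, one base R option is to sort by col2 within col1 and keep the first row of each group. The following is a minimal sketch, not the answerer's method; the column col3 and its values are hypothetical, added only to show that complete rows are preserved.

```r
# Example data from the question, plus a hypothetical extra column col3
df <- data.frame(
  col1 = c("a", "b", "a", "b", "a", "c", "a"),
  col2 = c(10, 12, 8, 14, 15, 6, 3),
  col3 = c("r1", "r2", "r3", "r4", "r5", "r6", "r7")
)

# Sort by col1, then by col2 ascending, so the smallest col2
# of each group comes first within that group
sorted <- df[order(df$col1, df$col2), ]

# duplicated() flags every occurrence of a col1 value after the first,
# so negating it keeps exactly one row per group
result <- sorted[!duplicated(sorted$col1), ]
rownames(result) <- NULL

print(result)
```

Because duplicated() keeps only the first occurrence, this returns exactly one row per group even when the minimum is tied.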

CodePudding user response:

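With dplyr, group by the first column and keep only the rows whose value in the other column equals the group minimum. All other columns of those rows are kept. If the minimum is tied within a group, every tied row is returned:

library(dplyr)

df %>% group_by(col1) %>% filter(col2 == min(col2))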
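
When exactly one row per group is required even if the minimum is tied, slice_min() is an alternative. This is a sketch under the assumption that dplyr 1.0.0 or later is installed (slice_min() is not available in older versions); it reuses the data from the question.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  col1 = c("a", "b", "a", "b", "a", "c", "a"),
  col2 = c(10, 12, 8, 14, 15, 6, 3)
)

result <- df %>%
  group_by(col1) %>%
  # Keep the single row with the smallest col2 in each group;
  # with_ties = FALSE drops extra rows when the minimum is tied
  slice_min(col2, n = 1, with_ties = FALSE) %>%
  ungroup()

print(result)
```

Leaving with_ties at its default of TRUE would return all tied rows, matching the behaviour of the filter() version.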
