Home > Enterprise >  How to consolidate duplicates and take min/max values from other columns?
How to consolidate duplicates and take min/max values from other columns?

Time:07-16

I want to consolidate duplicates in one column and take minimum/maximum from two other columns: For example, from this file:

100627108.00    100636806.00    x
100627109.00    100637104.00    x
100632063.00    100637104.00    x
100632541.00    100636689.00    x
100633442.00    100639991.00    x
27612064.00 27635185.00 y
27612292.00 27626240.00 y
27612300.00 27624062.00 y
27612669.00 27626569.00 y
27615514.00 27626135.00 y

I want to consolidate duplicates in the third column (x and y) and take corresponding minimum value from first column and maximum value from second column for each set of duplicates to get something like:

100627108.00    100639991.00    x
27612064.00 27635185.00 y

I have thousands of these in a file.

Thanks

CodePudding user response:

Assume the names for the columns are V1, V2, V3 then we can do:

df %>%
   group_by(V3)%>%
   summarise(mn = min(V1), mx=max(V2))

# A tibble: 2 x 3
  V3           mn        mx
  <chr>     <dbl>     <dbl>
1 x     100627108 100639991
2 y      27612064  27635185
  • Related