R tidyverse: move certain rows to top or arrange by custom rules

I've tried different approaches and searched for similar questions, but had no luck.

I'd like to arrange and de-duplicate my data frame with a custom rule: keep only one row per group, the one with the smallest val. However, when a value of 1 is available in val for a group, I'd like to keep that row instead of the one with the smallest value.

val is the value column and ID is the ID column:

library(dplyr)

x <- data.frame(ID = c("a", "a",
                       "b", "b",
                       "c", "c",
                       "d", "d"),
                val = c(1, 2,
                        0.5, 2,
                        1, 0.5,
                        5, 20))

x looks like:

  ID  val
1  a  1.0
2  a  2.0
3  b  0.5
4  b  2.0
5  c  1.0
6  c  0.5
7  d  5.0
8  d 20.0

I tried something like:

x %>%
  group_by(ID) %>%
  arrange(val) %>%
  distinct(ID, .keep_all = TRUE) %>%
  arrange(ID)

and it gives me:

  ID      val
1 a       1  
2 b       0.5
3 c       0.5
4 d       5

I also tried slice_min():

x %>%
  group_by(ID) %>%
  slice_min(order_by = tibble(val != 1, val), n = 1, with_ties = FALSE) %>%
  ungroup()

and it gives me:

# A tibble: 3 × 2
  ID      val
  <chr> <dbl>
1 a         1
2 c         1
3 d         5
Warning messages:
1: In xtfrm.data.frame(x) : cannot xtfrm data frames
2: In xtfrm.data.frame(x) : cannot xtfrm data frames
3: In xtfrm.data.frame(x) : cannot xtfrm data frames
4: In xtfrm.data.frame(x) : cannot xtfrm data frames

Desired output:

  ID      val
1 a       1  
2 b       0.5
3 c       1
4 d       5

Answer:

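You can arrange by val != 1 and then by val, and use slice_head() on your grouped data. Because FALSE sorts before TRUE, rows where val equals 1 come first; the remaining rows follow in ascending order of val.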

x %>%
  group_by(ID) %>%
  arrange(val != 1, val) %>%
  slice_head(n = 1) %>%
  ungroup()

# A tibble: 4 × 2
  ID      val
  <chr> <dbl>
1 a       1  
2 b       0.5
3 c       1  
4 d       5
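
To see the sort key on its own, here is a minimal base R sketch of the same idea, without dplyr. The names key, sorted, and first_per_id are only illustrative, and the exact comparison val != 1 is assumed to be acceptable for these values:

x <- data.frame(
  ID  = c("a", "a", "b", "b", "c", "c", "d", "d"),
  val = c(1, 2, 0.5, 2, 1, 0.5, 5, 20)
)

# Sort by ID, then by "val is not 1" (FALSE before TRUE), then by val.
key <- order(x$ID, x$val != 1, x$val)
sorted <- x[key, ]

# Keep the first row of each ID in that order.
first_per_id <- sorted[!duplicated(sorted$ID), ]
rownames(first_per_id) <- NULL
first_per_id

For group c, the row with val 1 sorts ahead of the row with val 0.5 because its key is FALSE, so it is the one that survives the duplicated() filter.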

Or, using the development version of dplyr, you can use slice_min() with its order_by argument, which can take multiple variables via tibble():

x %>%
  group_by(ID) %>%
  slice_min(order_by = tibble(val != 1, val), n = 1, with_ties = FALSE) %>%
  ungroup()
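
The arrange() + distinct() attempt from the question can probably be adapted with the same val != 1 key as well, since distinct() with .keep_all = TRUE keeps the first row it encounters for each ID. A sketch, assuming dplyr is installed; res is just an illustrative name:

library(dplyr)

x <- data.frame(
  ID  = c("a", "a", "b", "b", "c", "c", "d", "d"),
  val = c(1, 2, 0.5, 2, 1, 0.5, 5, 20)
)

res <- x %>%
  # Within each ID: rows with val == 1 first, then ascending val.
  arrange(ID, val != 1, val) %>%
  # Keep the first row per ID, along with its other columns.
  distinct(ID, .keep_all = TRUE)

res

This variant does not need group_by(), because ID is part of the sort key and distinct() already works per ID.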