Code values in new column based on whether values in another column are unique

Time:02-19

Given the following data, I would like to create a new column new_sequence based on this condition: if an id occurs only once, the new value should be 0. If an id occurs several times, the new value should be numbered according to the values in sequence.

library(dplyr)

dat <- tibble(id = c(1,2,3,3,3,4,4),
               sequence = c(1,1,1,2,3,1,2))

# A tibble: 7 x 2
     id sequence
  <dbl>    <dbl>
1     1        1
2     2        1
3     3        1
4     3        2
5     3        3
6     4        1
7     4        2

So, for the example data I am looking to produce the following output:

# A tibble: 7 x 3
     id sequence new_sequence
  <dbl>    <dbl>        <dbl>
1     1        1            0
2     2        1            0
3     3        1            1
4     3        2            2
5     3        3            3
6     4        1            1
7     4        2            2

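I tried the code below, but it does not work: !duplicated(id) is TRUE for the first occurrence of every id, so the first row of each id is coded as 0, including ids that appear several times.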

dat %>% mutate(new_sequence = ifelse(!duplicated(id), 0, sequence))
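To see why the attempt misfires, it may help to look at the logical vectors directly. The following is a minimal base R sketch, not part of the original question; the names first_seen, occurs_once, attempt and wanted are made up for illustration.

```r
id <- c(1, 2, 3, 3, 3, 4, 4)
sequence <- c(1, 1, 1, 2, 3, 1, 2)

# TRUE for the first occurrence of each id, FALSE for later repeats
first_seen <- !duplicated(id)

# TRUE only for ids that occur exactly once in the whole vector:
# a value is repeated if it is a duplicate scanning from either end
occurs_once <- !(duplicated(id) | duplicated(id, fromLast = TRUE))

attempt <- ifelse(first_seen, 0, sequence)   # 0 0 0 2 3 0 2
wanted  <- ifelse(occurs_once, 0, sequence)  # 0 0 1 2 3 1 2
```

Checking duplicated() from both directions is one possible way to flag ids that occur exactly once without counting group sizes.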

CodePudding user response:

Use dplyr::add_count() rather than !duplicated():

library(dplyr)

dat %>% 
  add_count(id) %>% 
  mutate(new_sequence = ifelse(n == 1, 0, sequence)) %>%
  select(!n)

Output:

# A tibble: 7 x 3
     id sequence new_sequence
  <dbl>    <dbl>        <dbl>
1     1        1            0
2     2        1            0
3     3        1            1
4     3        2            2
5     3        3            3
6     4        1            1
7     4        2            2
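For readers who want to see the counting step on its own, here is a rough base R sketch of the same idea. It uses ave() in place of add_count(), so it is a substitute technique rather than what the answer itself does; group_size is a hypothetical name, and a plain data.frame stands in for the tibble.

```r
dat <- data.frame(id = c(1, 2, 3, 3, 3, 4, 4),
                  sequence = c(1, 1, 1, 2, 3, 1, 2))

# ave() applies length() within each id and returns the
# group size once per row, much like the n column from add_count()
group_size <- ave(dat$sequence, dat$id, FUN = length)  # 1 1 3 3 3 2 2

# Rows whose id occurs once get 0, all others keep their sequence value
dat$new_sequence <- ifelse(group_size == 1, 0, dat$sequence)
```

Because group_size has one entry per row, ifelse() is appropriate here: the test and both outcomes all have the same length.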

CodePudding user response:

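You can also try the following. After grouping by id, check whether the number of rows in the group, n(), is 1. Use a plain if/else rather than ifelse() here: n() == 1 is a single TRUE or FALSE per group, and ifelse() returns a result the same length as its test, so it would yield only one value per group instead of the full sequence.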

dat %>%
  group_by(id) %>%
  mutate(new_sequence = if(n() == 1) 0 else sequence) %>%
  ungroup()  # drop the grouping so later steps operate on all rows

Output:

     id sequence new_sequence
  <dbl>    <dbl>        <dbl>
1     1        1            0
2     2        1            0
3     3        1            1
4     3        2            2
5     3        3            3
6     4        1            1
7     4        2            2
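The difference between ifelse() and if/else inside a group can be sketched in base R by looking at a single group in isolation. This is an illustrative example only; sequence_in_group, n_rows, from_ifelse and from_if_else are hypothetical names, and the vector mimics the rows for id 3.

```r
sequence_in_group <- c(1, 2, 3)       # the sequence values for id == 3
n_rows <- length(sequence_in_group)   # plays the role of n() in the group

# ifelse() returns a result shaped like its test, which has length 1 here,
# so only the first element of the "no" branch survives
from_ifelse <- ifelse(n_rows == 1, 0, sequence_in_group)         # 1

# if/else returns the selected branch as a whole, at its full length
from_if_else <- if (n_rows == 1) 0 else sequence_in_group        # 1 2 3
```

Inside mutate(), the length-one result from ifelse() would be recycled across the group, which is why every row for id 3 would end up as 1.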