Create unique row values in new column based on matching criteria in R

Time:05-06

I have a dataframe with one identifier column of unique values, and one column which contains specific criteria.

I want to create a new identifier column of unique values, but where the value also contains information about which criteria it meets. In the example below, I have used case_when() and seq_along() to accomplish this:

library(dplyr)

set.seed(1)
df <- data.frame(
    ID = LETTERS[1:10],
    Criteria = paste0("Crit ", floor(runif(10, min=1, max=4)))
)
df %>%
    mutate(
        ID2 = case_when(
            Criteria == "Crit 1" ~ paste0("x", seq_along(Criteria)),
            Criteria == "Crit 2" ~ paste0("y", seq_along(Criteria)),
            Criteria == "Crit 3" ~ paste0("z", seq_along(Criteria))
        )
    )

Output:

A data.frame: 10 × 3
ID  Criteria ID2
A   Crit 1   x1
B   Crit 2   y2
C   Crit 2   y3
D   Crit 3   z4
E   Crit 1   x5
F   Crit 3   z6
G   Crit 3   z7
H   Crit 2   y8
I   Crit 2   y9
J   Crit 1   x10

The new column, ID2, now has row values that are both unique (numbers 1 to 10) and identify the criterion (letters x, y and z). However, seq_along(Criteria) runs over the whole column, so the number is simply the row position, regardless of criterion. I'd rather have the count start over at one for each criterion (e.g. for "Crit 1": x1, x2, x3, ..., xn; for "Crit 2": y1, y2, y3, ..., ym; etc.).

What I want:

A data.frame: 10 × 3
ID  Criteria ID2
A   Crit 1   x1
B   Crit 2   y1
C   Crit 2   y2
D   Crit 3   z1
E   Crit 1   x2
F   Crit 3   z2
G   Crit 3   z3
H   Crit 2   y3
I   Crit 2   y4
J   Crit 1   x3

CodePudding user response:

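You can just add group_by(Criteria). In a grouped mutate(), seq_along(Criteria) is evaluated separately for each group, so the count restarts at 1 for each criterion: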

library(dplyr)

df %>%
  group_by(Criteria) %>%
  mutate(
    ID2 = case_when(
      Criteria == "Crit 1" ~ paste0("x", seq_along(Criteria)),
      Criteria == "Crit 2" ~ paste0("y", seq_along(Criteria)),
      Criteria == "Crit 3" ~ paste0("z", seq_along(Criteria))
    )
  )

Output:

# A tibble: 10 × 3
# Groups:   Criteria [3]
   ID    Criteria ID2  
   <chr> <chr>    <chr>
 1 A     Crit 1   x1   
 2 B     Crit 2   y1   
 3 C     Crit 2   y2   
 4 D     Crit 3   z1   
 5 E     Crit 1   x2   
 6 F     Crit 3   z2   
 7 G     Crit 3   z3   
 8 H     Crit 2   y3   
 9 I     Crit 2   y4   
10 J     Crit 1   x3 
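
If the list of criteria grows, repeating one case_when() branch per criterion gets tedious. A possible variation, as a sketch only: keep the grouping idea but take the prefix from a named lookup vector and number the rows with row_number(). The names prefix, result and base_ID2 are made up for this illustration and are not part of the question; the ave() line is a base R cross-check of the same per-group counter.

```r
library(dplyr)

# Same example data as in the question
set.seed(1)
df <- data.frame(
  ID = LETTERS[1:10],
  Criteria = paste0("Crit ", floor(runif(10, min = 1, max = 4)))
)

# Hypothetical lookup vector: criterion -> prefix letter (an assumption for this sketch)
prefix <- c("Crit 1" = "x", "Crit 2" = "y", "Crit 3" = "z")

result <- df %>%
  group_by(Criteria) %>%
  # row_number() restarts at 1 within each Criteria group
  mutate(ID2 = paste0(unname(prefix[Criteria]), row_number())) %>%
  # drop the grouping so later steps operate on the whole table again
  ungroup()

# Base R cross-check: ave() applies seq_along() within each Criteria group
base_ID2 <- paste0(
  unname(prefix[df$Criteria]),
  ave(seq_along(df$Criteria), df$Criteria, FUN = seq_along)
)

print(result)
```

The ungroup() at the end is optional, but without it the result stays grouped by Criteria (as the "# Groups:" line in the output above shows), which can affect later mutate() or summarise() calls. A criterion missing from the lookup vector would get the prefix "NA" here, so the vector has to cover every value that occurs.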