Assign value to column based on another character column

I have a data frame that looks like this:

df <- data.frame(subject = c("a1_1", "a1_1", "a1_1", "a1_1", "a1_2", "a1_2", "b1_1", "b1_1", "b1_1", "b1_1"), group = c(1, NA, NA, NA, NA, 1, NA, NA, 2, NA))

As you can see, only the first entry of every subject has a group assigned. My idea is to fill the blank entries of every subject with its group number (e.g. all a1_1 rows must have group 1).

Thanks for your help!

CodePudding user response：

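We may use match() against the unique subjects, which numbers each subject in order of first appearance (any existing group values are overwritten):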

df$group <- match(df$subject, unique(df$subject))

-output

> df
   subject group
1     a1_1     1
2     a1_1     1
3     a1_1     1
4     a1_1     1
5     a1_2     2
6     a1_2     2
7     b1_1     3
8     b1_1     3
9     b1_1     3
10    b1_1     3

data

df <- structure(list(subject = c("a1_1", "a1_1", "a1_1", "a1_1", "a1_2", 
"a1_2", "b1_1", "b1_1", "b1_1", "b1_1")), class = "data.frame", row.names = c(NA, 
-10L))
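
If the existing group values should be kept rather than renumbered, one possible alternative is to copy each subject's first non-missing group to all of its rows. This is only a sketch: the data frame df2 and its group values are made up for illustration, assuming the first row of each subject carries the group.

```r
# Hypothetical input: only the first row of each subject has a group value.
df2 <- data.frame(
  subject = c("a1_1", "a1_1", "a1_1", "a1_1", "a1_2", "a1_2",
              "b1_1", "b1_1", "b1_1", "b1_1"),
  group   = c(1, NA, NA, NA, 1, NA, 2, NA, NA, NA)
)

# ave() applies FUN to the group values of each subject and returns a
# vector aligned with the original rows. Here FUN picks the first
# non-NA value, which is then recycled across that subject's rows.
df2$group <- ave(df2$group, df2$subject,
                 FUN = function(g) g[!is.na(g)][1])
```

A subject with no non-missing group at all stays NA under this approach.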

CodePudding user response：

If your data is always structured the same way, you can pull the group assignment out of the subject information:

library(stringr)
df <- data.frame(subject = c(rep('a1_1', 4), rep('a1_2', 2), rep('b1_1',4)))
df$group <- str_sub(df$subject, -1)

str_sub(df$subject, -1) extracts the last character of each subject string and assigns it to group, assuming the group is always that final character.
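
The same idea can be sketched in base R, without stringr. The vector subject below is a made-up sample, and the second variant assumes the group is whatever follows the last underscore, which the original data does not confirm.

```r
# Hypothetical sample of subject labels.
subject <- c("a1_1", "a1_2", "b1_1")

# substring() with nchar() takes the last character of each string.
last_char <- substring(subject, nchar(subject), nchar(subject))

# The extracted value is a character string; convert it if a numeric
# group is needed.
group <- as.integer(last_char)

# Variant (assumption: the group is everything after the last "_"),
# which also copes with group numbers of more than one digit.
after_underscore <- sub(".*_", "", subject)
```

Note that this approach gives a1_1 and b1_1 the same group, whereas the match() approach gives every distinct subject its own number.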