Pasting values from a vector to a new column in a for loop with nested data

Time:08-30

I have a dataframe that currently looks like this:

subjectID Trial
1 3
1 3
1 3
1 4
1 4
1 5
1 5
1 5
2 1
2 1
2 3
2 3
2 3
2 5
2 5
2 6
3 1

Etc., where trial number is nested under subject ID. I need to add a new column, "NewTrial", that gives the order in which each trial appears within its subject. For example:

subjectID Trial NewTrial
1 3 1
1 3 1
1 3 1
1 4 2
1 4 2
1 5 3
1 5 3
1 5 3
2 1 1
2 1 1
2 3 2
2 3 2
2 3 2
2 5 3
2 5 3
2 6 4
3 1 1

So far, I have a for-loop written that looks like this:

for (myperson in unique(data$subjectID)) {

  # This line creates a vector of the number of unique trials per subject: for subject 1, c(1, 2, 3)
  triallength <- 1:length(unique(data$Trial[data$subjectID == myperson]))
}

I'm having trouble now finding a way to paste the numbers from the created triallength vector as a column in the dataframe. Does anyone know of a way to accomplish this? I am lacking some experience with for-loops and hoping to gain more. If anyone has a tidyverse/dplyr solution, however, I am open to that as well as an alternative to a for-loop. Thanks in advance, and let me know if any clarification is needed!

CodePudding user response:

We could use match on the unique values after grouping by 'subjectID'

library(dplyr)
df1 <- df1 %>% 
  group_by(subjectID) %>%
  mutate(NewTrial = match(Trial, unique(Trial))) %>%
  ungroup
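
Since the question also asks about the for-loop route, here is a minimal sketch of how the same match idea might finish that loop in base R. The helper names rows and trials are hypothetical, and the data frame is rebuilt from the sample rows in the question, so treat it as an illustration rather than the only way to do it.

```r
# Sample data copied from the question
data <- data.frame(
  subjectID = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3),
  Trial     = c(3, 3, 3, 4, 4, 5, 5, 5, 1, 1, 3, 3, 3, 5, 5, 6, 1)
)

# Create the column first so each iteration can fill in its own rows
data$NewTrial <- NA_integer_

for (myperson in unique(data$subjectID)) {
  rows   <- data$subjectID == myperson   # logical index for this subject (hypothetical name)
  trials <- data$Trial[rows]             # this subject's trial numbers (hypothetical name)
  # Position of each trial among the subject's unique trials, in order of appearance
  data$NewTrial[rows] <- match(trials, unique(trials))
}

data$NewTrial
#  [1] 1 1 1 2 2 3 3 3 1 1 2 2 2 3 3 4 1
```

The key step is assigning into data$NewTrial[rows], which writes the per-subject result back to exactly the rows it came from.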

CodePudding user response:

Converting to factor with unique values as levels, then as.numeric in an ave should be nice.

transform(dat, NewTrial=ave(Trial, subjectID, FUN=\(x) as.numeric(factor(x, levels=unique(x)))))
#    subjectID Trial NewTrial
# 1          1     3        1
# 2          1     3        1
# 3          1     3        1
# 4          1     4        2
# 5          1     4        2
# 6          1     5        3
# 7          1     5        3
# 8          1     5        3
# 9          2     1        1
# 10         2     1        1
# 11         2     3        2
# 12         2     3        2
# 13         2     3        2
# 14         2     5        3
# 15         2     5        3
# 16         2     6        4
# 17         3     1        1

Data:

dat <- structure(list(subjectID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L), Trial = c(3L, 3L, 3L, 4L, 
4L, 5L, 5L, 5L, 1L, 1L, 3L, 3L, 3L, 5L, 5L, 6L, 1L)), class = "data.frame", row.names = c(NA, 
-17L))
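
As a small illustration of why levels=unique(x) matters in the factor approach: by default factor() sorts its levels, so the codes reflect sorted order rather than order of appearance. The vector x below is made up for this sketch and is not part of the original data.

```r
# Hypothetical trial numbers where order of appearance differs from sorted order
x <- c(5, 3, 3, 4)

# Default levels are sorted (3, 4, 5), so 5 gets code 3
sorted_codes <- as.numeric(factor(x))
sorted_codes
# [1] 3 1 1 2

# Levels taken in order of first appearance (5, 3, 4), so 5 gets code 1
appearance_codes <- as.numeric(factor(x, levels = unique(x)))
appearance_codes
# [1] 1 2 2 3
```

With the sample data the two happen to agree, because each subject's trials already appear in increasing order.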

CodePudding user response:

We could use rleid:

library(dplyr)
library(data.table)
df %>% 
  group_by(subjectID) %>% 
  mutate(NewTrial = rleid(subjectID, Trial))
#    subjectID Trial NewTrial
#        <int> <int>    <int>
#  1         1     3        1
#  2         1     3        1
#  3         1     3        1
#  4         1     4        2
#  5         1     4        2
#  6         1     5        3
#  7         1     5        3
#  8         1     5        3
#  9         2     1        1
# 10         2     1        1
# 11         2     3        2
# 12         2     3        2
# 13         2     3        2
# 14         2     5        3
# 15         2     5        3
# 16         2     6        4
# 17         3     1        1
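
One thing that may be worth checking before choosing between these answers: run-based numbering and match-based numbering only agree when each trial value forms a single contiguous block within a subject. The sketch below uses a made-up vector and base R's rle() to mimic the run-length ids that rleid produces, so it runs without data.table.

```r
# Hypothetical case: trial 3 comes back after trial 4
trial <- c(3, 3, 4, 3)

# match() numbers by first appearance, so the returning 3 keeps its original number
by_first_appearance <- match(trial, unique(trial))
by_first_appearance
# [1] 1 1 2 1

# Run-length ids start a new number every time the value changes
runs   <- rle(trial)
by_run <- rep(seq_along(runs$lengths), runs$lengths)
by_run
# [1] 1 1 2 3
```

In the sample data every trial is contiguous within its subject, so both kinds of numbering give the same NewTrial column.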