I have a dataframe that currently looks like this:
subjectID | Trial |
---|---|
1 | 3 |
1 | 3 |
1 | 3 |
1 | 4 |
1 | 4 |
1 | 5 |
1 | 5 |
1 | 5 |
2 | 1 |
2 | 1 |
2 | 3 |
2 | 3 |
2 | 3 |
2 | 5 |
2 | 5 |
2 | 6 |
3 | 1 |
Etc., where trial number is nested under subject ID. I need to add a new column, "NewTrial", that simply gives the order in which the trials now appear within each subject. For example:
subjectID | Trial | NewTrial |
---|---|---|
1 | 3 | 1 |
1 | 3 | 1 |
1 | 3 | 1 |
1 | 4 | 2 |
1 | 4 | 2 |
1 | 5 | 3 |
1 | 5 | 3 |
1 | 5 | 3 |
2 | 1 | 1 |
2 | 1 | 1 |
2 | 3 | 2 |
2 | 3 | 2 |
2 | 3 | 2 |
2 | 5 | 3 |
2 | 5 | 3 |
2 | 6 | 4 |
3 | 1 | 1 |
So far, I have a for-loop written that looks like this:
for (myperson in unique(data$subjectID)){
  # This line creates a vector of the number of unique trials per subject: for subject 1, c(1, 2, 3)
  triallength = 1:length(unique(data$Trial[data$subjectID == myperson]))
}
I'm having trouble now finding a way to paste the numbers from the created triallength vector as a column in the dataframe. Does anyone know of a way to accomplish this? I am lacking some experience with for-loops and hoping to gain more. If anyone has a tidyverse/dplyr solution, however, I am open to that as well as an alternative to a for-loop. Thanks in advance, and let me know if any clarification is needed!
CodePudding user response:
We could use match on the unique values after grouping by 'subjectID':
library(dplyr)
df1 <- df1 %>%
  group_by(subjectID) %>%
  mutate(NewTrial = match(Trial, unique(Trial))) %>%
  ungroup()
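Since the question also asks about the for-loop route, the same match idea can be applied one subject at a time. This is only a minimal sketch: it assumes the data frame is called data, as in the question's loop, and rebuilds the example data so it runs on its own.

```r
# Example data from the question (the name `data` follows the asker's loop)
data <- data.frame(
  subjectID = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3),
  Trial     = c(3, 3, 3, 4, 4, 5, 5, 5, 1, 1, 3, 3, 3, 5, 5, 6, 1)
)

# Placeholder column, filled in subject by subject
data$NewTrial <- NA_integer_

for (myperson in unique(data$subjectID)) {
  # Logical index of the rows belonging to this subject
  rows <- data$subjectID == myperson
  # Position of each trial among this subject's unique trials,
  # in order of first appearance
  data$NewTrial[rows] <- match(data$Trial[rows], unique(data$Trial[rows]))
}

data$NewTrial
```

The loop writes straight into the rows of the current subject, so there is no need to build a separate triallength vector and paste it in afterwards.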
CodePudding user response:
Converting to factor with unique values as levels, then as.numeric in an ave should be nice.
transform(dat, NewTrial=ave(Trial, subjectID, FUN=\(x) as.numeric(factor(x, levels=unique(x)))))
# subjectID Trial NewTrial
# 1 1 3 1
# 2 1 3 1
# 3 1 3 1
# 4 1 4 2
# 5 1 4 2
# 6 1 5 3
# 7 1 5 3
# 8 1 5 3
# 9 2 1 1
# 10 2 1 1
# 11 2 3 2
# 12 2 3 2
# 13 2 3 2
# 14 2 5 3
# 15 2 5 3
# 16 2 6 4
# 17 3 1 1
Data:
dat <- structure(list(subjectID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L), Trial = c(3L, 3L, 3L, 4L,
4L, 5L, 5L, 5L, 1L, 1L, 3L, 3L, 3L, 5L, 5L, 6L, 1L)), class = "data.frame", row.names = c(NA,
-17L))
CodePudding user response:
We could use rleid:
library(dplyr)
library(data.table)
df %>%
  group_by(subjectID) %>%
  mutate(NewTrial = rleid(subjectID, Trial))
subjectID Trial NewTrial
<int> <int> <int>
1 1 3 1
2 1 3 1
3 1 3 1
4 1 4 2
5 1 4 2
6 1 5 3
7 1 5 3
8 1 5 3
9 2 1 1
10 2 1 1
11 2 3 2
12 2 3 2
13 2 3 2
14 2 5 3
15 2 5 3
16 2 6 4
17 3 1 1
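One thing that may be worth checking before picking an approach: numbering by runs and numbering by first appearance agree on the example data, but they would differ if a trial number came back later for the same subject. The sketch below uses a made-up vector x (not from the question) and a base-R stand-in for the run-based numbering, so it does not need data.table.

```r
# Hypothetical trial sequence for one subject in which trial 3 comes back later
x <- c(3, 3, 4, 3, 5)

# Order of first appearance: the second run of 3 reuses the number 1
first_seen <- match(x, unique(x))

# Run-based numbering (base-R stand-in for rleid on a single vector):
# every change in value starts a new number
run_id <- cumsum(c(TRUE, x[-1] != x[-length(x)]))

first_seen  # 1 1 2 1 3
run_id      # 1 1 2 3 4
```

If trials never repeat non-consecutively within a subject, as in the data shown, either numbering gives the same NewTrial column.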