I have a longitudinal data where respondents recruited as cohort. Right now, I have year in which they took the survey. But I want to create a new column simply counting if it is the first, second, or third time a person took the survey.
Original Table
PersonID | SurveyYear | SurveyQ1Rating | SurveyQ2Rating | Gender |
---|---|---|---|---|
12 | 2013 | 5 | 4 | f |
12 | 2012 | 4 | 4 | f |
12 | 2010 | 3 | 3 | f |
2 | 2007 | 4 | 4 | m |
2 | 2008 | 3 | 3 | m |
2 | 2009 | 3 | 5 | m |
2 | 2010 | 5 | 5 | m |
2 | 2013 | 2 | 2 | m |
5 | 2013 | 4 | 4 | f |
5 | 2014 | 5 | 5 | f |
Target Table (Where I created a new col SurveytTime to mark the ith time one took the survey)
PersonID | SurveyYear | SurveyTime | SurveyQ1Rating | SurveyQ2Rating | Gender |
---|---|---|---|---|---|
12 | 2013 | 3 | 5 | 4 | f |
12 | 2012 | 2 | 4 | 4 | f |
12 | 2010 | 1 | 3 | 3 | f |
2 | 2007 | 1 | 4 | 4 | m |
2 | 2008 | 2 | 3 | 3 | m |
2 | 2009 | 3 | 3 | 5 | m |
2 | 2010 | 4 | 5 | 5 | m |
2 | 2013 | 5 | 2 | 2 | m |
5 | 2013 | 1 | 4 | 4 | f |
5 | 2014 | 2 | 5 | 5 | f |
CodePudding user response:
A base
solution:
df |>
transform(SurveyTime = ave(SurveyYear, PersonID, FUN = rank))
Its dplyr
equivalent:
library(dplyr)
df %>%
group_by(PersonID) %>%
mutate(SurveyTime = dense_rank(SurveyYear)) %>%
ungroup()
Data
df <- structure(list(PersonID = c(12L, 12L, 12L, 2L, 2L, 2L, 2L, 2L,
5L, 5L), SurveyYear = c(2013L, 2012L, 2010L, 2007L, 2008L, 2009L,
2010L, 2013L, 2013L, 2014L), SurveyQ1Rating = c(5L, 4L, 3L, 4L,
3L, 3L, 5L, 2L, 4L, 5L), SurveyQ2Rating = c(4L, 4L, 3L, 4L, 3L,
5L, 5L, 2L, 4L, 5L), Gender = c("f", "f", "f", "m", "m", "m",
"m", "m", "f", "f")), class = "data.frame", row.names = c(NA, -10L))
CodePudding user response:
Using data.table
library(data.table)
setDT(df1)[, SurveyTime := frank(SurveyYear), PersonID]