Re-label Year ID data nested in person to time point (enumerate time point)-CodePudding

I have a longitudinal data where respondents recruited as cohort. Right now, I have year in which they took the survey. But I want to create a new column simply counting if it is the first, second, or third time a person took the survey.

Original Table

PersonID	SurveyYear	SurveyQ1Rating	SurveyQ2Rating	Gender
12	2013	5	4	f
12	2012	4	4	f
12	2010	3	3	f
2	2007	4	4	m
2	2008	3	3	m
2	2009	3	5	m
2	2010	5	5	m
2	2013	2	2	m
5	2013	4	4	f
5	2014	5	5	f

Target Table (Where I created a new col SurveytTime to mark the ith time one took the survey)

PersonID	SurveyYear	SurveyTime	SurveyQ1Rating	SurveyQ2Rating	Gender
12	2013	3	5	4	f
12	2012	2	4	4	f
12	2010	1	3	3	f
2	2007	1	4	4	m
2	2008	2	3	3	m
2	2009	3	3	5	m
2	2010	4	5	5	m
2	2013	5	2	2	m
5	2013	1	4	4	f
5	2014	2	5	5	f

CodePudding user response：

A base solution:

df |>
  transform(SurveyTime = ave(SurveyYear, PersonID, FUN = rank))

Its dplyr equivalent:

library(dplyr)

df %>%
  group_by(PersonID) %>%
  mutate(SurveyTime = dense_rank(SurveyYear)) %>%
  ungroup()

Data

df <- structure(list(PersonID = c(12L, 12L, 12L, 2L, 2L, 2L, 2L, 2L, 
5L, 5L), SurveyYear = c(2013L, 2012L, 2010L, 2007L, 2008L, 2009L, 
2010L, 2013L, 2013L, 2014L), SurveyQ1Rating = c(5L, 4L, 3L, 4L, 
3L, 3L, 5L, 2L, 4L, 5L), SurveyQ2Rating = c(4L, 4L, 3L, 4L, 3L, 
5L, 5L, 2L, 4L, 5L), Gender = c("f", "f", "f", "m", "m", "m", 
"m", "m", "f", "f")), class = "data.frame", row.names = c(NA, -10L))

CodePudding user response：

Using data.table

library(data.table)
setDT(df1)[, SurveyTime := frank(SurveyYear), PersonID]