How to create a variable containing the age of another row within a common id?

Time:09-16

In the following example I show a table that contains, for each household id, the rank of each person within the household, their age, and their sex. In addition, there is a variable called care, which contains the rank of the person who is cared for within the household, and a variable called new, which contains the age of the person cared for. That is my problem: I created the new variable easily in Stata with the rangestat command, but I don't know how to create it in RStudio. I would appreciate any solution.

id rank age sex care new
1 1 20 female 2 2
1 2 2 female NA NA
2 1 30 male 3 4
2 2 28 female NA NA
2 3 4 male NA NA
3 1 26 female 2 3
3 2 3 male NA NA
4 1 22 female NA NA
4 2 23 male 3 1
4 3 1 male NA NA

CodePudding user response:

You may use match. I created a new column called new1 to compare the output.

library(dplyr)

df <- df %>%
  group_by(id) %>%
  # within each household, find the row whose rank equals care and take its age
  mutate(new1 = age[match(care, rank)]) %>%
  ungroup()

df

#      id  rank   age sex     care   new  new1
#   <int> <int> <int> <chr>  <int> <int> <int>
# 1     1     1    20 female     2     2     2
# 2     1     2     2 female    NA    NA    NA
# 3     2     1    30 male       3     4     4
# 4     2     2    28 female    NA    NA    NA
# 5     2     3     4 male      NA    NA    NA
# 6     3     1    26 female     2     3     3
# 7     3     2     3 male      NA    NA    NA
# 8     4     1    22 female    NA    NA    NA
# 9     4     2    23 male       3     1     1
#10     4     3     1 male      NA    NA    NA
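
To see why this works, here is a minimal sketch of the lookup for a single household (id 2 from the table). The vector names rank_h2, age_h2, care_h2, pos and cared_age are only illustrative and are not part of the original answer.

```r
# Values for household id == 2, copied from the question's table.
rank_h2 <- c(1L, 2L, 3L)
age_h2  <- c(30L, 28L, 4L)
care_h2 <- c(3L, NA, NA)

# match() returns, for each value of care, the position of the first
# equal value in rank; NA in care gives NA because no rank is NA.
pos <- match(care_h2, rank_h2)
pos
# 3 NA NA

# Indexing age by those positions picks the age of the person cared for.
cared_age <- age_h2[pos]
cared_age
# 4 NA NA
```

group_by(id) simply repeats this lookup separately inside every household, so a rank is never matched against a person from a different id.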

data

It is easier to help if you provide data in a reproducible format, for example the output of dput(df):

df <- structure(list(id = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L), 
    rank = c(1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L), age = c(20L, 
    2L, 30L, 28L, 4L, 26L, 3L, 22L, 23L, 1L), sex = c("female", 
    "female", "male", "female", "male", "female", "male", "female", 
    "male", "male"), care = c(2L, NA, 3L, NA, NA, 2L, NA, NA, 
    3L, NA), new = c(2L, NA, 4L, NA, NA, 3L, NA, NA, 1L, NA)), 
row.names = c(NA, -10L), class = "data.frame")
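
If you would rather not load dplyr, the same idea can probably be written in base R by matching on a combined household-and-rank key. This is only a sketch, not part of the original answer: the names person_key, cared_key and new2 are made up for illustration, and it assumes rank is never NA and is unique within each id.

```r
# Same household data as in the question (without the existing "new" column).
df <- data.frame(
  id   = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L),
  rank = c(1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L),
  age  = c(20L, 2L, 30L, 28L, 4L, 26L, 3L, 22L, 23L, 1L),
  care = c(2L, NA, 3L, NA, NA, 2L, NA, NA, 3L, NA)
)

# Key identifying each person: household id plus rank, e.g. "2_3".
person_key <- paste(df$id, df$rank, sep = "_")

# Key of the person each row cares for. Rows with care = NA produce
# keys such as "1_NA", which match no person_key and so yield NA.
cared_key <- paste(df$id, df$care, sep = "_")

# Look up the age of the cared-for person across the whole table at once.
df$new2 <- df$age[match(cared_key, person_key)]
df$new2
# 2 NA 4 NA NA 3 NA NA 1 NA
```

Because the household id is part of the key, no grouping step is needed; the trade-off is that the keys are built as character strings.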