The example below shows a table with one row per person: the household identifier (`id`), the person's rank within the household (`rank`), and their age and sex. In addition, the variable `care` holds the rank of the household member that the person cares for, and the variable `new` holds the age of that cared-for member. Creating `new` is my problem: in Stata I built it easily with the `rangestat` command, but I don't know how to create it in R (RStudio). I would appreciate any solution.
id | rank | age | sex | care | new |
---|---|---|---|---|---|
1 | 1 | 20 | female | 2 | 2 |
1 | 2 | 2 | female | NA | NA |
2 | 1 | 30 | male | 3 | 4 |
2 | 2 | 28 | female | NA | NA |
2 | 3 | 4 | male | NA | NA |
3 | 1 | 26 | female | 2 | 3 |
3 | 2 | 3 | male | NA | NA |
4 | 1 | 22 | female | NA | NA |
4 | 2 | 23 | male | 3 | 1 |
4 | 3 | 1 | male | NA | NA |
CodePudding user response:
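You may use `match`. Within each household, `match(care, rank)` gives the position of the row whose `rank` equals `care`, and indexing `age` by that position returns the age of the person cared for (or `NA` when `care` is `NA`). I created a new column called `new1` to compare the output with `new`.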
library(dplyr)

df <- df %>%
  group_by(id) %>%                                   # look up ranks within each household
  mutate(new1 = age[match(care, rank)]) %>%          # age of the member whose rank equals care
  ungroup()
df
# id rank age sex care new new1
# <int> <int> <int> <chr> <int> <int> <int>
# 1 1 1 20 female 2 2 2
# 2 1 2 2 female NA NA NA
# 3 2 1 30 male 3 4 4
# 4 2 2 28 female NA NA NA
# 5 2 3 4 male NA NA NA
# 6 3 1 26 female 2 3 3
# 7 3 2 3 male NA NA NA
# 8 4 1 22 female NA NA NA
# 9 4 2 23 male 3 1 1
#10 4 3 1 male NA NA NA
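If you would rather not load dplyr, the same lookup can probably be done in base R. The sketch below is only one possible variant, not part of the original answer: it assumes each (`id`, `rank`) pair occurs once, builds a combined household/rank key with `paste()`, and uses `new2` as a hypothetical column name. The data are the relevant columns copied from the question.

```r
# Columns copied from the question's table (sex and new omitted for brevity)
df <- data.frame(
  id   = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L),
  rank = c(1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L),
  age  = c(20L, 2L, 30L, 28L, 4L, 26L, 3L, 22L, 23L, 1L),
  care = c(2L, NA, 3L, NA, NA, 2L, NA, NA, 3L, NA)
)

# Key identifying each person: household id plus rank
person_key <- paste(df$id, df$rank)

# Key identifying the person cared for: household id plus care
# (rows with care = NA produce a key such as "1 NA", which matches nothing)
cared_key <- paste(df$id, df$care)

# match() returns the row of the cared-for person, or NA when there is none
df$new2 <- df$age[match(cared_key, person_key)]
df
```

Because the household id is part of the key, no grouping step is needed; the trade-off is that the keys are built as strings, which may be slower on very large data.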
data
It is easier to help if you provide the data in a reproducible format, such as the output of `dput(df)`:
df <- structure(list(id = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L),
rank = c(1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L), age = c(20L,
2L, 30L, 28L, 4L, 26L, 3L, 22L, 23L, 1L), sex = c("female",
"female", "male", "female", "male", "female", "male", "female",
"male", "male"), care = c(2L, NA, 3L, NA, NA, 2L, NA, NA,
3L, NA), new = c(2L, NA, 4L, NA, NA, 3L, NA, NA, 1L, NA)),
row.names = c(NA, -10L), class = "data.frame")
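A `structure(...)` call like the one above is the kind of text that `dput()` prints. As a rough illustration of how to produce it for your own data, the sketch below uses a hypothetical two-row data frame named `small` (not the question's full data) and checks that the printed text rebuilds the same object.

```r
# Hypothetical two-row example, used only to illustrate dput()
small <- data.frame(id = c(1L, 1L), rank = c(1L, 2L), age = c(20L, 2L))

# Prints a structure(...) call that can be pasted into a question
dput(small)

# Round trip: deparse() produces the same text as dput(); evaluating it
# should rebuild an equivalent data frame
txt <- paste(deparse(small), collapse = "")
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```

For a large data frame, posting `dput(head(df, 20))` is usually enough for others to reproduce the problem.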