I have a table with parent and his or her child's info(name, grade, age). One child can have multiple parents and one parents can have multiple children. I want to reshape the table so each row will be an unique parent with all his or her children's information labeled as .1, ,2 etc.
Original Table:
ParentID | ChildN | ChildGrade | ChildAge | HouseholdID |
---|---|---|---|---|
1 | x | 10 | 21 | a |
2 | x | 10 | 21 | a |
3 | z | 12 | 23 | b |
1 | y | 13 | 24 | a |
2 | y | 13 | 24 | a |
3 | t | 15 | 26 | b |
4 | g | 16 | 27 | c |
Goal:
ParentID | ChildN.1 | ChildGrade.1 | ChildAge.1 | ChildN.2 | ChildGrade.2 | ChildAge.2 | HouseholdID |
---|---|---|---|---|---|---|---|
1 | x | 10 | 21 | y | 13 | 24 | a |
2 | x | 10 | 21 | y | 13 | 24 | a |
3 | z | 12 | 23 | t | 15 | 26 | b |
4 | g | 16 | 27 | NA | NA | NA | c |
I want to know how to achieve this in R. Thanks a lot.
CodePudding user response:
You could achieve your desired result using tidyr::pivot_wider
by first adding an id column for the children:
library(dplyr)
library(tidyr)
df |>
group_by(ParentID) |>
mutate(child_id = row_number()) |>
pivot_wider(values_from = c("ChildN", "ChildGrade", "ChildAge"), names_from = "child_id",
names_glue = "{.value}.{child_id}")
#> # A tibble: 4 × 8
#> # Groups: ParentID [4]
#> ParentID HouseholdID ChildN.1 ChildN.2 ChildGrade.1 ChildGrade.2 ChildAge.1
#> <int> <chr> <chr> <chr> <int> <int> <int>
#> 1 1 a x y 10 13 21
#> 2 2 a x y 10 13 21
#> 3 3 b z t 12 15 23
#> 4 4 c g <NA> 16 NA 27
#> # … with 1 more variable: ChildAge.2 <int>