I'm new here, so apologies for any mistakes. I am trying to recode many columns that record the same content at different time points, so I need a way to recode multiple columns that share the same set of responses. For example, suppose I collected people's fruit consumption at 4 different time points but only wanted to focus on 4 types of fruit: apple (1), banana (2), orange (3), and strawberry (4). This is the data I have:
id Fruit_T1 Fruit_T2 Fruit_T3 Fruit_T4
1 1 apple banana apple kiwi
2 2 banana apple strawberry <NA>
3 3 orange strawberry kiwi apple
4 4 strawberry orange <NA> <NA>
5 5 banana banana apple apple
6 6 orange apple strawberry apricot
I am trying to get to this:
id Fruit_T1 Fruit_T2 Fruit_T3 Fruit_T4 RFruit_T1 RFruit_T2 RFruit_T3 RFruit_T4
1 1 apple banana apple kiwi 1 2 1 .
2 2 banana apple strawberry <NA> 2 1 4 <NA>
3 3 orange strawberry kiwi apple 3 4 . 1
4 4 strawberry orange <NA> <NA> 4 3 <NA> <NA>
5 5 banana banana apple apple 2 2 1 1
6 6 orange apple strawberry apricot 3 1 4 .
Here the NA's are distinct from the '.', which marks cases where the individual did consume fruit but it was not one of the 4 of interest. Sorry for the dumb example, but I'd really appreciate any insight into this situation. Thanks very much!
Answer:
Using recode() with across() from dplyr, you could do:
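library(dplyr)
# Lookup vector: names are the original values, values are the new codes
rec_vec <- c(apple = "1", banana = "2", orange = "3", strawberry = "4")
dat |>
  # Recode every column except id: unmatched fruits become ".", NA stays NA,
  # and the results go into new columns prefixed with "R"
  mutate(across(!id, ~ recode(.x, !!!rec_vec, .default = "."), .names = "R{.col}"))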
#> id Fruit_T1 Fruit_T2 Fruit_T3 Fruit_T4 RFruit_T1 RFruit_T2 RFruit_T3
#> 1 1 apple banana apple kiwi 1 2 1
#> 2 2 banana apple strawberry <NA> 2 1 4
#> 3 3 orange strawberry kiwi apple 3 4 .
#> 4 4 strawberry orange <NA> <NA> 4 3 <NA>
#> 5 5 banana banana apple apple 2 2 1
#> 6 6 orange apple strawberry apricot 3 1 4
#> RFruit_T4
#> 1 .
#> 2 <NA>
#> 3 1
#> 4 <NA>
#> 5 1
#> 6 .
DATA
dat <- structure(list(id = 1:6, Fruit_T1 = c(
"apple", "banana", "orange",
"strawberry", "banana", "orange"
), Fruit_T2 = c(
"banana", "apple",
"strawberry", "orange", "banana", "apple"
), Fruit_T3 = c(
"apple",
"strawberry", "kiwi", NA, "apple", "strawberry"
), Fruit_T4 = c(
"kiwi",
NA, "apple", NA, "apple", "apricot"
)), row.names = c(
"1", "2",
"3", "4", "5", "6"
), class = "data.frame")
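If you would rather not depend on dplyr, the same idea can be sketched in base R with a named lookup vector. This is only a minimal sketch, not a drop-in replacement: the helper name recode_fruit and the "^Fruit_T" column pattern are my own assumptions for illustration, and the data is rebuilt inline so the block runs on its own.

```r
# Same example data as above, rebuilt compactly so this runs standalone
dat <- data.frame(
  id       = 1:6,
  Fruit_T1 = c("apple", "banana", "orange", "strawberry", "banana", "orange"),
  Fruit_T2 = c("banana", "apple", "strawberry", "orange", "banana", "apple"),
  Fruit_T3 = c("apple", "strawberry", "kiwi", NA, "apple", "strawberry"),
  Fruit_T4 = c("kiwi", NA, "apple", NA, "apple", "apricot")
)

# Lookup vector: names are the original values, values are the new codes
codes <- c(apple = "1", banana = "2", orange = "3", strawberry = "4")

# Hypothetical helper: look each value up by name, then mark values that
# were present but not among the four fruits of interest with "."
recode_fruit <- function(x) {
  out <- unname(codes[x])              # NA for both missing and unmatched values
  out[!is.na(x) & is.na(out)] <- "."   # unmatched but non-missing -> "."
  out
}

# Assumed naming pattern: every column to recode starts with "Fruit_T"
fruit_cols <- grep("^Fruit_T", names(dat), value = TRUE)
dat[paste0("R", fruit_cols)] <- lapply(dat[fruit_cols], recode_fruit)

dat
```

The recoded columns are character, like in the recode() version, so a true NA (no answer) stays distinguishable from "." (a fruit outside the four of interest).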