I'm trying to reverse-code values (i.e. 1 into 5, 5 into 1, etc.) for only a subset of my participants while keeping all rows. The subset is defined by whether a person indicated English as a native language. As my dataset is quite large, I want to avoid splitting it into two datasets (people whose first language is English, and those with other first languages) and then merging the results back into one data frame by participant ID.
Here's a small example to illustrate it:
data1 <- data.frame(
  primary_school = c(1,2,1,3,4,5,2,1,2,1,3,1,3,3,1,1,4,2,5,1),
  high_school    = c(1,2,3,4,5,1,2,1,1,3,1,3,1,2,3,3,4,2,1,2),
  relatives      = c(1,2,3,4,5,5,2,5,5,3,1,3,5,2,3,3,4,2,1,5),
  home           = c(3,2,3,3,4,5,3,3,2,1,3,1,3,3,3,1,3,2,3,3),
  siblings       = c(1,1,1,4,1,1,2,1,1,3,1,1,1,1,1,1,4,2,1,1),
  Language_A     = c("English","English","English","Tamil","French","Malay","Romanian","English","Quechua","Zapotec",
                     "English","English","English","Tamil","French","Malay","Romanian","English","Quechua","Zapotec"),
  L1             = c("English","English","English","Tamil","French","English","English","English","Quechua","Zapotec",
                     "English","English","English","Tamil","French","Malay","Romanian","English","Quechua","Zapotec"))
> data1
   primary_school high_school relatives home siblings Language_A       L1
1               1           1         1    3        1    English  English
2               2           2         2    2        1    English  English
3               1           3         3    3        1    English  English
4               3           4         4    3        4      Tamil    Tamil
5               4           5         5    4        1     French   French
6               5           1         5    5        1      Malay  English
7               2           2         2    3        2   Romanian  English
8               1           1         5    3        1    English  English
9               2           1         5    2        1    Quechua  Quechua
10              1           3         3    1        3    Zapotec  Zapotec
11              3           1         1    3        1    English  English
12              1           3         3    1        1    English  English
13              3           1         5    3        1    English  English
14              3           2         2    3        1      Tamil    Tamil
15              1           3         3    3        1     French   French
16              1           3         3    1        1      Malay    Malay
17              4           4         4    3        4   Romanian Romanian
18              2           2         2    2        2    English  English
19              5           1         1    3        1    Quechua  Quechua
20              1           2         5    3        1    Zapotec  Zapotec
What I first tried was using filter, but I soon found out that it only subsets the data: it returns just the rows satisfying !(Language_A == "English" | L1 == "English") and drops the rest, while I'd like to keep all rows:
testtest <- data1 %>%
  filter(!(Language_A == "English" | L1 == "English")) %>%
  mutate_at(c("primary_school", "high_school", "siblings", "relatives", "home"),
            funs(recode(., "1"=5, "2"=4, "3"=3, "4"=2, "5"=1)))
Is there any function that works similarly but keeps all of the data?
I also tried something like the below but it seems it's not happy with the arguments I want to use.
testtest <- data1 %>%
  if (Language_A != "English" | L1 != "English"){
    mutate_at(c("primary_school", "high_school", "siblings", "relatives", "home"),
              funs(recode(., "1"=5, "2"=4, "3"=3, "4"=2, "5"=1)))
  } else ()
I saw people resolving similar issues using case_when, but it seems to be mostly applied to mutating a single value into another single value under different cases. So I'm not sure how I could apply it to mutating multiple values under a single case.
Any ideas would be very appreciated. Thanks!
CodePudding user response:
We may use across (the _at/_all variants are superseded in favor of across) to loop over the columns that need recoding. Then, wherever Language_A and L1 are both not 'English', subtract the values from 6 (6 - 1 = 5, 6 - 2 = 4, 6 - 3 = 3, 6 - 4 = 2, 6 - 5 = 1, assuming each of those columns only contains values within 1-5); otherwise return the column value unchanged.
library(dplyr)
data1 %>%
  mutate(across(primary_school:siblings,
                ~ case_when(!(Language_A == "English" | L1 == "English") ~ 6 - .x,
                            TRUE ~ .x)))
-output
   primary_school high_school relatives home siblings Language_A       L1
1               1           1         1    3        1    English  English
2               2           2         2    2        1    English  English
3               1           3         3    3        1    English  English
4               3           2         2    3        2      Tamil    Tamil
5               2           1         1    2        5     French   French
6               5           1         5    5        1      Malay  English
7               2           2         2    3        2   Romanian  English
8               1           1         5    3        1    English  English
9               4           5         1    4        5    Quechua  Quechua
10              5           3         3    5        3    Zapotec  Zapotec
11              3           1         1    3        1    English  English
12              1           3         3    1        1    English  English
13              3           1         5    3        1    English  English
14              3           4         4    3        5      Tamil    Tamil
15              5           3         3    3        5     French   French
16              5           3         3    5        5      Malay    Malay
17              2           2         2    3        2   Romanian Romanian
18              2           2         2    2        2    English  English
19              1           5         5    3        5    Quechua  Quechua
20              5           4         1    3        5    Zapotec  Zapotec
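As a quick sanity check on the arithmetic shortcut, here is a minimal sketch on a made-up data frame (`toy`, `item`, `lookup` and `out` are hypothetical names, not part of the question's data). It assumes the column only holds the values 1 to 5. It first confirms that `6 - x` matches the explicit 1→5, 2→4, ... mapping from the question, then shows that `case_when()` inside `mutate()` changes only the matching rows and keeps every row.

```r
library(dplyr)

# Hypothetical toy data: one 1-5 item plus the two language columns
toy <- data.frame(
  item       = c(1, 2, 3, 4, 5, 1),
  Language_A = c("English", "Tamil", "French", "Malay", "Quechua", "Malay"),
  L1         = c("English", "Tamil", "French", "English", "Quechua", "Malay")
)

# Explicit mapping 1->5, 2->4, 3->3, 4->2, 5->1, written as a lookup vector
x <- 1:5
lookup <- c(5, 4, 3, 2, 1)
stopifnot(all(lookup[x] == 6 - x))  # the shortcut agrees on 1..5

# Reverse-code only rows where neither language column is "English"
out <- toy %>%
  mutate(item = case_when(
    !(Language_A == "English" | L1 == "English") ~ 6 - item,
    TRUE ~ item
  ))

out$item   # 1 4 3 4 1 5
nrow(out)  # 6, no rows were dropped
```

If the real columns can contain values outside 1 to 5 (or a different scale), the constant 6 would need to be adjusted accordingly.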
CodePudding user response:
Here is a similar dplyr solution using ifelse:
library(dplyr)
data1 %>%
  mutate(across(-c(Language_A, L1),
                ~ ifelse(Language_A == "English" | L1 == "English", ., 6 - .)))
   primary_school high_school relatives home siblings Language_A       L1
1               1           1         1    3        1    English  English
2               2           2         2    2        1    English  English
3               1           3         3    3        1    English  English
4               3           2         2    3        2      Tamil    Tamil
5               2           1         1    2        5     French   French
6               5           1         5    5        1      Malay  English
7               2           2         2    3        2   Romanian  English
8               1           1         5    3        1    English  English
9               4           5         1    4        5    Quechua  Quechua
10              5           3         3    5        3    Zapotec  Zapotec
11              3           1         1    3        1    English  English
12              1           3         3    1        1    English  English
13              3           1         5    3        1    English  English
14              3           4         4    3        5      Tamil    Tamil
15              5           3         3    3        5     French   French
16              5           3         3    5        5      Malay    Malay
17              2           2         2    3        2   Romanian Romanian
18              2           2         2    2        2    English  English
19              1           5         5    3        5    Quechua  Quechua
20              5           4         1    3        5    Zapotec  Zapotec
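For comparison, the same row-wise condition can also be applied without dplyr by assigning into a logical row index. This is only a sketch on a made-up data frame (`toy`, `cols`, `non_english` and `recoded` are hypothetical names), and it assumes there are no missing values in the language columns and that all recoded columns are on a 1 to 5 scale.

```r
# Hypothetical toy data with two of the rating columns
toy <- data.frame(
  primary_school = c(1, 5, 2),
  home           = c(3, 4, 1),
  Language_A     = c("English", "Tamil", "Malay"),
  L1             = c("English", "Tamil", "English"),
  stringsAsFactors = FALSE
)

# Columns to reverse-code and the rows the condition applies to
cols <- c("primary_school", "home")
non_english <- !(toy$Language_A == "English" | toy$L1 == "English")

# Work on a copy; overwrite only the selected rows and columns
recoded <- toy
recoded[non_english, cols] <- 6 - recoded[non_english, cols]

recoded$primary_school  # 1 1 2
recoded$home            # 3 2 1
```

Only the second row (Tamil/Tamil) is changed here; the third row keeps its values because L1 is "English".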