I have a dataset where loads of columns have NA
characters because they depend on another column response. For example :
df <- data.frame(X_home = c("Yes", "Yes", "No"),
X_school = c("No", "Yes", "No"),
Y_home = c("A", "B", NA),
Y_school = c(NA, "A", NA))
If I use mutate()
, I can do :
df %>% mutate(Y_home = if_else(X_home == "No", "C", Y_home),
Y_school = if_else(X_school == "No", "C", Y_school))
to obtain what I want. But the problem is that I have many X_something
and Y_something
. So I'd like to make it with something like that :
df %>% mutate(across(starts_with("Y_"), ~ if_else(...))
Is it possible ?
Thanks a lot.
CodePudding user response:
We can get
the value of the corresponding column after replacing the substring of column name (cur_column()
) using str_replace
library(dplyr)
library(stringr)
df %>%
mutate(across(starts_with("Y_"),
~ case_when(get(str_replace(cur_column(), "Y_", "X_")) == "No" ~
"C", TRUE ~ .x)))
-output
X_home X_school Y_home Y_school
1 Yes No A C
2 Yes Yes B A
3 No No C C
CodePudding user response:
Here is one possible approach with pivoting:
We identify groups of two in long format to imitate the corresponding column pairs.
Within these groups we mutate
across
school
and home
and re-pivot back:
library(tidyr)
library(dplyr)
df %>%
mutate(rn = row_number()) %>%
pivot_longer(cols = -rn, names_to = c( "grp", '.value'), names_sep = "\\_") %>%
group_by(rn) %>%
mutate(across(c(home, school), ~ if_else(is.na(.), "C",.))) %>%
pivot_wider(names_from = grp,
values_from = c(home, school),
names_glue = "{grp}_{.value}") %>%
ungroup() %>%
select(-rn)
X_home Y_home X_school Y_school
<chr> <chr> <chr> <chr>
1 Yes A No C
2 Yes B Yes A
3 No C No C