I want to do a simple task: mutate the value of one column if that column exists. I want to do this for multiple columns and data frames within a loop.
I made a reproducible example. In the example, we have two data frames Belg
and Fran
.
Fran
has two columns t1
and t2
while Belg
has three t1
, t2
, t3
. I want to replace the values in these two or three columns based on a simple manipulation described in the code below. The tricky part is that because the number of columns is not the same on both data frames the code below generates a third column t3
to df Fran
-- which is not what I want.
df <- data.frame (v1 = c("Fran", "Fran", "Fran", "Belg", "Belg", "Belg"),
v2 = c(1201, 1201, 1202, 1203, 1204, 1205),
v3 = c(1, 2, 1, 1, 3, 1)
)
Fran_t <- data.frame(v1 = c("Fran", "Fran"),
t1 = c(NA, NA),
t2 = c(NA, NA)
)
Belg_t <- data.frame(v1 = c("Belg", "Belg", "Belg"),
t1 = c(NA, NA, NA),
t2 = c(NA, NA, NA),
t3 = c(NA, NA, NA)
)
cty <- c("Fran", "Belg")
for (c in cty) {
a <- df
d <- paste0(c,"_t")
a <- a %>%
filter(v1==paste(c)) %>%
left_join(., get(d), by="v1") %>%
group_by(v2) %>%
mutate("t1" = case_when(any(v3==1) ~ "NAv", TRUE ~ "NAp")) %>%
mutate("t2" = case_when(any(v3==2) ~ "NAv", TRUE ~ "NAp")) %>%
mutate("t3" = case_when(any(v3==3) ~ "NAv", TRUE ~ "NAp"))
assign(c, a)
}
I have tried things like:
mutate({if("t1" %in% names(.)) "t1" = case_when(any(v3==1) ~ "NAv") else "NULL"})
or even
{if("t1" %in% names(.)) mutate("t1" = case_when(any(v3==1) ~ "NAv", TRUE ~ "NAp")) else NULL}
``
CodePudding user response:
You can place Fran_t
and Belg_t
in a list. Note that this does not change the underlying frame Fran_t
and Belg_t
, but your original loop did not do that either; rather it created two new frames called Fran
and Belg
. This happened because of your assign(c,a)
call. If you want to actually change Fran_t
and Belg_t
, see an option at the bottom:
cty <- list(Fran_t, Belg_t)
lapply(cty, function(c) {
df%>%
inner_join(c, by="v1") %>%
group_by(v2) %>%
mutate(across(starts_with("t"), ~if_else(any(v3==1),"NAv", "NAp")))
})
Output:
[[1]]
# A tibble: 6 x 5
# Groups: v2 [2]
v1 v2 v3 t1 t2
<chr> <dbl> <dbl> <chr> <chr>
1 Fran 1201 1 NAv NAv
2 Fran 1201 1 NAv NAv
3 Fran 1201 2 NAv NAv
4 Fran 1201 2 NAv NAv
5 Fran 1202 1 NAv NAv
6 Fran 1202 1 NAv NAv
[[2]]
# A tibble: 9 x 6
# Groups: v2 [3]
v1 v2 v3 t1 t2 t3
<chr> <dbl> <dbl> <chr> <chr> <chr>
1 Belg 1203 1 NAv NAv NAv
2 Belg 1203 1 NAv NAv NAv
3 Belg 1203 1 NAv NAv NAv
4 Belg 1204 3 NAp NAp NAp
5 Belg 1204 3 NAp NAp NAp
6 Belg 1204 3 NAp NAp NAp
7 Belg 1205 1 NAv NAv NAv
8 Belg 1205 1 NAv NAv NAv
9 Belg 1205 1 NAv NAv NAv
Option to change Fran_t
and Belg_t
:
cty <- c("Fran_t", "Belg_t")
for(c in cty) {
assign(c, df%>%
inner_join(get(c), by="v1") %>%
group_by(v2) %>%
mutate(across(starts_with("t"), ~if_else(any(v3==1),"NAv", "NAp")))
)
}