How to make a function that fills empty rows in one column with the values from another column

I want to make a function that takes this data

now      changed   before
"12ab"   "yes"     "21ba"
"34de"   "no"      
"56fg"   "yes"     "gf65"
"78hi"   "no"      NA

And turn it into

now      changed   before
"12ab"   "yes"     "21ba"
"34de"   "no"      "34de"
"56fg"   "yes"     "gf65"
"78hi"   "no"      "78hi"

So if before is empty or NA, I want before to take the value of now (on the assumption that if it didn't change, it must have been the same).

I want to use a function because I need to apply it to several column pairs.

I tried this:

library(purrr)
library(dplyr)
fun <- function(data, x, y) {
     coalesce(case_when(data[[y]] == NA | data[[y]] == '' ~ data[[x]], data[[y]])
}
df[c("before", "before1")] <- map2(c("now", "now1"),c("before", "before1") ~  fun(df, .x, .y))

But it doesn't do anything.

CodePudding user response：

You can convert empty strings to NA with dplyr::na_if and then fill the gaps with dplyr::coalesce:

library(dplyr)
df %>% 
  # apply na_if() to the column itself; recent dplyr versions reject a whole data frame here
  mutate(before = coalesce(na_if(before, ""), now))

#    now changed before
# 1 12ab     yes   21ba
# 2 34de      no   34de
# 3 56fg     yes   gf65
# 4 78hi      no   78hi

As a function, you could have:

f <- function(data, x, y){
  data %>% 
    # {{x}} := writes to whichever column was passed as x, not a hard-coded name
    mutate({{x}} := coalesce(na_if({{x}}, ""), {{y}}))
}

f(df, before, now)
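
Since the question asks about several column pairs, here is a minimal sketch of one way to do that with column names passed as strings. The helper name fill_from and the second pair now1/before1 (with made-up values) are assumptions for illustration only; a plain loop is used instead of map2 to keep it simple.

```r
library(dplyr)

# Hypothetical example data: the now1/before1 values are invented
df <- data.frame(
  now     = c("12ab", "34de", "56fg", "78hi"),
  changed = c("yes", "no", "yes", "no"),
  before  = c("21ba", "", "gf65", NA),
  now1    = c("a", "b", "c", "d"),
  before1 = c("", "x", NA, "y"),
  stringsAsFactors = FALSE
)

# fill_from() is a hypothetical helper: it returns column `target`
# with empty strings and NA replaced by the value in column `source`
fill_from <- function(data, source, target) {
  coalesce(na_if(data[[target]], ""), data[[source]])
}

sources <- c("now", "now1")
targets <- c("before", "before1")

# Walk both name vectors in parallel and overwrite each target column
for (i in seq_along(targets)) {
  df[[targets[i]]] <- fill_from(df, sources[i], targets[i])
}

df
```

Note that the original attempt tests `data[[y]] == NA`, which always evaluates to NA rather than TRUE; is.na() (or na_if plus coalesce, as here) is the usual way to detect missing values.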

CodePudding user response：

Define na.strings when reading in your data (for example na.strings = c("", "NA")) so that empty strings are read as NA. Then you can use base R to fill in the missing data:

df$before[is.na(df$before)] <- df$now[is.na(df$before)]
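
A rough sketch of how this could look end to end, assuming the data comes from a CSV file. The inline CSV text, the now1/before1 columns, and the helper name fill_pairs are all hypothetical stand-ins, not part of the original data.

```r
# Hypothetical CSV contents standing in for the real file
csv_text <- 'now,changed,before,now1,before1
12ab,yes,21ba,a,
34de,no,,b,x
56fg,yes,gf65,c,NA
78hi,no,NA,d,y'

# na.strings makes both empty fields and the literal NA come in as NA
df <- read.csv(text = csv_text, na.strings = c("", "NA"),
               stringsAsFactors = FALSE)

# fill_pairs() is a hypothetical helper; `pairs` is a named character
# vector of the form c(target_column = "source_column")
fill_pairs <- function(data, pairs) {
  for (target in names(pairs)) {
    source <- pairs[[target]]
    missing <- is.na(data[[target]])
    # Same replacement as the one-liner above, for one pair at a time
    data[[target]][missing] <- data[[source]][missing]
  }
  data
}

df <- fill_pairs(df, c(before = "now", before1 = "now1"))
df
```

Because the empty strings are already NA after reading, the helper only needs is.na() and no extra check for "".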