Home > Back-end >  How to replace values of multiple columns with other columns within the same dataframe?
How to replace values of multiple columns with other columns within the same dataframe?

Time:01-25

I would like to update values of mulitple columns from other columns.

test <- data.frame(old1 = c(NA,NA,NA),
                   old2 = c(NA,NA,NA),
                   old3 = c(NA,NA,NA),
                 upd1 = c('1','2','3'),
                 upd2 = c('a','b','c'),
                 upd3 = c('I','II','III')
                 )

I want to update column old1 with values of upd1, old2 with upd2 and so on. One method is to do one by one, e.g use ifelse inside mutate(...) for each column. But this is not ideal as the columns are over 100. I want something like this:

test %>% 
  mutate(across(.cols = c(1:3),  ~replace(., is.na(.), c(upd1)) #this is wrong though.
                )
         )

This updates all columns with values of column upd1 but it can't accept mutiple columns inside the replace or ifelse function. Perhaps for loop may work but I am pretty new and don't have the expertise with loops.

Another method is to split the dataframe into two and use merge() and then aggregate() function. But it changes the type of certain columns and changes values of date columns to other formats.

Any tidyverse solution, please?

CodePudding user response:

Using dplyr::cur_column and get you could do:

library(dplyr, warn=FALSE)

test %>%
  mutate(across(
    .cols = starts_with("old"),
    ~ coalesce(.x, get(gsub("^old", "upd", cur_column())))
  ))
#>   old1 old2 old3 upd1 upd2 upd3
#> 1    1    a    I    1    a    I
#> 2    2    b   II    2    b   II
#> 3    3    c  III    3    c  III

CodePudding user response:

You can use across2 from dplyover:

library(dplyover)
test %>% 
  mutate(across2(starts_with("old"), starts_with("upd"),
                 ~ ifelse(is.na(.x), .y, .x), 
                 .names = "{xcol}"))

  old1 old2 old3 upd1 upd2 upd3
1    1    a    I    1    a    I
2    2    b   II    2    b   II
3    3    c  III    3    c  III

CodePudding user response:

replace() works with multiple columns. You can do:

library(dplyr)

test %>%
  mutate(replace(across(starts_with("old")), TRUE, across(starts_with("up"))))

  old1 old2 old3 upd1 upd2 upd3
1    1    a    I    1    a    I
2    2    b   II    2    b   II
3    3    c  III    3    c  III

You could also use modifyList() if you rename the up set to match the old set:

test %>%
  mutate(modifyList(across(starts_with("old")), across(starts_with("up"), .names = "{sub('^upd', 'old', col)}")))

  old1 old2 old3 upd1 upd2 upd3
1    1    a    I    1    a    I
2    2    b   II    2    b   II
3    3    c  III    3    c  III

CodePudding user response:

We may use

library(dplyr)
library(stringr)
test %>% 
  mutate(across(starts_with('old'),
   ~ coalesce(.x, cur_data()[[str_replace(cur_column(), "ol", "up")]])))

-output

   old1 old2 old3 upd1 upd2 upd3
1    1    a    I    1    a    I
2    2    b   II    2    b   II
3    3    c  III    3    c  III
  • Related