Collapsing Columns in R using tidyverse with mutate, replace, and unite. Writing a function to reuse

Time:09-29

Data:

ID B C
1 NA x
2 x NA
3 x x

Results:

ID Unified
1 C
2 B
3 B_C

I'm trying to combine columns B and C using mutate and unite. How would I scale this up so that I can reuse it for many columns (think 100) instead of writing out each variable every time? Or is there a built-in function that already does this?

My current solution is this:

library(tidyverse)

Data %>%
  mutate(B = replace(B, B == 'x', 'B'), C = replace(C, C == 'x', 'C')) %>%
  unite("Unified", B:C, na.rm = TRUE, remove = TRUE)

Thank you so much!

CodePudding user response:

We can use across to loop over the columns and replace each value equal to 'x' with the name of its column (cur_column()), then unite the result:

library(dplyr)
library(tidyr)
Data %>%
    mutate(across(B:C, ~ replace(., .== 'x', cur_column()))) %>%
    unite(Unified, B:C, na.rm = TRUE, remove = TRUE)

-output

 ID Unified
1  1       C
2  2       B
3  3     B_C

data

Data <- structure(list(ID = 1:3, B = c(NA, "x", "x"), C = c("x", NA, 
"x")), class = "data.frame", row.names = c(NA, -3L))
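Since the question asks for something reusable, the across/unite pipeline above could be wrapped in a function along these lines. This is only a sketch: collapse_columns, new_col, and marker are names made up for illustration and are not part of dplyr or tidyr. It assumes the columns are character columns where a present value equals the marker.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (not a dplyr/tidyr function): the name and the
# `new_col` / `marker` arguments are assumptions made for this sketch.
collapse_columns <- function(data, cols, new_col = "Unified", marker = "x") {
  data %>%
    # Replace each marker value with the name of the column it sits in.
    mutate(across({{ cols }}, ~ replace(.x, .x == marker, cur_column()))) %>%
    # Paste the selected columns into one, skipping NA entries.
    unite(!!new_col, {{ cols }}, na.rm = TRUE, remove = TRUE)
}

Data <- data.frame(ID = 1:3, B = c(NA, "x", "x"), C = c("x", NA, "x"))

result <- collapse_columns(Data, B:C)
result$Unified
# "C"   "B"   "B_C"
```

Because cols is passed through tidyselect, a call such as collapse_columns(Data, -ID) should pick up every column except ID, which avoids typing out 100 names.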

CodePudding user response:

Here are a couple of options.

  1. Using dplyr -
library(dplyr)

cols <- names(Data)[-1]

Data %>%
  rowwise() %>%
  mutate(Unified = paste0(cols[!is.na(c_across(B:C))], collapse = '_')) %>%
  ungroup -> Data

Data

#     ID B     C     Unified
#  <int> <chr> <chr> <chr>  
#1     1 NA    x     C      
#2     2 x     NA    B      
#3     3 x     x     B_C    
  2. Base R
Data$Unified <- apply(Data[cols], 1, function(x) 
                      paste0(cols[!is.na(x)], collapse = '_'))
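
The base R option can be made reusable in the same spirit. The sketch below is one possible shape, with collapse_names and sep as hypothetical names chosen here; it takes the column names as a character vector, so it should work for any number of columns.

```r
# Hypothetical helper: `collapse_names` and `sep` are illustrative names.
collapse_names <- function(data, cols, sep = "_") {
  # Logical matrix: TRUE where a value is present (not NA).
  present <- !is.na(data[cols])
  # For each row, join the names of the columns that hold a value.
  unname(apply(present, 1, function(keep) paste0(cols[keep], collapse = sep)))
}

Data <- data.frame(ID = 1:3, B = c(NA, "x", "x"), C = c("x", NA, "x"))

# Every column except the first (ID), as in the answer above.
cols <- names(Data)[-1]
Data$Unified <- collapse_names(Data, cols)
Data$Unified
# "C"   "B"   "B_C"
```

A row with no values in any of the selected columns comes back as an empty string rather than NA.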