Home > OS >  Stack Separate Columns with Same Identifier R
Stack Separate Columns with Same Identifier R

Time:11-18

Is it possible to merge columns with the same identifier and stack these on top of one another from the same dataframe. Also to then remove NA values as seen in the Desired DataFrame output?

Current DataFrame

id    Col1   Col2   Col3   
1     A      NA     C   
1     NA     B      NA       
2     NA     B      NA  
2     A      NA     C 
3     A      NA     C   

Desired DataFrame

id    NewCol       
1     A    
1     B      
1     C   
2     A    
2     B
2     C
3     A
3     C

CodePudding user response:

Reshape from 'wide' to 'long' with pivot_longer while removing the NAs with values_drop_na and then do the arrange on all the columns

library(tidyr)
library(dplyr)
pivot_longer(df1, cols = starts_with('Col'), values_to = 'NewCol', 
        names_to = NULL, values_drop_na = TRUE) %>%
     arrange(across(everything()))

-output

# A tibble: 8 × 2
     id NewCol
  <int> <chr> 
1     1 A     
2     1 B     
3     1 C     
4     2 A     
5     2 B     
6     2 C     
7     3 A     
8     3 C     

data

df1 <- structure(list(id = c(1L, 1L, 2L, 2L, 3L), Col1 = c("A", NA, 
NA, "A", "A"), Col2 = c(NA, "B", "B", NA, NA), Col3 = c("C", 
NA, NA, "C", "C")), class = "data.frame", row.names = c(NA, -5L
))

CodePudding user response:

As an alternative we could use splitshakeshape packackge:

library(splitstackshape)
library(dplyr)

df %>% 
  merged.stack(df, id.vars="id", 
             var.stubs="Col", sep = ".") %>% 
  na.omit() %>% 
  select(-.time_1)
   id Col
1:  1   A
2:  1   B
3:  1   C
4:  2   A
5:  2   B
6:  2   C
7:  3   A
8:  3   C
  • Related