How to separate multiple columns in a dataframe into multiple columns? #R-CodePudding

My dataset has about 14 columns which look like this

X1	X2	X3	X4
1,2	1	2,3	1,2,3
1	2	3	1,3

I want to separate each column and the resulting data will look like this:

X1_1	X1_2	X1_3	X2_1	X2_2	X2_3	X3_1	X3_2	X3_3	X4_1	X4_2	X4_3
1	2	NA	1	NA	NA	2	3	NA	1	2	3
1	NA	NA	2	NA	NA	3	NA	NA	1	3	NA

While I can do this separately for each column one by one, its not the most efficient way. I want to know if it can be done faster for all columns in one go, using loop_if, regex, or mutate_at or any other way.

CodePudding user response：

library(tidyverse)

df %>%
  mutate(across(everything(), str_split, ","))%>%
  unnest_wider(everything(), names_sep = '_')

# A tibble: 2 × 8
  X1_1  X1_2  X2_1  X3_1  X3_2  X4_1  X4_2  X4_3 
  <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1     2     1     2     3     1     2     3    
2 1     NA    2     3     NA    1     3     NA

Note: This does not give you columns that are NA. eg X2_2, you could manually add those. Although it is not ideal.

if of course you need the NA columns:

df %>%
  rowid_to_column()%>%
  pivot_longer(-rowid, values_transform = as.character)%>%
  separate_rows(value, convert = TRUE)%>%
  group_by(rowid, name)%>%
  mutate(name1 = row_number())%>%
  ungroup()%>%
  complete(rowid, name, name1)%>%
  pivot_wider(rowid, names_from = c(name, name1), values_from = value)

# A tibble: 2 × 13
  rowid  X1_1  X1_2  X1_3  X2_1  X2_2  X2_3  X3_1  X3_2  X3_3  X4_1  X4_2  X4_3
  <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1     1     1     2    NA     1    NA    NA     2     3    NA     1     2     3
2     2     1    NA    NA     2    NA    NA     3    NA    NA     1     3    NA

CodePudding user response：

do.call(cbind, Map(read.table, text=sapply(df, paste, collapse='\n'), sep=',', fill =TRUE))

  X1.V1 X1.V2 V1 X3.V1 X3.V2 X4.V1 X4.V2 X4.V3
1     1     2  1     2     3     1     2     3
2     1    NA  2     3    NA     1     3    NA
>