How to split each element in a string to a row in a new column?

Time:08-09

I have this dataset:

data <- data.frame(column = c("apple, banana, cherry",
"apple, banana, cherry, grape", 
"apple, banana, cherry, grape, pear")) 

data

  column
1 apple, banana, cherry
2 apple, banana, cherry, grape
3 apple, banana, cherry, grape, pear

I'd like my output to be:

  column1  column2  column3  column4  column5
1 apple    banana   cherry   NA       NA
2 apple    banana   cherry   grape    NA
3 apple    banana   cherry   grape    pear

I tried strsplit(data$column, ","), but it returns a list, and I struggle to convert it back into a data frame: the list elements have unequal lengths, so as.data.frame(strsplit(data$column, ",")) fails.

Thanks a lot!

CodePudding user response:

library(tidyr)
separate(data = data, col = column, sep = ", ", into = paste0("column_", 1:5))

  column_1 column_2 column_3 column_4 column_5
1    apple   banana   cherry     <NA>     <NA>
2    apple   banana   cherry    grape     <NA>
3    apple   banana   cherry    grape     pear

Instead of hard-coding 5, you can compute the number of columns needed from the data (the maximum number of commas in a row, plus one):

library(stringr)
max(str_count(data$column, ",")) + 1 # returns 5
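
Putting the two pieces together, a minimal sketch might look like the following. It assumes the data frame is named data as in the question; n_cols and result are illustrative names, and fill = "right" is an optional addition that pads shorter rows with NA without the warning separate() would otherwise emit.

```r
library(tidyr)
library(stringr)

data <- data.frame(column = c("apple, banana, cherry",
                              "apple, banana, cherry, grape",
                              "apple, banana, cherry, grape, pear"))

# Number of pieces in the longest row: number of commas + 1
n_cols <- max(str_count(data$column, ",")) + 1

# Split on ", " into n_cols columns; shorter rows are padded with NA on the right
result <- separate(data, col = column, sep = ", ",
                   into = paste0("column_", seq_len(n_cols)),
                   fill = "right")
result
```

This way the column count follows the data, so a row with a sixth fruit would not be silently truncated.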