Splitting string column into a few columns - keep the null entries

Time:04-09

Dear kind people of the internet, I need help. I am trying to split a string column into several columns and keep the null/NA entries.

df <- cSplit(df, "question", "_")

This code splits the column, but it drops the null entries and shows the following warning message:

Warning messages:
1: In type.convert.default(X[[i]], ...) :
  'as.is' should be specified by the caller; using TRUE

df

client_id   question
15962       eng_child_pregnancy_standard_focused
15963       null
15964       xho_child_developed_sleep
15965       eng_mother_spacing_other
15966       null
15967       null

Current split df using the above code:

client_id   question       question_2      question_3      question_4     question_5
15962       eng            child           pregnancy       standard      focused
15964       xho            child           developed       sleep         NA
15965       eng            mother          spacing         other         NA

How do I keep the null entries and what does the warning mean?

CodePudding user response:

From ?cSplit, we can see that there's a type.convert argument in the function that would change the type of the result in each column (i.e. figure out whether the column should be numeric, logical or character). The warning message about as.is is from the utils::type.convert() function, which is used if type.convert is set to TRUE.

Therefore to avoid the message, use type.convert = FALSE.

library(splitstackshape)

cSplit(df, "question", "_", type.convert = FALSE)
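To see where the warning comes from, here is a minimal sketch that calls utils::type.convert() directly on a small made-up vector (the values are illustrative, not taken from the question). It assumes R 4.0.0 or later, where leaving out as.is triggers the message quoted above; supplying as.is explicitly avoids it.

```r
# Illustrative character vectors standing in for split columns (made-up data).
digits <- c("1", "2", "3")
words  <- c("eng", "xho", "eng")

# Supplying as.is explicitly is what the warning asks the caller to do.
converted_digits <- utils::type.convert(digits, as.is = TRUE)
converted_words  <- utils::type.convert(words, as.is = TRUE)

# Digit strings are converted to integers; with as.is = TRUE,
# text stays character instead of becoming a factor.
class(converted_digits)
class(converted_words)
```

This only illustrates the conversion step. It does not change how cSplit() handles the null rows.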

CodePudding user response:

A tidyverse solution would be

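library(tidyverse)
# Names for the new columns; the longest entry splits into five pieces
cols <- paste0("question_", 1:5)
# fill = "right" pads shorter entries with NA on the right without a warning
df %>% separate(question, sep = "_", into = cols, fill = "right")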

This gives:

# A tibble: 6 x 6
  client_id question_1 question_2 question_3 question_4 question_5
      <dbl> <chr>      <chr>      <chr>      <chr>      <chr>     
1     15962 eng        child      pregnancy  standard   focused   
2     15963 null       NA         NA         NA         NA        
3     15964 xho        child      developed  sleep      NA        
4     15965 eng        mother     spacing    other      NA        
5     15966 null       NA         NA         NA         NA        
6     15967 null       NA         NA         NA         NA   
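
If the literal string "null" should become a real NA in every new column, one possible approach is to recode it before splitting. The sketch below uses base R only, rather than cSplit() or separate(), so it runs without extra packages. It rebuilds df from the sample in the question and assumes that "null" is stored as plain text, not as an actual missing value.

```r
# Sample data rebuilt from the question; "null" is assumed to be a literal string.
df <- data.frame(
  client_id = 15962:15967,
  question = c("eng_child_pregnancy_standard_focused", "null",
               "xho_child_developed_sleep", "eng_mother_spacing_other",
               "null", "null"),
  stringsAsFactors = FALSE
)

# Recode the placeholder text as a proper missing value.
df$question[df$question == "null"] <- NA

# Split on "_"; an NA entry comes back as a single NA piece, so no row is lost.
pieces <- strsplit(df$question, "_", fixed = TRUE)

# Pad every entry to the length of the longest one (indexing past the end gives NA).
n_cols <- max(lengths(pieces))
padded <- t(vapply(pieces, function(p) p[seq_len(n_cols)], character(n_cols)))
colnames(padded) <- paste0("question_", seq_len(n_cols))

# Reattach the id column; all six rows are kept.
result <- cbind(df["client_id"], as.data.frame(padded, stringsAsFactors = FALSE))
result
```

Rows that held "null" keep their client_id and have NA in every question_ column.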