Dear kind people of the internet, I need help. I am trying to split a string column into several columns and keep the null/NA entries.
df <- cSplit(df, "question", "_")
This code currently splits them but removes the null entries and shows the following warning message:
Warning messages:
1: In type.convert.default(X[[i]], ...) :
'as.is' should be specified by the caller; using TRUE
df
client_id question
15962 eng_child_pregnancy_standard_focused
15963 null
15964 xho_child_developed_sleep
15965 eng_mother_spacing_other
15966 null
15967 null
Current split df using the above code:
client_id question question_2 question_3 question_4 question_5
15962 eng child pregnancy standard focused
15964 xho child developed sleep NA
15965 eng mother spacing other NA
How do I keep the null entries and what does the warning mean?
CodePudding user response:
From ?cSplit, we can see that there's a type.convert argument in the function that would change the type of the result in each column (i.e. figure out whether the column should be numeric, logical or character). The warning message about as.is is from the utils::type.convert() function, which is used if type.convert is set to TRUE.
Therefore, to avoid the message, use type.convert = FALSE.
library(splitstackshape)
cSplit(df, "question", "_", type.convert = FALSE)
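To see what the as.is warning refers to, here is a small sketch that calls utils::type.convert() directly on made-up vectors (the variable names are only for illustration, they are not from the question). as.is = TRUE keeps strings as character, while as.is = FALSE turns them into factors, and recent versions of R want the caller to say which one is intended.

```r
# Illustrative vectors, not taken from the question's data
digits  <- c("1", "2", "3")
letters2 <- c("eng", "xho", "eng")

# Strings that look like whole numbers are converted to integer
as_int <- type.convert(digits, as.is = TRUE)

# as.is = TRUE: non-numeric strings stay character
as_chr <- type.convert(letters2, as.is = TRUE)

# as.is = FALSE: non-numeric strings become a factor
as_fct <- type.convert(letters2, as.is = FALSE)

str(as_int)
str(as_chr)
str(as_fct)
```

So the warning is not an error in the data; it only says that the as.is choice was left unspecified and TRUE was used.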
CodePudding user response:
A tidyverse solution would be
library(tidyverse)
cols <- paste0("question_", 1:5)
df %>% separate(question, sep = "_", into = cols)
Which gives
# A tibble: 6 x 6
client_id question_1 question_2 question_3 question_4 question_5
<dbl> <chr> <chr> <chr> <chr> <chr>
1 15962 eng child pregnancy standard focused
2 15963 null NA NA NA NA
3 15964 xho child developed sleep NA
4 15965 eng mother spacing other NA
5 15966 null NA NA NA NA
6 15967 null NA NA NA NA
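Note that rows with fewer than five pieces make separate() warn by default that the missing pieces were filled with NA. A minimal self-contained sketch, assuming the data looks like the example in the question (the data frame below is rebuilt by hand, and "null" is assumed to be a literal string), passes fill = "right" to pad short rows with NA silently:

```r
library(tidyr)

# Hand-built copy of the question's example data (an assumption about its structure)
df <- data.frame(
  client_id = 15962:15967,
  question = c(
    "eng_child_pregnancy_standard_focused", "null",
    "xho_child_developed_sleep", "eng_mother_spacing_other",
    "null", "null"
  ),
  stringsAsFactors = FALSE
)

cols <- paste0("question_", 1:5)

# fill = "right" pads rows that have fewer than five pieces with NA on the right,
# so every original row is kept and no "missing pieces" warning is raised
out <- separate(df, question, into = cols, sep = "_", fill = "right")

print(out)
```

All six rows are kept; the "null" rows end up with "null" in question_1 and NA in the remaining columns.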