How to split a column by multiple delimiters into two separate columns

Time:11-09

Here's a sample of my data:

k <- structure(list(Required.field = c("yes", "yes", "yes"),
                    Choices = c("2, Féminin | 1, Masculin", "1, Oui | 0, Non | 99, Je ne sais pas", "1, Oui | 0, Non")),
               row.names = c(5L, 10L, 15L), class = "data.frame") 
> k
   Required.field                              Choices
5             yes             2, Féminin | 1, Masculin
10            yes 1, Oui | 0, Non | 99, Je ne sais pas
15            yes                      1, Oui | 0, Non

What I'd like to have is something like this:

> result

   Required.field            Number       Value
5             yes            c(2,1)       c(Féminin, Masculin)
10            yes            c(1,0,99)    c(Oui, Non, Je ne sais pas)
15            yes            c(1,0)       c(Oui, Non)

Here's the code I wrote, which doesn't do the job correctly:

k$test = strsplit(k$choice,c(" | "), fixed = T)


bbl = k %>% 
  mutate(number = str_extract_all(test, "[0-9]+")) %>% #get only digits
  mutate(value  = str_extract(test, "[aA-zZ].*")) #get only letters 

Why exactly is it not working?

CodePudding user response:

Here's a solution with tidyr and dplyr functions:

library(tidyr)
library(dplyr)

k %>% 
  mutate(id = 1:n()) %>% 
  separate_rows(Choices, sep = " \\| ") %>% 
  separate(Choices, into = c("Number", "Value"), sep = ", ", convert = TRUE) %>% 
  group_by(id) %>% 
  summarise(Required.field = unique(Required.field),
            across(c(Number, Value), list)) 

output

  id Required.field   Number                    Value
1  1            yes     2, 1       Féminin, Masculin
2  2            yes 1, 0, 99 Oui, Non, Je ne sais pas
3  3            yes     1, 0                 Oui, Non
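
For comparison, the same split-then-separate idea can be sketched in base R, starting from the strsplit() call in the question. This is only a minimal sketch: it assumes every choice has the form "number, label" and that choices are separated by " | ". The name parts is illustrative and does not come from the original posts.

```r
# Sample data from the question
k <- data.frame(
  Required.field = c("yes", "yes", "yes"),
  Choices = c("2, Féminin | 1, Masculin",
              "1, Oui | 0, Non | 99, Je ne sais pas",
              "1, Oui | 0, Non"),
  row.names = c(5L, 10L, 15L),
  stringsAsFactors = FALSE
)

# Split each string on the literal " | "; fixed = TRUE turns off regex
# handling, so "|" is not read as alternation
parts <- strsplit(k$Choices, " | ", fixed = TRUE)

# Assumption: each piece looks like "number, label".
# Number: drop everything from the first comma onwards, then convert.
k$Number <- lapply(parts, function(p) as.integer(sub(",.*$", "", p)))
# Value: drop everything up to and including the first comma and spaces.
k$Value  <- lapply(parts, function(p) sub("^[^,]*, *", "", p))

k$Number[[2]]  # 1 0 99
k$Value[[2]]   # "Oui" "Non" "Je ne sais pas"
```

A few things in the question's attempt look like probable causes of the trouble: k$choice does not match the column name Choices, so it returns NULL; str_extract() returns only the first match per element; and the class [aA-zZ] is not a letters-only class, since the range A-z also covers some punctuation and leaves out accented letters such as é.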

CodePudding user response:

We may use

library(dplyr)
library(stringr)
k %>% 
   mutate(Number = str_extract_all(Choices, "\\d+"),
   Value = str_extract_all(Choices, "[^0-9,| ]+") )

output

   Required.field                              Choices   Number                       Value
5             yes            2, Féminin | 1, Masculin     2, 1          Féminin, Masculin
10            yes 1, Oui | 0, Non | 99, Je ne sais pas 1, 0, 99 Oui, Non, Je, ne, sais, pas
15            yes                      1, Oui | 0, Non     1, 0                    Oui, Non
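
Note that in the output above the label "Je ne sais pas" is broken into four pieces, because the Value pattern excludes spaces. One possible adjustment, shown here as a hedged sketch rather than the answerer's own method, is to take everything that follows ", " up to the next "|" and then trim the whitespace. The lookbehind pattern and the name res are assumptions introduced for illustration, and the sketch assumes labels themselves never contain ", " or "|".

```r
library(dplyr)
library(stringr)

# Sample data from the question
k <- data.frame(
  Required.field = c("yes", "yes", "yes"),
  Choices = c("2, Féminin | 1, Masculin",
              "1, Oui | 0, Non | 99, Je ne sais pas",
              "1, Oui | 0, Non"),
  row.names = c(5L, 10L, 15L),
  stringsAsFactors = FALSE
)

res <- k %>%
  mutate(
    # all runs of digits, converted to integer vectors
    Number = lapply(str_extract_all(Choices, "\\d+"), as.integer),
    # everything after ", " up to the next "|", with the trailing
    # space before "|" trimmed off (assumed label format)
    Value = lapply(str_extract_all(Choices, "(?<=, )[^|]+"), str_trim)
  )

res$Value[[2]]  # "Oui" "Non" "Je ne sais pas"
```

Both new columns are list columns, so each row keeps one integer vector and one character vector of matching length.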