strsplit char into columns with column names

Time:06-22

I have a character vector such as:

char = c("3 habs.", "2 baños", "102 m²", "4ª Planta")

I would like to split this into columns with the expected output:

habs baños  m² Planta
   3     2 102     4ª

I have the following:

char %>% 
  strsplit(" ") %>% 
  bind_cols()

Which gives:

New names:
• `` -> `...1`
• `` -> `...2`
• `` -> `...3`
• `` -> `...4`
# A tibble: 2 × 4
  ...1  ...2  ...3  ...4  
  <chr> <chr> <chr> <chr> 
1 3     2     102   4ª    
2 habs. baños m²    Planta

This is not quite what I want: I would like row 2 to become the column names. I would rather not use janitor::row_to_names() for this; ideally I would name the list elements first and then call bind_cols()/bind_rows().

CodePudding user response:

In Base R:

as.data.frame(t(sapply(strsplit(char, ' '), \(x)setNames(x[1], x[2]))))
  habs. baños  m² Planta
1     3     2 102     4ª

or even:

read.dcf(textConnection(sub("(\\S+) (\\S+)", '\\2:\\1', char)), all = TRUE)

  habs. baños  m² Planta
1     3     2 102     4ª

in tidyverse:

str_split(char, ' ') %>% transpose() %>% invoke(set_names, .) %>% as_tibble()

# A tibble: 1 x 4
  habs. baños m²    Planta
  <chr> <chr> <chr> <chr> 
1 3     2     102   4ª 
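
For the "name the list first, then bind_cols()" route mentioned in the question, something along these lines should work. This is only a sketch and assumes dplyr is installed; `parts` and `values` are illustrative names, not anything from the original post.

```r
library(dplyr)

char <- c("3 habs.", "2 baños", "102 m²", "4ª Planta")

# Split each element into c(value, label)
parts <- strsplit(char, " ")

# Keep the first piece of each split as the value ...
values <- lapply(parts, `[`, 1)

# ... and use the second piece as the name of that list element
names(values) <- sapply(parts, `[`, 2)

# bind_cols() on a named list of length-1 vectors gives a one-row tibble
res <- bind_cols(values)
res
```

Because the list is named before binding, bind_cols() has no empty names to repair, so the "New names" message from the question should not appear.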

CodePudding user response:

Following your strsplit approach further.

strsplit(char, ' ') |>
  {\(.) as.data.frame(setNames(lapply(., `[`, 1), lapply(., `[`, 2)))}()
#   habs. baños  m. Planta
# 1     3     2 102     4ª
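
The `m.` column name in that output appears to come from the name checking that as.data.frame() applies to list input by default. If the original `m²` label should be kept, passing `check.names = FALSE` is one option. A minimal sketch, with `parts`, `vals` and `df` as illustrative names:

```r
char <- c("3 habs.", "2 baños", "102 m²", "4ª Planta")

# Split each element into c(value, label)
parts <- strsplit(char, " ")

# Named list: first piece is the value, second piece is the name
vals <- setNames(lapply(parts, `[`, 1), sapply(parts, `[`, 2))

# check.names = FALSE stops make.names() from rewriting the labels
df <- as.data.frame(vals, check.names = FALSE)
df
```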