Home > Back-end >  How to split a column into three columns based on separator and/or position in r
How to split a column into three columns based on separator and/or position in r

Time:06-07

I have this fake data set with a column named location.

data.1 <-read.csv(text = "
location
01-Q2-Locate of este (TT)
02-Q2-Green River (OG)
01-Q1-Agabe (PS)
")

I need to partition the location column into three columns named code, name, and region.

This is my intended outcome:

enter image description here

This is the code I am using:

library(dplyr)
library(tidyr)
data.1.new <- data.1 %>%
  tidyr::separate(location, c("code", "name", "region"), extra = "merge", remove = FALSE)

The outcome I am getting is this:

enter image description here

I couldn't find a way to separate to solve this.

CodePudding user response:

data.1 %>%
    extract(location, c('code', 'name', 'region'), '(.*)-([^(] ) \\((.*)\\)')

   code           name region
1 01-Q2 Locate of este     TT
2 02-Q2    Green River     OG
3 01-Q1          Agabe     PS

CodePudding user response:

Another possible solution:

library(tidyverse)

data.1 %>% 
  mutate(code = str_extract(location, "^\\d -\\D\\d "),
         name = str_extract(location, "(?<=-)\\D (?=\\s\\()"),
         zegion = str_extract(location, "(?<=\\().*(?=\\))"))

#>                    location  code           name zegion
#> 1 01-Q2-Locate of este (TT) 01-Q2 Locate of este     TT
#> 2    02-Q2-Green River (OG) 02-Q2    Green River     OG
#> 3          01-Q1-Agabe (PS) 01-Q1          Agabe     PS
  • Related