Split a column into 3 columns based on the second and last underscore in R

Time:08-29

How can we split a column into 3 columns based on the second and last underscore?

library(tidyverse)

tbl = tibble(x = c("alpha_beta_gamma_delta", "a_b_c_d_e_h_2022", "hello123_stack_overflow_users"))

The desired result is:

desired_result = tibble(
x1 = c("alpha_beta", "a_b", "hello123_stack"),
x2 = c("gamma", "c_d_e_h", "overflow"),
x3 = c("delta", "2022", "users")
)

CodePudding user response:

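The tidyr extract() function splits a character column using a regular expression with capture groups: each group becomes one output column, so the regex controls where the string is split and the into argument names the resulting columns.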

Using your example, one can do:

library(tidyr)

extract(tbl, col = x, regex = "(.+?_.+?)_(.+)_(.+)", into = paste0("x", 1:3))

  x1             x2       x3   
  <chr>          <chr>    <chr>
1 alpha_beta     gamma    delta
2 a_b            c_d_e_h  2022 
3 hello123_stack overflow users
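
For anyone who wants to see what the pattern is doing without tidyr, here is a minimal base R sketch of the same idea. It assumes every string contains at least three underscores; the names pattern, hits, parts and result are only illustrative, and the ^ and $ anchors are an addition that is not in the call above. The first group uses lazy quantifiers (.+?) so it stops at the second underscore, while the greedy second group (.+) runs as far as it can and gives back only the text after the last underscore to the third group.

```r
# Same sample data as in the question
x <- c("alpha_beta_gamma_delta", "a_b_c_d_e_h_2022", "hello123_stack_overflow_users")

# Group 1: lazy, ends at the second underscore
# Group 2: greedy, ends at the last underscore
# Group 3: whatever follows the last underscore
pattern <- "^(.+?_.+?)_(.+)_(.+)$"

# regexec() returns the full match plus one entry per capture group;
# perl = TRUE is used so the lazy quantifiers follow PCRE rules
hits <- regmatches(x, regexec(pattern, x, perl = TRUE))

# Drop the full match (first element) and stack the groups row by row
parts <- do.call(rbind, lapply(hits, function(hit) hit[-1]))
colnames(parts) <- paste0("x", 1:3)

result <- as.data.frame(parts, stringsAsFactors = FALSE)
print(result)
```

Strings with fewer than three underscores would not match and would need separate handling in this sketch. In the tidyr version, extract() fills such rows with NA.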