Splitting Raw data column till n delimiters from last

Time:11-26

Hi, I am trying to split a column.
Below is my df:

value = c("AB/cc/dd/id,1,3,33","CC/DD/EE/F,F/GG,22,33,4","AB/cc,22,2,34","KK/SS/G,G,3,22,41")
df = data.frame(value)

I am trying to split the column and keep the string up to the 3rd comma (,) from the end,
i.e. my output df should look like this:

value1 = c("AB/cc/dd/id","CC/DD/EE/F,F/GG","AB/cc","KK/SS/G,G")
df_out = data.frame(value1)

I tried the stringr package, but this only splits at the first comma:

library(stringr)

df[c('col1', 'col2')] <- str_split_fixed(df$value, ',', 2)

Thanks in advance

CodePudding user response:

Here is another way to get the string up to the 3rd comma from the end, without writing a regex:

library(stringr)
library(purrr)

df$value |>
  str_split(",") |>
  # keep everything except the last 3 pieces, then rejoin them with commas
  map(function(x) head(x, -3) |> str_c(collapse = ",")) |>
  map_df(as.data.frame) |>
  setNames("value1")

#           value1
#1     AB/cc/dd/id
#2 CC/DD/EE/F,F/GG
#3           AB/cc
#4       KK/SS/G,G
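
For comparison, the same split, drop, and rejoin idea can be sketched in base R with no extra packages. This is only an illustration under the assumption that every string has at least three comma-separated trailing fields; the names parts and value1 are just for this example.

```r
value <- c("AB/cc/dd/id,1,3,33", "CC/DD/EE/F,F/GG,22,33,4",
           "AB/cc,22,2,34", "KK/SS/G,G,3,22,41")

# Split each string on literal commas.
parts <- strsplit(value, ",", fixed = TRUE)

# Drop the last 3 pieces of each split and glue the rest back together.
value1 <- vapply(parts,
                 function(x) paste(head(x, -3), collapse = ","),
                 character(1))

df_out <- data.frame(value1)
df_out
```

head(x, -3) returns an empty vector when a string has three or fewer pieces, so such rows become "" rather than raising an error.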

CodePudding user response:

Using gsub:

gsub("[^[:alpha:],/]", "", value) |> gsub(pattern = ",+$", replacement = "")
[1] "AB/cc/dd/id"     "CC/DD/EE/F,F/GG" "AB/cc"           "KK/SS/G,G"

Regex explanation:

"[^[:alpha:],/]"

  • []: defines a character list
  • ^: negates that list, gsub will look to match anything that isn't in the list
  • [:alpha:],/: the list contents: letters, commas, and "/", respectively

",+$"

  • ,: matches a comma
  • +: that appears one or more times
  • $: only at the end of the string
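
To see what each of the two patterns does, here is a small sketch that runs them one after the other on the question's data and keeps the intermediate result. The variable names step1 and step2 and the extra test string are made up for illustration.

```r
value <- c("AB/cc/dd/id,1,3,33", "CC/DD/EE/F,F/GG,22,33,4",
           "AB/cc,22,2,34", "KK/SS/G,G,3,22,41")

# Step 1: delete every character that is not a letter, a comma, or "/".
# The numeric fields vanish, but the commas that separated them remain.
step1 <- gsub("[^[:alpha:],/]", "", value)
step1
# [1] "AB/cc/dd/id,,,"     "CC/DD/EE/F,F/GG,,," "AB/cc,,,"           "KK/SS/G,G,,,"

# Step 2: strip the run of commas left over at the end of each string.
step2 <- gsub(",+$", "", step1)
step2
# [1] "AB/cc/dd/id"     "CC/DD/EE/F,F/GG" "AB/cc"           "KK/SS/G,G"

# Caveat: digits are removed everywhere, not only in the last 3 fields.
gsub(",+$", "", gsub("[^[:alpha:],/]", "", "AB/id7,1,3,33"))
# [1] "AB/id"
```

So this approach fits data like the sample, where the part to keep contains no digits and the trailing fields contain nothing but digits.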

CodePudding user response:

You can try gsub in base R, like below:

> gsub("(,[^,]+){3}$", "", value)
[1] "AB/cc/dd/id"     "CC/DD/EE/F,F/GG" "AB/cc"           "KK/SS/G,G"
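
If the number of trailing fields to drop may vary, the same idea can be wrapped in a small function. This is a sketch only: drop_last_fields is a hypothetical helper name, and it uses [^,]* rather than [^,]+ on the assumption that empty trailing fields should count as fields too.

```r
# Hypothetical helper: remove the last `n` comma-separated fields from each string.
drop_last_fields <- function(x, n = 3) {
  # Build e.g. "(,[^,]*){3}$": n groups of "a comma plus non-comma characters",
  # anchored at the end of the string.
  pattern <- sprintf("(,[^,]*){%d}$", as.integer(n))
  sub(pattern, "", x)
}

value <- c("AB/cc/dd/id,1,3,33", "CC/DD/EE/F,F/GG,22,33,4",
           "AB/cc,22,2,34", "KK/SS/G,G,3,22,41")

drop_last_fields(value)
# [1] "AB/cc/dd/id"     "CC/DD/EE/F,F/GG" "AB/cc"           "KK/SS/G,G"

drop_last_fields("a,b,c,d", n = 1)
# [1] "a,b,c"
```

A string with fewer than n commas does not match the pattern and is returned unchanged.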

CodePudding user response:

In case the fields between the last 3 commas are not only numbers but any alphanumeric text (including /), you could use this:

a <- "AB/cc/dd/id,1,/gg/,33"

stringr::str_extract(a, ".*(?=(\\,[/A-Za-z0-9]+){3})")
#> [1] "AB/cc/dd/id"

or another base R solution:

# [^,]* rather than .* so that each group stops at the next comma
gsub("(,[^,]*){3}$", "", a)