Hi, I am trying to split a column.
Below is my df:
value = c("AB/cc/dd/id,1,3,33","CC/DD/EE/F,F/GG,22,33,4","AB/cc,22,2,34","KK/SS/G,G,3,22,41")
df = data.frame(value)
I am trying to split the column and keep the string up to the 3rd comma (,) from the end,
i.e. my output df should look like below:
value1 = c("AB/cc/dd/id","CC/DD/EE/F,F/GG","AB/cc","KK/SS/G,G")
df_out = data.frame(value1)
I tried the stringr package, but this splits at the first comma instead:
library(stringr)
df[c('col1', 'col2')] <- str_split_fixed(df$value, ',', 2)
Thanks in advance
CodePudding user response:
Here is another way to get the strings until the 3rd comma from the last without regex:
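library(stringr)
library(purrr)

df$value |>
  str_split(",") |>
  # drop the last three comma-separated pieces, then rejoin the rest
  map(function(x) x[1:(length(x) - 3)] |>
        str_c(collapse = ",")) |>
  map_df(as.data.frame) |>
  setNames("value1")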
# value1
#1 AB/cc/dd/id
#2 CC/DD/EE/F,F/GG
#3 AB/cc
#4 KK/SS/G,G
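The same split, drop, and rejoin idea can also be sketched in base R without loading any packages. This is only a minimal sketch: `drop_last_fields()` is a hypothetical helper name, and it assumes every string has at least three commas, as in the sample data.

```r
# Sample data from the question
value <- c("AB/cc/dd/id,1,3,33", "CC/DD/EE/F,F/GG,22,33,4",
           "AB/cc,22,2,34", "KK/SS/G,G,3,22,41")

# drop_last_fields() is a hypothetical helper, not part of any package
drop_last_fields <- function(x, n = 3) {
  # split on literal commas (fixed = TRUE means no regex is involved)
  pieces <- strsplit(x, ",", fixed = TRUE)
  # head(p, -n) keeps everything except the last n pieces
  vapply(pieces, function(p) paste(head(p, -n), collapse = ","), character(1))
}

df_out <- data.frame(value1 = drop_last_fields(value))
df_out
```

Strings with fewer than three commas come back as empty strings here, so they would need separate handling if they can occur.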
CodePudding user response:
Using gsub:
gsub("[^[:alpha:],/]", "", value) |> gsub(pattern = ",+$", replacement = "")
[1] "AB/cc/dd/id" "CC/DD/EE/F,F/GG" "AB/cc" "KK/SS/G,G"
Regex explanation:
"[^[:alpha:],/]"
[]: defines a character list
^: negates that list, so gsub matches anything that isn't in the list
[:alpha:],/: the list contents: letters, commas, and "/", respectively
",+$"
,+: matches one or more commas
$: only at the end of the string
CodePudding user response:
You can try gsub within base R like below
> gsub("(,[^,]+){3}$", "", value)
[1] "AB/cc/dd/id" "CC/DD/EE/F,F/GG" "AB/cc" "KK/SS/G,G"
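The pattern reads as three repetitions of "a comma followed by one or more non-comma characters", anchored at the end of the string. A minimal sketch, assuming the sample `value` vector from the question, that writes the result back into the data frame as a new `value1` column:

```r
# Sample data from the question
value <- c("AB/cc/dd/id,1,3,33", "CC/DD/EE/F,F/GG,22,33,4",
           "AB/cc,22,2,34", "KK/SS/G,G,3,22,41")
df <- data.frame(value)

# (,[^,]+) : one comma followed by one or more characters that are not commas
# {3}$     : exactly three such groups, ending at the end of the string
df$value1 <- gsub("(,[^,]+){3}$", "", df$value)
df["value1"]
```

Because the pattern is anchored with `$`, it can match at most once per string, so `sub()` would give the same result here.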
CodePudding user response:
In case the last 3 comma-separated fields are not only numbers but may contain other alphanumeric characters (including /), you could use this:
a <- "AB/cc/dd/id,1,/gg/,33"
stringr::str_extract(a, ".*(?=(\\,[/A-Za-z0-9]+){3})")
#> [1] "AB/cc/dd/id"
or another base R solution:
# [^,]* keeps each group from running past the next comma
gsub("(,[^,]*){3}$", "", a)
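One thing to watch with this kind of pattern: `.` also matches commas, so a group written as `,.*` can swallow earlier fields when the string contains more than three commas. A small sketch comparing the two forms on one of the question's strings (the variable names are only for illustration):

```r
# One of the sample strings from the question; it contains five commas
b <- "CC/DD/EE/F,F/GG,22,33,4"

# ".*" may span commas, so the match can start at the first comma
loose <- gsub("(,.*){3}$", "", b)

# "[^,]*" stops at each comma, so only the last three fields are removed
strict <- gsub("(,[^,]*){3}$", "", b)

loose   # "CC/DD/EE/F"
strict  # "CC/DD/EE/F,F/GG"
```

For a string with exactly three commas, such as `a` above, both patterns give the same result.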