R commas and count issue in string

Time:06-10

I hope someone can help; thanks in advance to anyone who does. I have a data frame that looks as follows:

V1                     
A4 B1 C2    E3   T5
B3     R4     W2 E3

The words are separated by spaces, which is not nice to look at, so I am trying to do two things. 1) Separate them with commas. The number of spaces between words is not always the same, and gsub does not seem to solve this. If I do

x$V1=gsub(" ",",",x$V1)

I get:

V1                     
A4,B1,C2,,,,E3,,,,T5
B3,,,R4,,,,W2,E3

Then problem number 2) I want a V2 column with the number of values, but using

x$V2 = length(strsplit(x$V1, ","))

is not helping :( My desired output is:

V1              V2                 
A4,B1,C2,E3,T5  5
B3,R4,W2,E3     4

CodePudding user response:

A possible solution in base R:

df$V1 = gsub("\\s+", ",", df$V1)

df$V2 <- 1 + lengths(regmatches(df$V1, gregexpr(",", df$V1)))
df

#>               V1 V2
#> 1 A4,B1,C2,E3,T5  5
#> 2    B3,R4,W2,E3  4
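
To see why the two attempts in the question fall short, here is a minimal sketch. It assumes the data frame is called df and rebuilds it by hand from the values shown in the question; the names single, runs, parts, n_rows and n_values are only illustrative. It also counts values with lengths(strsplit(...)), which is an alternative to counting commas, not the method used above.

```r
# Hypothetical reconstruction of the data frame from the question
df <- data.frame(V1 = c("A4 B1 C2    E3   T5", "B3     R4     W2 E3"))

# " " matches one space at a time, so every extra space becomes an extra comma;
# "\\s+" matches a whole run of whitespace and replaces it with a single comma
single <- gsub(" ", ",", df$V1)
runs <- gsub("\\s+", ",", df$V1)

# strsplit() returns a list with one element per row:
# length() counts the rows, lengths() counts the pieces within each row
parts <- strsplit(runs, ",", fixed = TRUE)
n_rows <- length(parts)
n_values <- lengths(parts)

df$V1 <- runs
df$V2 <- n_values
print(df)
```

Here length(parts) is the number of rows, so assigning it to V2 puts the same number in every row, while lengths(parts) gives one count per row.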

Another possible solution, based on stringr:

library(tidyverse)

df %>% 
  mutate(V1 = str_replace_all(V1, "\\s+", ","), V2 = str_count(V1, ",") + 1)

#>               V1 V2
#> 1 A4,B1,C2,E3,T5  5
#> 2    B3,R4,W2,E3  4
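
One edge case worth checking: both solutions count commas, so a row with leading or trailing whitespace would get a stray comma and an inflated count. The sketch below assumes such padding could occur (the question's data does not show any); the vector padded and the other names are made up for illustration.

```r
# Hypothetical input with padding around the values (not from the question)
padded <- c("  A4 B1   C2 ", "B3  R4")

# Without trimming, the leading and trailing runs of spaces also become commas
untrimmed <- gsub("\\s+", ",", padded)

# trimws() strips the padding first, so only separators between values remain
cleaned <- gsub("\\s+", ",", trimws(padded))

# Count the values per row by splitting on the commas
counts <- lengths(strsplit(cleaned, ",", fixed = TRUE))
```

If the real data can contain padded rows, wrapping the column in trimws() before the replacement keeps either solution's count correct.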