R split string by special character " " and categorize the split strings based on condition

Time:04-09

Hello, I need to split the strings in mfn_rate at the special character surrounded by white space (" ") whenever it appears, then put the numeric value before it in mfn_av and the rest of the string after it in mfn_spec.

I am trying something like the below:

mfn_rate<-c("25%","25%   2 GBP/tonne","2 GBP per tonne","10%")
mfn_av<-""
mfn_spec<-""
mfn<-data.frame(mfn_rate,mfn_av,mfn_spec)

for (i in 1:nrow(mfn)){
    #if the value in mfn_rate is not a single percentage value, classify it into either mfn_av or mfn_spec
    if(grepl(" ",mfn$mfn_rate[i])){
       mfn$mfn_spec[i] <- str_split(mfn$mfn_rate[i],"\\ ", "print string after  ")
       mfn$mfn_av[i] <- str_split(mfn$mfn_rate[i],"\\ ", "print numbers before   sign")
    }
    else ( mfn$mfn_rate[i] <- mfn$mfn_av[i])
}

The output should be e.g.:

mfn_rate = 25%   2 GBP/tonne
mfn_av = 25%
mfn_spec = 2 GBP/tonne

Answer:

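Use separate() (tidyr) and str_detect() (stringr) from the tidyverse: split each string at the separator first, then move any first piece that contains letters (a rate with no percentage part) into mfn_spec.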

library(tidyverse)
mfn_rate<-c("25","25%   2 GBP/tonne","2 GBP per tonne","10")
mfn <- data.frame(mfn_rate)

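mfn <- mfn %>% 
  # split at the separator; fill = "right" puts NA in mfn_spec when no separator is found
  separate(mfn_rate, c("mfn_av", "mfn_spec"), sep = " \\  ", remove = FALSE, fill = "right") %>% 
  # a first piece containing letters is a specific rate, so move it to mfn_spec
  mutate(mfn_spec = if_else(str_detect(mfn_av, "[:alpha:]"), mfn_av, mfn_spec),
         mfn_av = if_else(str_detect(mfn_av, "[:alpha:]"), NA_character_, mfn_av))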

Output

> mfn
           mfn_rate mfn_av        mfn_spec
1                25     25            <NA>
2 25%   2 GBP/tonne    25%     2 GBP/tonne
3   2 GBP per tonne   <NA> 2 GBP per tonne
4                10     10            <NA>
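
For anyone who would rather avoid the tidyverse dependency, the same split-then-reclassify logic can be sketched in base R. This is only a rough sketch: split_mfn is a hypothetical helper name, and the separator is assumed to be the three spaces that appear in the sample data, so adjust sep to whatever character your real data contains.

```r
# Hypothetical helper (not from the answer above): split each rate at a
# separator, then move entries whose first piece contains letters into
# the specific-rate column.
split_mfn <- function(mfn_rate, sep = "   ") {
  # ASSUMPTION: sep is three spaces, as the separator appears in the sample data
  parts <- strsplit(mfn_rate, sep, fixed = TRUE)

  # first piece of each string (the whole string when no separator is present)
  first <- vapply(parts, function(p) p[1], character(1))

  # everything after the first separator, or NA when there is nothing after it
  rest <- vapply(
    parts,
    function(p) if (length(p) > 1) paste(p[-1], collapse = sep) else NA_character_,
    character(1)
  )

  # a first piece containing letters is a specific rate, not a percentage
  has_alpha <- grepl("[[:alpha:]]", first)

  data.frame(
    mfn_rate = mfn_rate,
    mfn_av   = ifelse(has_alpha, NA_character_, first),
    mfn_spec = ifelse(has_alpha, first, rest),
    stringsAsFactors = FALSE
  )
}

mfn <- split_mfn(c("25%", "25%   2 GBP/tonne", "2 GBP per tonne", "10%"))
print(mfn)
```

This version is vectorised, so the for loop from the question is not needed. Rows without a separator keep their value in mfn_av unless that value contains letters, in which case it moves to mfn_spec.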