How do I remove any word that is not followed by a number?
For example, I am working with a character vector:
string <- c("sb 221 reeb; ab 1355",
"sb 140; sb 14 c ab 1089",
"sb 1518; sb 1067 l ab 1770",
"ab 60 na; ab 1492",
"ab 442 aramb; ab 724; ab 919",
"sb 511 ab 416 state ab 1532")
df <- data.frame(string)
I would like the string to be:
output <- c("sb 221; ab 1355",
"sb 140; sb 14 ab 1089",
"sb 1518; sb 1067 ab 1770",
"ab 60; ab 1492",
"ab 442; ab 724; ab 919",
"sb 511 ab 416 ab 1532")
output_df <- data.frame(output)
Thank you.
CodePudding user response:
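# Drop a lowercase word (and its leading space) when it is followed by " <non-digit>" or ";"; the ";" is kept via \\1
gsub(" ?[a-z]+((?= \\D)|;)", "\\1", string, perl = TRUE)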
[1] "sb 221; ab 1355" "sb 140; sb 14 ab 1089"
[3] "sb 1518; sb 1067 ab 1770" "ab 60; ab 1492"
[5] "ab 442; ab 724; ab 919" "sb 511 ab 416 ab 1532"
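
If you want the result as a column in the data frame and a quick check against the expected output, something along these lines should work. This is only a sketch that repeats the data from the question; `pattern`, `clean` and `expected` are names made up here for illustration, and it assumes the words to drop are lowercase letters only.

```r
string <- c("sb 221 reeb; ab 1355",
            "sb 140; sb 14 c ab 1089",
            "sb 1518; sb 1067 l ab 1770",
            "ab 60 na; ab 1492",
            "ab 442 aramb; ab 724; ab 919",
            "sb 511 ab 416 state ab 1532")
df <- data.frame(string)

# Optional space, then a run of lowercase letters, followed either by
# " <non-digit>" (lookahead, consumes nothing) or by ";" (captured as \\1).
pattern <- " ?[a-z]+((?= \\D)|;)"

# Replacing the match with \\1 deletes the word but puts a matched ";" back.
df$clean <- gsub(pattern, "\\1", df$string, perl = TRUE)

expected <- c("sb 221; ab 1355",
              "sb 140; sb 14 ab 1089",
              "sb 1518; sb 1067 ab 1770",
              "ab 60; ab 1492",
              "ab 442; ab 724; ab 919",
              "sb 511 ab 416 ab 1532")

# Stops with an error if any element differs from the desired output.
stopifnot(identical(df$clean, expected))
```

One caveat: a stray word at the very end of a string (nothing after it, e.g. "ab 60 na") is not matched by this pattern, so it would need an extra alternative such as `$` if that case can occur in your data.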