I am working on a dataset that has a text variable in it, and I am not good at cleaning text. I tried my best, but it is hard to find the answer. Let's take this text as an example:
"I want. to remove all ... from the text except 5.3 or .5"
I want the output to be:
"I want to remove from the text except 5.3 or .5"
Could someone help me with that?
CodePudding user response:
You could try:
library(stringr)
str_remove_all("I want to remove all ... from the text except 5.3.", "((?<!\\d)\\.(?!\\d)|\\.$)")
#> [1] "I want to remove all  from the text except 5.3"
# Note the double space left where "..." was; str_squish() would collapse it.
There are two parts in an "or" bracket (...|...). The first, (?<!\\d)\\.(?!\\d), says "remove periods that have no digit immediately before and no digit immediately after". The second, \\.$, makes sure it removes the last one (which doesn't get picked up by the first part, because it directly follows a digit).
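As a quick check, here is a minimal sketch that applies the same pattern to the full string from the question. The variable names (x, no_dots, cleaned) are just illustrative, and the str_squish() step is an addition, assuming you also want the leftover double space collapsed:

```r
library(stringr)

x <- "I want. to remove all ... from the text except 5.3 or .5"

# Drop periods with no digit on either side, plus any period at the very end.
no_dots <- str_remove_all(x, "(?<!\\d)\\.(?!\\d)|\\.$")

# Removing "..." leaves two spaces in a row; str_squish() collapses them.
cleaned <- str_squish(no_dots)
cleaned
#> [1] "I want to remove all from the text except 5.3 or .5"
```

The ".5" survives because its period is followed by a digit, so the negative lookahead blocks the match.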
CodePudding user response:
You can try gsub, like below:
> gsub("(?<=\\D)\\.(?=\\D)", "", "I want. to remove all ... from the text except 5.3 or .5", perl = TRUE)
[1] "I want to remove all  from the text except 5.3 or .5"
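If you need this on a whole column, one possible way to wrap the base R approach is sketched below. The helper name remove_stray_periods is hypothetical, and two things differ from the call above by assumption: it uses negative lookarounds, so a period at the very start or end of the string is also matched, and it collapses the extra whitespace left behind.

```r
# Hypothetical helper (the name is illustrative): remove periods that are not
# part of a number, then tidy up the leftover whitespace.
remove_stray_periods <- function(x) {
  # Negative lookarounds: match a period with no digit before and none after.
  # Unlike (?<=\\D) and (?=\\D), these also match at the start or end of a string.
  out <- gsub("(?<!\\d)\\.(?!\\d)", "", x, perl = TRUE)
  # Collapse runs of whitespace into one space and trim both ends.
  trimws(gsub("\\s+", " ", out))
}

result <- remove_stray_periods(c(
  "I want. to remove all ... from the text except 5.3 or .5",
  "Version 2.0 is out."
))
result
#> [1] "I want to remove all from the text except 5.3 or .5"
#> [2] "Version 2.0 is out"
```

A period that directly follows a number, as in "5.3.", is still kept by this sketch; adding the \\.$ alternative from the stringr answer would cover that case.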