Say I have the following string:
vector <- "this is a string of text containing stuff. something.com [email protected] and other stuff with something.anything"
I would like to remove any word that contains `@` or `.`, so I would like to remove `something.com`, `[email protected]` and `something.anything`. I do not want to remove `stuff`, because it is the end of a sentence and its `.` is not inside the word. Ideally I would like to be able to use the `%>%` pipe to do this.
CodePudding user response:
gsub(" ?\\w+[.@]\\S+", "", vector)
[1] "this is a string of text containing stuff. and other stuff with"
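Since the question asks for a `%>%` version, here is a minimal sketch of the same `gsub` call in a pipe. It assumes the magrittr package is installed, and because the address in the question is obscured, `someone@example.com` is a made-up stand-in.

```r
library(magrittr)  # provides %>%; assumed to be installed

# Hypothetical input: the email address is a placeholder, not the asker's data.
vector <- "this is a string of text containing stuff. something.com someone@example.com and other stuff with something.anything"

# gsub() takes the text as its third argument, so the "." placeholder
# tells the pipe where to put the left-hand side.
cleaned <- vector %>% gsub(" ?\\w+[.@]\\S+", "", .)

cleaned
# [1] "this is a string of text containing stuff. and other stuff with"
```

The pattern needs at least one non-space character after the `.` or `@`, which is why the sentence-ending `stuff.` is left alone.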
CodePudding user response:
An alternative to the (much more terse/simple) `gsub` method:
gre <- gregexpr("[^ ]+[.@][^ ]+", vector)
regmatches(vector, gre)
# [[1]]
# [1] "something.com" "[email protected]" "something.anything"
regmatches(vector, gre) <- ""
vector
# [1] "this is a string of text containing stuff. and other stuff with "
This has the advantage of being able to replace the matches arbitrarily. Granted, we're just replacing them here with `""`, so this is a little overkill, but if you need to change the values somehow (transform each substring individually), then this is a more powerful mechanism.
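As a rough illustration of changing each substring rather than deleting it, the sketch below upper-cases every match in place. The choice of `toupper` is arbitrary, and `someone@example.com` is a made-up stand-in for the obscured address in the question.

```r
# Hypothetical input: the email address is a placeholder, not the asker's data.
vector <- "this is a string of text containing stuff. something.com someone@example.com and other stuff with something.anything"

# Locate every space-delimited token that has "." or "@" inside it.
gre <- gregexpr("[^ ]+[.@][^ ]+", vector)

# regmatches() returns a list with one character vector per input string;
# transform each vector and assign the result back over the matches.
regmatches(vector, gre) <- lapply(regmatches(vector, gre), toupper)

vector
# [1] "this is a string of text containing stuff. SOMETHING.COM SOMEONE@EXAMPLE.COM and other stuff with SOMETHING.ANYTHING"
```

Any function that maps a character vector to one of the same length can take the place of `toupper`, for example one that masks the matches.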