How to remove all characters before last whitespace in R string but with exceptions for certain char

Time:01-18

I've been using the following to remove all characters before the last whitespace in R character strings: gsub(".*\\s", "", "Big Dog") returns "Dog" which is perfect.

How could I exclude certain patterns from being removed? For example, say I always want to preserve "Big Dog": if I have the string "Look at that crazy Big Dog", running the gsub() (or other code) should return "Big Dog", with the whitespace between Big and Dog retained. In the complete code this is intended for, the equivalent of "Big Dog" isn't dynamic, so hard-coding "Big Dog" is fine. "Big Dog" will also always occupy the last position in the character string.
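To make the problem concrete, here is a minimal sketch of what the current approach does to the longer example string. The variable names x and result are only for illustration.

```r
x <- "Look at that crazy Big Dog"

# The greedy .* runs up to the last whitespace character,
# so "Big" is removed along with everything before it.
result <- gsub(".*\\s", "", x)
print(result)
```

This prints "Dog", whereas the desired result is "Big Dog".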

CodePudding user response:

Since the last word (Dog in the example) is not known in advance, you can use

sub("^.*?((?:\\bBig\\s+)?\\S+)$", "\\1", text)

Note the use of sub() rather than gsub(): the pattern only needs to match and replace once per string.

Details:

  • ^ - start of string
  • .*? - any zero or more chars as few as possible
  • ((?:\bBig\s+)?\S+) - Group 1:
    • (?:\bBig\s+)? - an optional sequence of a whole word Big (\b is a word boundary) and then one or more whitespace chars (\s+)
    • \S+ - one or more non-whitespace chars
  • $ - end of string.

The \1 replacement puts back the value from Group 1 into the result.

See the R demo:

x <- c("Look at that crazy Dog", "Look at that crazy Big Dog")
sub("^.*?((?:\\bBig\\s+)?\\S+)$", "\\1", x)
# => [1] "Dog"     "Big Dog"
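
To see what Group 1 holds before any replacement happens, the capture can be pulled out directly with regexec() and regmatches(). This is only a sketch: the pattern is written with \\s+ and \\S+ as the Details list describes, perl = TRUE is an assumption added here so the lazy quantifier and non-capturing group follow PCRE semantics, and the third test string and the names pattern, m and group1 are illustrative.

```r
pattern <- "^.*?((?:\\bBig\\s+)?\\S+)$"
x <- c("Look at that crazy Dog", "Look at that crazy Big Dog", "Dog")

# regexec() locates the full match and each capture group;
# regmatches() extracts them as one character vector per input string.
m <- regmatches(x, regexec(pattern, x, perl = TRUE))

# Element 1 of each vector is the whole match, element 2 is Group 1.
group1 <- vapply(m, function(g) g[2], character(1))
print(group1)
```

Group 1 comes out as "Dog", "Big Dog" and "Dog", which is exactly what the \1 replacement writes back.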

CodePudding user response:

Assuming you know all the words and phrases which you don't want to replace at the end of a string, you could use the following whitelist approach:

input <- c("Look at that crazy Dog", "Look at that crazy Big Dog")
keep <- c("Big Dog", "Dog")
regex <- paste0(".*?\\b(", paste(keep, collapse="|"), ")$")
output <- sub(regex, "\\1", input)
output  # [1] "Dog"     "Big Dog"
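
One thing to watch with the whitelist: the phrases are pasted into the pattern as-is, so any regex metacharacter in them (a dot, for instance) is interpreted rather than matched literally. A possible variation, assuming you are happy to switch to perl = TRUE, is to wrap each phrase in \Q...\E so PCRE treats it as literal text. The "Mr. Dog" entry and the two extra input strings below are made up for illustration.

```r
input <- c("Look at that crazy Dog",
           "Look at that crazy Big Dog",
           "Say hello to Mr. Dog",
           "Say hello to Mrs Dog")
keep <- c("Big Dog", "Dog", "Mr. Dog")  # "Mr. Dog" is a hypothetical extra entry

# \Q...\E makes PCRE read everything in between literally (needs perl = TRUE).
quoted <- paste0("\\Q", keep, "\\E")
regex <- paste0(".*?\\b(", paste(quoted, collapse = "|"), ")$")

output <- sub(regex, "\\1", input, perl = TRUE)
print(output)
```

This gives "Dog", "Big Dog", "Mr. Dog" and "Dog". Without the quoting, the unescaped dot in "Mr. Dog" would also match "Mrs Dog", and the last string would come back as "Mrs Dog" instead of "Dog".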