I want to get rid of words with exact pattern in character vector using regular expressions but don&

Time:10-02

I have this vector

names <- c("wazzzap12waaazzzaaaaapffffm12323", "hell223231", "musssaaaa225")

For the elements that contain at least three "z" letters in a row, I want to remove the letters and keep only the numbers.

CodePudding user response:

We can use grep with invert = TRUE and value = TRUE, which returns the elements that do not match. The regex z{3,} matches "z" repeated three or more times in a row.

grep("z{3,}", names, invert = TRUE, value = TRUE)
[1] "hell223231"   "musssaaaa225"

Or use str_subset from stringr

library(stringr)
str_subset(names, "z{3,}", negate = TRUE)
[1] "hell223231"   "musssaaaa225"
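
Note that z{3,} only matches consecutive "z" characters. If "at least three" is meant as a total count anywhere in the string (an assumption, the question does not say), a minimal sketch in base R could count the matches instead; z_count is a hypothetical helper name:

```r
names <- c("wazzzap12waaazzzaaaaapffffm12323", "hell223231", "musssaaaa225")

# Count every literal "z" in each element, consecutive or not
z_count <- lengths(regmatches(names, gregexpr("z", names, fixed = TRUE)))
z_count
# [1] 6 0 0

# Keep the elements with fewer than three "z" in total
names[z_count < 3]
# [1] "hell223231"   "musssaaaa225"
```

For this vector both approaches give the same result, since the only element containing "z" has them in runs of three.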

Update

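If we want to remove the non-digit characters only from the elements that have the 'z' repeats, get the index of those elements with grep and replace one or more non-digits (\\D+) with a blank:

i1 <- grep("z{3,}", names)
names[i1] <- gsub("\\D+", "", names[i1])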

-output

> names
[1] "1212323"      "hell223231"   "musssaaaa225" 
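
If the original vector should stay unchanged, the same idea can be sketched without assignment by index, using grepl and ifelse; has_zzz and cleaned are hypothetical names chosen for illustration:

```r
names <- c("wazzzap12waaazzzaaaaapffffm12323", "hell223231", "musssaaaa225")

# TRUE for the elements with three or more consecutive "z"
has_zzz <- grepl("z{3,}", names)

# Strip the non-digits only where has_zzz is TRUE, keep the rest as-is
cleaned <- ifelse(has_zzz, gsub("\\D+", "", names), names)
cleaned
# [1] "1212323"      "hell223231"   "musssaaaa225"
```

gsub runs on every element here and ifelse picks the result only where it is needed, which is fine for short vectors like this one.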