Iteratively updating a string-matching pattern

Time:06-23

I am trying to iteratively update the pattern I am looking for in a text.

  • I know the length of the desired words; it is a constant, say 4
  • I know the type of characters in the desired words; alphabetical, lowercase
  • Unless I deliberately fix a specific character at a particular index, every other position can be any letter (a-z, lowercase)

How can I go about changing the pattern iteratively?

For instance:

words <- c("bake", "tree", "keep", "game", "ride", "Bake", "Apple", "lame")

pattern1 <- '****'
# return "bake", "tree", "keep", "game", "ride", "lame"
pattern2 <- '*a**'
# return "bake", "game", "lame"
pattern3 <- '*a*e'
# return "bake", "game", "lame"
pattern4 <- '*ame'
# return "game", "lame"

Thank you!

CodePudding user response:

We can use grep here:

words <- c("bake", "tree", "keep", "game", "ride", "Bake", "Apple", "lame")
words <- grep("^.{4}$", words, value=TRUE) # "bake" "tree" "keep" "game" "ride" "Bake" "lame"
words <- grep("^.a..$", words, value=TRUE) # "bake" "game" "Bake" "lame"
words <- grep("^.a.e$", words, value=TRUE) # "bake" "game" "Bake" "lame"
words <- grep("^.ame$", words, value=TRUE) # "game" "lame"

CodePudding user response:

We can use Reduce (with accumulate = TRUE) to filter words iteratively, keeping every intermediate result:

Reduce(
    function(x, y) grep(y, x, value = TRUE),
    c("^.{4}$", "^.a..$", "^.a.e$", "^.ame$"),
    c("bake", "tree", "keep", "game", "ride", "Bake", "Apple", "lame"),
    accumulate = TRUE
)

which gives

[[1]]
[1] "bake"  "tree"  "keep"  "game"  "ride"  "Bake"  "Apple" "lame"

[[2]]
[1] "bake" "tree" "keep" "game" "ride" "Bake" "lame"

[[3]]
[1] "bake" "game" "Bake" "lame"

[[4]]
[1] "bake" "game" "Bake" "lame"

[[5]]
[1] "game" "lame"
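
Note that "." also lets "Bake" through, which the question wanted excluded. A minimal sketch of the same call with "[a-z]" in place of ".", assuming only lowercase ASCII letters should match (the names `words`, `patterns`, and `steps` are illustrative, not from the answer above):

```r
words <- c("bake", "tree", "keep", "game", "ride", "Bake", "Apple", "lame")

# Same narrowing steps, but each free position is restricted to a lowercase letter
# (assumption: the words of interest contain only ASCII a-z)
patterns <- c("^[a-z]{4}$", "^[a-z]a[a-z]{2}$", "^[a-z]a[a-z]e$", "^[a-z]ame$")

# `words` is the initial value; each pattern filters the previous result
steps <- Reduce(
    function(x, y) grep(y, x, value = TRUE),
    patterns,
    words,
    accumulate = TRUE
)

# The last element holds the words that survived every pattern
steps[[length(steps)]]
```

Dropping `accumulate = TRUE` would return only that final vector rather than the whole list of intermediate steps.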

CodePudding user response:

I am not a fan of Tim's solution because "." matches any character, so it also lets through capitalized words and words containing digits or other symbols.

You can wrap the matching in a function like this, which gives you more flexibility:

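spec_grep <- function(string, pattern) {
  # Turn each "*" wildcard into "exactly one lowercase letter"
  pattern <- gsub("\\*", "[a-z]", pattern)
  # Anchor the pattern so the whole word has to match
  grep(paste0("^", pattern, "$"), string, value = TRUE)
}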

words <- c("bake", "tree", "keep", "game", "da4e", "Bake", "Apple", "lame")
pattern <- "*a*e"
spec_grep(words, pattern)

[1] "bake" "game" "lame"
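
To tie this back to the iterative part of the question, the same wildcard translation can be run over a whole sequence of patterns. This is only a sketch under the question's assumptions ("*" means one lowercase letter); `wildcard_to_regex`, `patterns`, and `results` are hypothetical names introduced for illustration:

```r
# Hypothetical helper: translate the question's "*" wildcard into an anchored regex
wildcard_to_regex <- function(pattern) {
  # fixed = TRUE treats "*" as a literal character rather than a regex quantifier
  paste0("^", gsub("*", "[a-z]", pattern, fixed = TRUE), "$")
}

words <- c("bake", "tree", "keep", "game", "ride", "Bake", "Apple", "lame")
patterns <- c("****", "*a**", "*a*e", "*ame")

# Match every pattern against the full word list, one result per pattern
results <- lapply(patterns, function(p) {
  grep(wildcard_to_regex(p), words, value = TRUE)
})
names(results) <- patterns

results
```

Each pattern is matched against the original `words` here; because every pattern in the example only tightens the previous one, filtering the previous result instead would give the same output.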