How can I apply list of functions consecutively to a variable?

Time:08-14

I have a list of functions I want to apply to a string consecutively, changing the string. For example, a list of regular expressions I want to remove from a string, e.g.

to_remove = c('a','b')
original = 'abcabd'

In this case, I could use a simple regular expression, e.g. 'a|b'.

library(stringr)
str_remove( original, paste0( to_remove, collapse='|'))

The actual situation is more complex than this, and the regular expression gets a little hairy. Also, I am curious how to do this in a proper R way.

Another way to phrase the question is, "How can I implement the following for loop using a vector approach?"


for (rem in to_remove) {
  original = str_remove(original, rem)
}



CodePudding user response:

stringi offers vectorized string replacement. The replacement argument can be a single string (such as "") or a vector of the same length as pattern.

to_remove = c('a','b')
original = 'abcabd'

stringi::stri_replace_all_fixed(
    str = original,
    pattern = to_remove,
    replacement = "",
    vectorize_all = FALSE
)
#> [1] "cd"

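Note that, per Stéphane's comment, the replace_all_* functions with vectorize_all = FALSE apply the patterns one after another, each to the result of the previous replacement. If that is not what you want, consider a single alternation pattern instead. See, for example, the output of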

stringi::stri_replace_all_regex(
    "abc", 
    pattern = c("^a", "^b"),
    replacement = "", 
    vectorise_all = FALSE
)
#> [1] "c"
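
To make the effect of vectorize_all concrete, here is a small sketch (assuming the stringi package is installed) that runs the same call with both settings. The variable names separate and combined are only illustrative.

```r
library(stringi)

original <- "abcabd"
to_remove <- c("a", "b")

# Default (vectorize_all = TRUE): the input is recycled, and each pattern
# is applied to its own copy, so two patterns give two results.
separate <- stri_replace_all_fixed(original, to_remove, "")
print(separate)
#> [1] "bcbd" "acad"

# vectorize_all = FALSE: all patterns are applied in turn to the same string.
combined <- stri_replace_all_fixed(original, to_remove, "",
                                   vectorize_all = FALSE)
print(combined)
#> [1] "cd"
```

With the default setting the result has one element per pattern, which is usually not what is wanted when the goal is to strip several patterns from one string.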

CodePudding user response:

I don't think that str_remove_all can take a list of inputs, but you can just use str_replace_all with a named vector to circumvent the issue:

str_replace_all("abcdabcd", c("^a" = "",
                              "b" = "",
                              "c" = "",
                              "d$" = ""))

[1] "da"

This solution works with regular expressions as well. The str_remove functions are just thin wrappers around str_replace(string, pattern, ""), so there is no loss in computational speed.

Alternatively, you can build a vector of empty strings and set the regular expressions as its names:

to_remove <- c('a', 'b')
empty <- rep("", length(to_remove))
names(empty) <- to_remove

str_replace_all("abcd", empty)

Or you can pipe the string through a sequence of str_remove_all calls, like so:

"abcd" %>%
  str_remove_all("d$") %>%
  str_remove_all("a")
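
When the patterns (or functions) live in a vector or list rather than being written out by hand, the same chaining can be sketched with base R's Reduce(), which passes the running result into each successive call much like the for loop in the question. This is only an illustration: it uses base gsub() so it runs without extra packages (str_remove_all(s, pat) could be used in its place), and remove_each and steps are hypothetical names.

```r
to_remove <- c("d$", "a")

# Fold over the patterns: each call receives the result of the previous one.
remove_each <- function(s, pat) gsub(pat, "", s)
result <- Reduce(remove_each, to_remove, "abcd")
print(result)
#> [1] "bc"

# The same idea for an arbitrary list of functions applied consecutively.
steps <- list(toupper, function(s) gsub("A", "", s), trimws)
result2 <- Reduce(function(s, f) f(s), steps, " abcd ")
print(result2)
#> [1] "BCD"
```

The third argument to Reduce() is the starting value, so the original string is never modified in place.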