str_detect removing some but not all strings with specified ending

Time:04-03

I'd like to remove, inside a pipe, any string that ends in either of two suffixes. In this example they are ".o" and ".t". Some of the strings get removed, but not all of them, and I can't figure out why. I suspect something is wrong in the 'pattern = ' argument.

ex1 <- structure(list(variables = structure(1:18, .Label = c("canopy15", 
"canopy16", "DistanceToRoad", "DistanceToEdge", "EdgeDistance", 
"TrailDistance", "CARCOR.o", "EUOALA.o", "FAGGRA.o", "LINBEN.o", 
"MALSP..o", "PRUSER.o", "ROSMUL.o", "RUBPHO.o", "VIBDEN.o", "ACERUB.t", 
"FAGGRA.t", "NYSSYL.t"), class = "factor")), row.names = c(NA, 
-18L), class = "data.frame")

ex1 %>%
dplyr::filter(stringr::str_detect(string = variables,
pattern = c("\\.o$", "\\.t$"),
negate = TRUE))

##output
# variables
# 1        canopy15
# 2        canopy16
# 3  DistanceToRoad
# 4  DistanceToEdge
# 5    EdgeDistance
# 6   TrailDistance
# 7        EUOALA.o
# 8        LINBEN.o
# 9        PRUSER.o
# 10       RUBPHO.o
# 11       FAGGRA.t

CodePudding user response:

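The pattern vector has two elements, so it is recycled along the rows: row 1 is checked against "\\.o$", row 2 against "\\.t$", row 3 against "\\.o$" again, and so on. A string is therefore removed only when its ending happens to match the pattern that lands on its row, which is why some ".o" and ".t" rows survive. Combine both endings into a single regular expression instead: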

ex1 %>%
  dplyr::filter(stringr::str_detect(string = variables,
                                    pattern = "\\.(o|t)$",
                                    negate = TRUE))
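
To see the recycling in isolation, here is a minimal sketch on a four-element vector. It uses base R only and emulates the position-by-position matching with rep_len() and mapply(), because, as far as I know, stringr 1.5.0 and later refuse to recycle a length-2 pattern and raise an error instead of returning the partial result shown in the question. The names x, patterns, recycled, combined and kept are made up for this illustration.

```r
# Small stand-in for the 'variables' column (hypothetical subset of the data)
x <- c("canopy15", "CARCOR.o", "EUOALA.o", "ACERUB.t")
patterns <- c("\\.o$", "\\.t$")

# Emulate recycling: stretch the patterns to length(x) and test each
# string only against the pattern sitting at the same position.
recycled_patterns <- rep_len(patterns, length(x))
recycled <- unname(mapply(grepl, recycled_patterns, x))
# recycled: FALSE FALSE TRUE TRUE  ("CARCOR.o" was only tested against "\\.t$")

# A single alternation pattern is tested against every string.
combined <- grepl("\\.(o|t)$", x)
# combined: FALSE TRUE TRUE TRUE

# Keep only the strings that end in neither ".o" nor ".t".
kept <- x[!combined]
# kept: "canopy15"
```

The same idea carries over to the filter() call above: one pattern of length 1 is applied to every row, so no row can slip through because it was paired with the wrong ending.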