I am struggling with regex.
I have this character vector below:
texts <- c('I-have-text-2-and-text-8','I-have-text-1-and-text-2','I-have-text-7-and-text-8','I-have-text-2-and-text-1','I-have-text-4-and-text-5','I-have-text-11-and-text-12','I-have-text-13-and-text-32','I-have-text-8-and-text-6')
Two words are important to me: text-1 and text-2. I need both of them, in any order.
I want to extract the strings that contain both.
The output should be:
[1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
I've been using str_subset from stringr, but I don't know the regular expression for this.
str_subset(texts, 'regex')
Any help would be appreciated.
CodePudding user response:
Using str_subset, the regex would specify text-1 followed by any characters (.*) and then text-2, or (|) the same in reverse order:
library(stringr)
str_subset(texts, 'text-1.*text-2|text-2.*text-1')
[1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
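One caveat worth sketching: the pattern above treats text-1 as a plain substring, so it would also match inside text-11. That does not change the result for the vector in the question, but it could for other data. The example below is a minimal sketch, assuming you want whole-number matches; the third element is a hypothetical extra case that is not in the original data, and the word boundary \\b is one possible way to tighten the pattern.

```r
library(stringr)

# Shortened vector; the last element is a hypothetical case added for illustration
texts <- c('I-have-text-2-and-text-8',
           'I-have-text-1-and-text-2',
           'I-have-text-11-and-text-2')

# Loose pattern: "text-1" also matches the beginning of "text-11"
loose <- str_subset(texts, 'text-1.*text-2|text-2.*text-1')

# Stricter pattern: \\b requires a word boundary right after each number
strict <- str_subset(texts, 'text-1\\b.*text-2\\b|text-2\\b.*text-1\\b')

print(loose)
print(strict)
```

Here loose keeps the hypothetical text-11 element as well, while strict keeps only the element that really contains both text-1 and text-2.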
CodePudding user response:
"Both patterns in any order" sounds complicated for a single regex pattern, but trivial to do in two separate patterns:
texts[str_detect(texts, "text-1") & str_detect(texts, "text-2")]
# [1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
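The two-pattern idea extends to any number of required words. The following is a rough sketch in base R, assuming the words are stored in a vector (the name words is made up for this example) and should be matched as literal substrings rather than as regular expressions:

```r
texts <- c('I-have-text-2-and-text-8', 'I-have-text-1-and-text-2',
           'I-have-text-7-and-text-8', 'I-have-text-2-and-text-1',
           'I-have-text-4-and-text-5', 'I-have-text-11-and-text-12',
           'I-have-text-13-and-text-32', 'I-have-text-8-and-text-6')

# Hypothetical vector of words that must all be present
words <- c('text-1', 'text-2')

# One logical vector per word; fixed = TRUE treats each word as a literal string
hits <- lapply(words, grepl, x = texts, fixed = TRUE)

# Combine with & so that only strings containing every word remain
keep <- Reduce(`&`, hits)

result <- texts[keep]
print(result)
```

Adding a third word to words would require all three to be present, without touching the rest of the code.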
CodePudding user response:
You can use an alternation pattern with | to alternate between text-1 followed by text-2 and vice versa:
grep("text-1.*text-2|text-2.*text-1", texts, value = TRUE)
[1] "I-have-text-1-and-text-2" "I-have-text-2-and-text-1"
The stringr equivalent would be:
str_subset(texts, "text-1.*text-2|text-2.*text-1")
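If writing out every ordering becomes unwieldy, lookaheads may be an alternative: each (?=...) checks that a word appears somewhere in the string, regardless of order. This is only a sketch, assuming base R with perl = TRUE, since the default regex engine used by grep does not support lookaheads:

```r
texts <- c('I-have-text-2-and-text-8', 'I-have-text-1-and-text-2',
           'I-have-text-7-and-text-8', 'I-have-text-2-and-text-1',
           'I-have-text-4-and-text-5', 'I-have-text-11-and-text-12',
           'I-have-text-13-and-text-32', 'I-have-text-8-and-text-6')

# Each lookahead asserts, from the start of the string, that one word occurs later on
both <- grep('^(?=.*text-1)(?=.*text-2)', texts, value = TRUE, perl = TRUE)

print(both)
```

With more than two words, this needs one lookahead per word, whereas the alternation needs one branch per possible ordering.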