I know a lot of people have already posted issues related to mine, but I couldn't find the correct solution.
I have a lot of sentences like: "Therapie: I like the elephants so much Indication"
I want to extract all the words between "Therapie:" and "Indication". In the example above, that would be "I like the elephants so much".
When I run my code, I always get only the next 3 words back. What am I doing wrong?
my_df <- c("Therapie: I like the elephants so much Indication")
exc <- sub(".*?\\bTherapie\\W+(\\w+(?:\\W+\\w+){0,2}).*", "\\1", my_df, perl=TRUE)
CodePudding user response:
With str_match from the stringr package:
library(stringr)
str <- "Therapie: I like the elephants so much Indication"
str_match(str, "Therapie: \\s*(.*?)\\s* Indication")[, 2]
# [1] "I like the elephants so much"
CodePudding user response:
You can do this with sub in base R:
str <- "Therapie: I like the elephants so much Indication"
sub("^Therapie: (.*) Indication$", "\\1", str)
#> [1] "I like the elephants so much"
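Since the question mentions many sentences, here is a minimal sketch of the same idea applied to a whole character vector. The vector `sentences` and the names `pattern` and `extracted` are made up for illustration; only the first sentence comes from the question. One thing to watch: `sub` returns the input unchanged when the pattern does not match, so this sketch assumes you would rather get `NA` for sentences that lack the markers.

```r
# Hypothetical input: only the first sentence is from the question
sentences <- c(
  "Therapie: I like the elephants so much Indication",
  "Therapie: Two words Indication",
  "No markers in this one"
)

# Lazy capture of everything between the two markers,
# with surrounding whitespace kept out of the capture group
pattern <- "^.*?Therapie:\\s*(.*?)\\s*Indication.*$"

# sub() leaves non-matching strings untouched, so check with grepl() first
extracted <- ifelse(
  grepl(pattern, sentences, perl = TRUE),
  sub(pattern, "\\1", sentences, perl = TRUE),
  NA_character_
)

# extracted holds "I like the elephants so much", "Two words", NA
extracted
```

As for the original attempt, the `{0,2}` quantifier allows at most two additional words after the first one, which would explain why only 3 words come back.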