I read text comments from a CSV file into a data frame, such as:
1.Mary had a little lamb and she is sweet.
2.Robin is a great superhero.
3.batman dark knight is wonderful movie.
4.Superman series has been a disappointment.
I tried to use
grep('batman', df, value = TRUE)
but it returns
"Mary had a little lamb and she is sweet.\",\n\"Robin is a great superhero.\"\n\"batman dark knight is wonderful movie.\",\n\"Superman series has been a disappointment.\"
Instead, I would like the result to be
batman dark knight is wonderful movie.
Only that sentence should be displayed.
CodePudding user response:
Using grep:
df[grep('batman', df$text), ]
Using stringr:
library(dplyr)
library(stringr)
df %>%
filter(str_detect(text, "batman"))
Data:
df <- data.frame(text = c("Mary had a little lamb and she is sweet.",
"Robin is a great superhero.",
"batman dark knight is wonderful movie.",
"Superman series has been a disappointment."))
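As for why the call in the question returned everything at once: grep() expects a character vector and coerces anything else with as.character(), which turns each data frame column into a single long string. The sketch below illustrates this on the data above; stringsAsFactors = FALSE is added here as an assumption so that the result is the same on older R versions.

```r
# Same data as above; stringsAsFactors = FALSE is an added assumption
df <- data.frame(text = c("Mary had a little lamb and she is sweet.",
                          "Robin is a great superhero.",
                          "batman dark knight is wonderful movie.",
                          "Superman series has been a disappointment."),
                 stringsAsFactors = FALSE)

# Coercing the data frame yields one string per column, not one per row
coerced <- as.character(df)

# Searching the data frame matches that single string, so the whole column comes back
whole <- grep("batman", df, value = TRUE)

# Searching the column itself returns only the matching element
rows <- grep("batman", df$text, value = TRUE)
```

Passing df$text rather than df is what makes grep() work row by row.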
CodePudding user response:
# Your dataframe
df = data.frame( Text = c("Mary had a little lamb and she is sweet."
,"Robin is a great superhero."
,"batman dark knight is wonderful movie."
,"Superman series has been a disappointment."))
# grep() returns the index of each row whose Text contains "batman"
df[grep("batman", df$Text),]
Outputs
[1] "batman dark knight is wonderful movie."
In dplyr, with grepl (which returns a logical vector rather than row indices):
library(dplyr)
df %>% filter(grepl("batman", Text))
Outputs
Text
1 batman dark knight is wonderful movie.
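To make the difference between grep and grepl concrete, here is a minimal sketch on the same data. The variable names idx, keep, by_index and by_logical are only illustrative, and drop = FALSE is an optional addition that keeps a one-column result as a data frame instead of a plain vector.

```r
# Same example data as in this answer
df <- data.frame(Text = c("Mary had a little lamb and she is sweet.",
                          "Robin is a great superhero.",
                          "batman dark knight is wonderful movie.",
                          "Superman series has been a disappointment."))

idx  <- grep("batman", df$Text)    # integer positions of the matching rows
keep <- grepl("batman", df$Text)   # one TRUE/FALSE value per row

# Both forms of indexing select the same row
by_index   <- df[idx, , drop = FALSE]
by_logical <- df[keep, , drop = FALSE]
```

Either result can be used for subsetting; filter() and subset() need the logical form, which is why grepl is used with them.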
CodePudding user response:
We may need to add word boundaries (\\b) so that "batman" is matched only as a whole word, which avoids non-specific matches:
subset(df, grepl("\\bbatman\\b", text))
text
3 batman dark knight is wonderful movie.
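A small sketch of what the word boundary changes. The strings in x are made up for illustration and are not part of the original data; ignore.case = TRUE is an optional extra in case capitalised forms should also match.

```r
# Hypothetical strings, not from the original data
x <- c("batman dark knight is wonderful movie.",
       "the batmans were out in force.",
       "Batman returns.")

substring_match  <- grepl("batman", x)                            # also hits "batmans"
whole_word       <- grepl("\\bbatman\\b", x)                      # whole word only
whole_word_nocase <- grepl("\\bbatman\\b", x, ignore.case = TRUE) # whole word, any case
```

Here only the first string matches as a whole word, while the plain pattern also matches the second one.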