How to use grep in Data frame in R

Time:09-25

I read text comments from a CSV file into a data frame, such as:

1.Mary had a little lamb and she is sweet.
2.Robin is a great superhero.
3.batman dark knight is wonderful movie.
4.Superman series has been a disappointment.

I tried to use

grep('batman', df, value = TRUE)

but it returns

"Mary had a little lamb and she is sweet.\",\n\"Robin is a great superhero.\"\n\"batman dark knight is wonderful movie.\",\n\"Superman series has been a disappointment.\"

Instead, I would like the result to be

batman dark knight is wonderful movie.

Only that sentence should be displayed.

CodePudding user response:

Using grep:

# Without value = TRUE, grep() returns row positions, which can index the data frame
df[grep('batman', df$text), ]

Using dplyr and stringr:

library(dplyr)
library(stringr)

df %>% 
  filter(str_detect(text, "batman"))

Data:

df <- data.frame(text = c("Mary had a little lamb and she is sweet.",
           "Robin is a great superhero.",
           "batman dark knight is wonderful movie.",
           "Superman series has been a disappointment."))
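
As background on why the call in the question returned every sentence: grep() coerces a non-character x with as.character(), and a data frame coerced that way becomes one long string per column. The following is only a minimal sketch using the data above; the names whole_df and one_col are illustrative, and stringsAsFactors = FALSE is set explicitly as an assumption so the behaviour is the same on older R versions.

```r
df <- data.frame(text = c("Mary had a little lamb and she is sweet.",
                          "Robin is a great superhero.",
                          "batman dark knight is wonderful movie.",
                          "Superman series has been a disappointment."),
                 stringsAsFactors = FALSE)

# Searching the whole data frame: the text column is coerced to a single
# string, so the one element that matches contains every sentence.
whole_df <- grep("batman", df, value = TRUE)
length(whole_df)

# Searching the column itself: one element per row, so only the
# matching sentence comes back.
one_col <- grep("batman", df$text, value = TRUE)
one_col
```

Passing df$text rather than df is what lets grep() test each comment separately.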

CodePudding user response:

# Your data frame
df <- data.frame(Text = c("Mary had a little lamb and she is sweet.",
                          "Robin is a great superhero.",
                          "batman dark knight is wonderful movie.",
                          "Superman series has been a disappointment."))

# Get the positions of the rows whose text contains "batman" and use them to index the data frame
df[grep("batman", df$Text), ]

Outputs

[1] "batman dark knight is wonderful movie."

With dplyr and grepl (which returns a logical value for each element rather than the positions of the matches):

library(dplyr)
df %>% filter(grepl("batman", Text))

Outputs

                                    Text
1 batman dark knight is wonderful movie.
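
To make the difference between grep and grepl concrete, here is a small sketch on the same data. The variable names idx, hit, by_index and by_logical are illustrative only, and drop = FALSE is an optional addition (not part of the code above) that keeps a one-column result as a data frame instead of a plain vector.

```r
df <- data.frame(Text = c("Mary had a little lamb and she is sweet.",
                          "Robin is a great superhero.",
                          "batman dark knight is wonderful movie.",
                          "Superman series has been a disappointment."),
                 stringsAsFactors = FALSE)

idx <- grep("batman", df$Text)    # integer positions of the matching rows
hit <- grepl("batman", df$Text)   # one TRUE/FALSE per row

# Either result can be used to select rows.
by_index   <- df[idx, , drop = FALSE]
by_logical <- df[hit, , drop = FALSE]
```

Because filter() expects one logical value per row, grepl is the natural fit inside a dplyr pipeline, while grep suits base R indexing.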

CodePudding user response:

We may need to add word boundaries (\\b) so that the pattern matches only the whole word "batman" and not a longer word that contains it:

subset(df, grepl("\\bbatman\\b", text))
                                    text
3 batman dark knight is wonderful movie.
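
A short sketch of what the word boundary changes. The comments vector is made up for illustration and is not from the question, and ignore.case = TRUE is shown only as an optional extra in case capitalised spellings should match too.

```r
# Hypothetical comments, used only to illustrate the patterns
comments <- c("batman dark knight is wonderful movie.",
              "A batmanesque villain appears.",
              "Batman returns was fun.")

plain    <- grepl("batman", comments)                            # matches any substring
whole    <- grepl("\\bbatman\\b", comments)                      # matches the whole word only
any_case <- grepl("\\bbatman\\b", comments, ignore.case = TRUE)  # whole word, any capitalisation

plain     # TRUE  TRUE FALSE
whole     # TRUE FALSE FALSE
any_case  # TRUE FALSE  TRUE
```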