Filter only rows that contain exact two strings in a column

Time:11-13

I have a data.frame as follows:

df = data.frame(sp_name = c("Xylopia brasiliensis", "Xylosma tweediana", "Zanthoxylum fagara subsp. lentiscifolium", "Schinus terebinthifolia var. raddiana", "Eugenia"), value = c(1, 2, 3, 4, 5))

I am only interested in filtering the rows of df whose sp_name contains exactly two words (in my case, Xylopia brasiliensis and Xylosma tweediana). How can I do this? I have not managed to get it working with the filter function from the tidyverse.

Thanks in advance.

CodePudding user response:

We can use str_count to count the words (\\w+) in each name and compare the count to 2, which gives a logical vector for filter:

library(dplyr)
library(stringr)
df %>%
    filter(str_count(sp_name, "\\w+") == 2)

-output

               sp_name value
1 Xylopia brasiliensis     1
2    Xylosma tweediana     2
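To see why only two rows survive, it can help to look at the per-row word counts directly. The following is a minimal base R sketch of the same counting idea (it uses regmatches and gregexpr rather than stringr, so it is an equivalent illustration, not the code above); the names word_counts and two_words are introduced here only for the example.

```r
# Sample data from the question
df <- data.frame(
  sp_name = c("Xylopia brasiliensis", "Xylosma tweediana",
              "Zanthoxylum fagara subsp. lentiscifolium",
              "Schinus terebinthifolia var. raddiana", "Eugenia"),
  value = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Extract every run of word characters, then count the runs in each name
word_counts <- lengths(regmatches(df$sp_name, gregexpr("\\w+", df$sp_name)))
print(word_counts)
# [1] 2 2 4 4 1

# Keep only the rows whose name has exactly two words
two_words <- df[word_counts == 2, ]
print(two_words)
```

Note that the dot in "subsp." and "var." is not a word character, so those abbreviations still count as one word each.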

Or this can be done with str_detect as well: match one word (\\w+) from the start (^), followed by a single space and another word (\\w+) at the end ($) of the string. Names with fewer or more than two words do not match.

df %>%
    filter(str_detect(sp_name, "^\\w+ \\w+$"))

Or in base R with grep

subset(df, grepl("^\\w+ \\w+$", sp_name))
               sp_name value
1 Xylopia brasiliensis     1
2    Xylosma tweediana     2
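One caveat with \\w is that it does not match hyphens, so a two-part name with a hyphenated epithet such as Capsella bursa-pastoris is rejected by the anchored pattern (and counted as three words by the str_count approach). If such names can occur in the data, which is an assumption since the sample in the question has none, a looser pattern that looks for exactly two whitespace-separated tokens may be a better fit. A small sketch with a hypothetical vector sp:

```r
# Hypothetical names, including one hyphenated epithet
sp <- c("Xylopia brasiliensis", "Capsella bursa-pastoris", "Eugenia",
        "Schinus terebinthifolia var. raddiana")

# Pattern from the answer: two runs of word characters separated by one space
strict <- grepl("^\\w+ \\w+$", sp)
print(strict)
# [1]  TRUE FALSE FALSE FALSE

# Looser alternative: exactly two runs of non-whitespace characters
loose <- grepl("^\\S+\\s+\\S+$", sp)
print(loose)
# [1]  TRUE  TRUE FALSE FALSE
```

The looser pattern also accepts tokens containing punctuation such as "var.", so which one is appropriate depends on how clean the names are.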