How to select all values from some ID based on condition in other column?

Time:12-15

I have a data frame like the one in the example below. For every ID where at least one row meets a condition, I would like to select all rows of that ID. The condition, in this case, is that path must contain "one".

df <- data.frame(id=c(1, 1, 1, 2, 2, 2, 3, 3, 3), 
                 path=c("one", "two", "three", "four", "oned", "five", "six", 
                        "seven", "eight"))

Expected result:

result <- data.frame(id=c(1, 1, 1, 2, 2, 2), 
                     path=c("one", "two", "three", "four", "oned", "five"))

What is the most elegant way of doing this?

CodePudding user response:

It might not be the most elegant way, but here is my approach:

library(dplyr)

my_df <- data.frame(id = c(1,1,1,2,2,2,3,3,3),
                path = c("one","two","three", "four", "oned", "five","six", "seven", "eight"))

# Keep only the rows whose path contains "one"
my_value <- my_df %>% group_by(id) %>% mutate(Test = grepl(pattern = "one", x = path)) %>% filter(Test)
# Positions of every row whose id appears among the matching rows
my_var <- which(my_df$id %in% my_value$id)
if (length(my_var)) {
  my_df <- my_df[my_var,]
}

CodePudding user response:

Using grepl in ave.

df[with(df, as.logical(ave(path, id, FUN=\(x) any(grepl('one', x))))), ]
#   id  path
# 1  1   one
# 2  1   two
# 3  1 three
# 4  2  four
# 5  2  oned
# 6  2  five

Data:

df <- structure(list(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3), path = c("one", 
"two", "three", "four", "oned", "five", "six", "seven", "eight"
)), class = "data.frame", row.names = c(NA, -9L))
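
The as.logical() wrapper in the ave call above is there because ave returns a vector of the same type as its first argument. Here that is the character column path, so the TRUE/FALSE values come back as strings. A possible variant, offered as a sketch rather than as the answerer's method, passes the logical vector from grepl as the first argument so no conversion is needed. The name keep is only illustrative.

```r
df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 path = c("one", "two", "three", "four", "oned", "five",
                          "six", "seven", "eight"))

# grepl() gives one logical per row; ave() applies any() within each id
# and spreads the group result back over every row of that id.
keep <- ave(grepl("one", df$path), df$id, FUN = any)

# Rows of every id that has at least one path containing "one"
result <- df[keep, ]
```

Because keep is already logical, it can be used directly as a row index.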

CodePudding user response:

A one-liner could be:

result <- df[df$id %in% df[grepl('one',df$path),"id"],]

It just combines the native [ ] operator with the grepl function.
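
The one-liner above can be read in three steps. The following is a minimal sketch of the same logic spelled out; the intermediate names hit and ids_with_hit are illustrative and not part of the original answer.

```r
df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 path = c("one", "two", "three", "four", "oned", "five",
                          "six", "seven", "eight"))

# Step 1: flag rows whose path contains "one" (substring match, so "oned" counts)
hit <- grepl("one", df$path)

# Step 2: collect the ids that have at least one flagged row
ids_with_hit <- unique(df$id[hit])

# Step 3: keep every row that belongs to one of those ids
result <- df[df$id %in% ids_with_hit, ]
```

Note that grepl treats the pattern as a regular expression and matches substrings, which is why id 2 is kept through "oned", as in the expected result from the question.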

CodePudding user response:

We can use dplyr for that: group by id, then keep only the groups where any(str_detect(path, 'one')) is TRUE:

library(dplyr)
library(stringr)

# The grouping column is lowercase id, as defined in the question's data frame
df %>% group_by(id) %>%
       filter(any(str_detect(path, 'one')))