Searching a long dataframe for a string and returning all other strings with matching identifier

Time:10-04

I have a long dataset of around 15,000 rows that looks like this:

df <- data.frame("id" = c(3,3,3,55,55,55,63,63,63), "name" = c("house","home","apartment","boat","ship","sailboat","car","automobile","truck"))

I am trying to write a function that searches for a string in the "name" column of this dataframe and returns all strings that share the same "id".

For example, an input of "house" should return "house", "home", "apartment", because all three have the same id as "house" (3).

CodePudding user response:

input = "house"
library(dplyr)
df %>%
  group_by(id) %>%
  filter(input %in% name) %>%
  ungroup()
# # A tibble: 3 × 2
#      id name     
#   <dbl> <chr>    
# 1     3 house    
# 2     3 home     
# 3     3 apartment

As a function:

foo = function(data, input) {
  data %>%
    group_by(id) %>%
    filter(input %in% name) %>%
    ungroup()
}
foo(df, "home")
# # A tibble: 3 × 2
#      id name     
#   <dbl> <chr>    
# 1     3 house    
# 2     3 home     
# 3     3 apartment
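
If you would rather get back just the matching strings (as in the question's example) and avoid the dplyr dependency, something along these lines might work in base R. This is only a sketch: find_related is a hypothetical helper name, and it assumes exact, case-sensitive matching on the "name" column and the same "id"/"name" column names as the example data.

```r
# Same example data as in the question
df <- data.frame(
  id = c(3, 3, 3, 55, 55, 55, 63, 63, 63),
  name = c("house", "home", "apartment", "boat", "ship",
           "sailboat", "car", "automobile", "truck"),
  stringsAsFactors = FALSE
)

# find_related() is a hypothetical name, not part of any package
find_related <- function(data, input) {
  # ids of every row whose name equals the input exactly
  ids <- unique(data$id[data$name == input])
  # all names that share one of those ids
  data$name[data$id %in% ids]
}

find_related(df, "house")
# [1] "house"     "home"      "apartment"
```

An input that does not appear in "name" gives an empty character vector rather than an error. If the same name can occur under several ids, this returns the names for all of those ids.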