Based on the data below how can I filter data by values stored in another dataframe object?
Sample data:
# Data to be filtered
Dest_FIPS = c(1,2,3,4)
Dest_county = c("West Palm Beach County","Brevard County","Bay County","Miami-Dade County")
Dest_State = c("FL", "FL", "FL", "FL")
OutFlow = c(111, 222, 333, 444)
Orig_county = c("Broward County", "Broward County", "Broward County", "Broward County")
Orig_FIPS = c(5,5,5,5)
Orig_State = c("FL", "FL", "FL", "FL")
df = data.frame(Dest_FIPS, Dest_county, Dest_State, OutFlow, Orig_county, Orig_FIPS, Orig_State)
# rows to be filtered in column Dest_county based on the values in val_df
COUNTY_NAM = c("WEST PALM BEACH","BAY","MIAMI-DADE") #(values are actually stored in a CSV, so will be imported as a dataframe)
val_df = data.frame(COUNTY_NAM) # will use val_df to filter df
Desired output:
Dest_FIPS Dest_county OutFlow Orig_county
1 West Palm Beach County 111 Broward County
3 Bay County 333 Broward County
4 Miami-Dade County 444 Broward County
CodePudding user response:
Transform df$Dest_county
to match the format in val_df
, then check which values are %in% val_df$COUNTY_NAM
.
Base R:
df[toupper(gsub(" County", "", df$Dest_county)) %in% val_df$COUNTY_NAM,]
tidyverse:
library(dplyr)
filter(df, str_to_upper(str_remove(Dest_county, " County")) %in% val_df$COUNTY_NAM)
Output for both:
Dest_FIPS Dest_county Dest_State OutFlow Orig_county Orig_FIPS Orig_State
1 1 West Palm Beach County FL 111 Broward County 5 FL
2 3 Bay County FL 333 Broward County 5 FL
3 4 Miami-Dade County FL 444 Broward County 5 FL