I'm trying to combine an "or" condition with case-insensitive matching in str_detect. I want to convert every value that contains "ag" or "field" to "Agricultural".
Here's an example:
library(dplyr)
library(stringr)
(dat <-
  data.frame(Class = c("ag", "Agricultural--misc", "old field")))
This works:
(dat %>%
  mutate(Class_2 = case_when(
    # Aggregate all Agricultural classes
    str_detect(Class, fixed("Ag", ignore_case = TRUE)) ~ "Agricultural",
    # Convert 'field' to Agricultural
    str_detect(Class, fixed("field", ignore_case = TRUE)) ~ "Agricultural",
    TRUE ~ Class)))
But I want to condense the two conditions into one line, as below. This attempt leaves every value unchanged:
(dat %>%
  mutate(Class_2 = case_when(
    # Aggregate all Agricultural and field classes to Agricultural
    str_detect(Class, fixed("Ag|field", ignore_case = TRUE)) ~ "Agricultural",
    TRUE ~ Class)))
CodePudding user response:
A possible solution is to convert everything to lower case and match that against the pattern ag|field.
dat %>%
  mutate(Class_2 = case_when(
    str_detect(string = str_to_lower(Class),
               pattern = "ag|field") ~ "Agricultural",
    TRUE ~ Class
  ))
# A tibble: 3 × 2
  Class              Class_2
  <chr>              <chr>
1 ag                 Agricultural
2 Agricultural--misc Agricultural
3 old field          Agricultural
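As for why the one-line version in the question changes nothing: fixed() compares the pattern literally, so the "|" is treated as an ordinary character rather than as "or". Here is a minimal sketch of the difference, assuming stringr is installed; the variable names (classes, literal_hits, regex_hits, literal_demo) are only for illustration.

```r
library(stringr)

classes <- c("ag", "Agricultural--misc", "old field")

# fixed() matches the pattern literally: the whole text "Ag|field"
# (ignoring case) would have to appear inside the input string.
literal_hits <- str_detect(classes, fixed("Ag|field", ignore_case = TRUE))

# regex() treats "|" as alternation, so either "ag" or "field" is enough.
regex_hits <- str_detect(classes, regex("Ag|field", ignore_case = TRUE))

# Only an input that really contains the characters "ag|field" satisfies
# the fixed() version.
literal_demo <- str_detect("AG|FIELD", fixed("Ag|field", ignore_case = TRUE))

print(literal_hits)
print(regex_hits)
print(literal_demo)
```

So any approach that keeps the pattern as a regular expression (lowercasing first, as above, or a case-insensitive regex) should behave as intended.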
CodePudding user response:
I just came across the (?i) inline flag, which makes a regular expression case-insensitive.
So another solution is to drop the fixed() wrapper and prepend (?i) to the pattern, like this:
(dat %>%
  mutate(Class_2 = case_when(
    # Aggregate all Agricultural and field classes to Agricultural
    str_detect(Class, "(?i)Ag|field") ~ "Agricultural",
    TRUE ~ Class)))
But I think I prefer the str_to_lower option, as the code is more readable.
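One thing worth checking with the (?i) form is that the flag covers both sides of the "|", not just "Ag". A quick sketch, assuming stringr's default regex engine; the upper-case test vector (shouting) is made up for illustration and is not from the question.

```r
library(stringr)

# Hypothetical inputs chosen so that each alternative must match
# case-insensitively; "forest" should match neither.
shouting <- c("AG", "OLD FIELD", "forest")

# Inline flag at the start of the pattern.
inline_flag <- str_detect(shouting, "(?i)Ag|field")

# The same pattern using stringr's regex() wrapper.
wrapper <- str_detect(shouting, regex("Ag|field", ignore_case = TRUE))

print(inline_flag)
print(identical(inline_flag, wrapper))
```

If the two vectors agree, choosing between the inline flag and the wrapper is mostly a matter of readability.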
CodePudding user response:
You could also use regex() with ignore_case = TRUE, like this:
dat <- data.frame(Class = c("ag", "Agricultural--misc", "old field"))
library(dplyr)
library(stringr)
dat %>%
  mutate(Class_2 = case_when(
    str_detect(Class, regex("Ag|field", ignore_case = TRUE)) ~ "Agricultural",
    TRUE ~ Class))
#>                Class      Class_2
#> 1                 ag Agricultural
#> 2 Agricultural--misc Agricultural
#> 3          old field Agricultural
Created on 2022-07-12 by the reprex package (v2.0.1)