I want to search for a word across several columns and store it in a new column.
"data" is an example input and "goal" is the result I want. I tried a lot, but I couldn't get it to work.
library(dplyr)
library(stringr)
data <- tibble(
  component1 = c(NA, NA, "Word", NA, NA, "Word"),
  component2 = c(NA, "Word", "different_word", NA, NA, "not_this")
)
goal <- tibble(
  component1 = c(NA, NA, "Word", NA, NA, "Word"),
  component2 = c(NA, "Word", "different_word", NA, NA, "not_this"),
  component = c(NA, "Word", "Word", NA, NA, "Word")
)
not_working <- data %>%
  mutate(component = across(starts_with("component"), ~ str_extract(.x, "Word")))
CodePudding user response:
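For the data structure you provided, we could use coalesce(). It returns the first non-NA value in each row, so it works here only because component1 is "Word" whenever it is not NA:
library(dplyr)
data %>%
  mutate(component = coalesce(component1, component2))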
component1 component2 component
<chr> <chr> <chr>
1 NA NA NA
2 NA Word Word
3 Word different_word Word
4 NA NA NA
5 NA NA NA
6 Word not_this Word
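Note that coalesce() takes the first non-NA value per row regardless of what it contains. A minimal sketch of that behaviour follows; the tibble `swapped` is hypothetical and not part of the question, it only illustrates a row where the match sits in the second column:

```r
library(dplyr)

# Hypothetical data (not from the question): row 1 has another string first
swapped <- tibble(
  component1 = c("different_word", NA),
  component2 = c("Word", "Word")
)

# coalesce() keeps the first non-NA value, whatever it is
result <- swapped %>%
  mutate(component = coalesce(component1, component2))

result$component
# "different_word" "Word"
```

If the columns can hold other strings ahead of the match, extracting or detecting the word first is likely the safer route.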
CodePudding user response:
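With if_any() and str_detect():
library(dplyr)
library(stringr)
# A lambda is used because dplyr >= 1.1.0 deprecates passing extra
# arguments through across()/if_any()
data %>%
  mutate(component = ifelse(if_any(starts_with("component"), ~ str_detect(.x, "Word")), "Word", NA))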
output
component1 component2 component
<chr> <chr> <chr>
1 NA NA NA
2 NA Word Word
3 Word different_word Word
4 NA NA NA
5 NA NA NA
6 Word not_this Word
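If you want to stick with str_extract(), this would be the way to go:
data %>%
  # extract "Word" from each component column into a helper column
  mutate(across(starts_with("component"), ~ str_extract(.x, "Word"),
                .names = "{.col}_extract")) %>%
  # take the first non-NA extract per row, then drop the helper columns
  mutate(component = coalesce(component1_extract, component2_extract),
         .keep = "unused")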
# A tibble: 6 × 3
component1 component2 component
<chr> <chr> <chr>
1 NA NA NA
2 NA Word Word
3 Word different_word Word
4 NA NA NA
5 NA NA NA
6 Word not_this Word
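When there are more than two component columns, naming each helper column by hand gets tedious. A possible sketch, assuming the same `data` as in the question: across() returns a tibble of the extracted values, and do.call() passes its columns to coalesce() as separate arguments. This is one way to generalise the approach, not necessarily the only or best one:

```r
library(dplyr)
library(stringr)

data <- tibble(
  component1 = c(NA, NA, "Word", NA, NA, "Word"),
  component2 = c(NA, "Word", "different_word", NA, NA, "not_this")
)

result <- data %>%
  mutate(
    # across() yields one extracted column per component column;
    # do.call() hands them all to coalesce(), which keeps the first non-NA
    component = do.call(
      coalesce,
      across(starts_with("component"), ~ str_extract(.x, "Word"))
    )
  )

result$component
```

No helper columns are created here, so there is nothing to drop afterwards.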