I want to create a binary column that shows whether another column, which is a list of character vectors, contains any value from a given vector.
Specifically, I want a column that says whether someone has had a manager leave the company in the past year. For this, I have an all_manager column, which is a list of all managers each person had in the last year, and a terminated_managers vector with the names of all managers who were terminated in the past year.
df$all_manager
[[1]]
[1] John Mary
[[2]]
[1] Paul John
[[3]]
[1] Mary Tom Lilly
terminated_managers <- c("Mary", "Bill")
I want to create a manager_termed_yn column such that:
df$manager_termed_yn
[1]  TRUE FALSE  TRUE
I'll appreciate your help! First time posting, so apologies that the example is not the best.
CodePudding user response:
all_manager <- list(c("John", "Mary"), c("Paul", "John"), c("Mary", "Tom", "Lilly"))
terminated_managers <- c("Mary", "Bill")
We can use
colSums(sapply(all_manager, "%in%", x = terminated_managers)) > 0
#[1] TRUE FALSE TRUE
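The one-liner above works because naming the argument x = terminated_managers pushes each list element into the table argument of %in%, so sapply returns a matrix with one column per row of the data. A minimal sketch of a more explicit equivalent, assuming the data frame from the question can be rebuilt as below (the id column is hypothetical and only there to give the frame three rows):

```r
# Rebuild the example data with a list column (id is a made-up placeholder).
df <- data.frame(id = 1:3)
df$all_manager <- list(c("John", "Mary"), c("Paul", "John"), c("Mary", "Tom", "Lilly"))
terminated_managers <- c("Mary", "Bill")

# For each person, check whether any of their managers is in the terminated set.
# vapply() enforces that every element yields exactly one logical value.
df$manager_termed_yn <- vapply(
  df$all_manager,
  function(managers) any(managers %in% terminated_managers),
  logical(1)
)

df$manager_termed_yn
# TRUE FALSE TRUE
```

Compared with sapply, vapply fails loudly if an element returns something other than a single logical, which can help when some rows are empty or malformed.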
CodePudding user response:
Let the list l, where each element is a single space-separated string, be defined as
l <- list("John Mary","Paul John", "Mary Tom Lilly")
l
[[1]]
[1] "John Mary"
[[2]]
[1] "Paul John"
[[3]]
[1] "Mary Tom Lilly"
Then
library(stringr)
# TRUE when any name in the split string appears in terminated_managers
sapply(l, function(x) any(str_split(x, " ", simplify = TRUE) %in% terminated_managers))
[1] TRUE FALSE TRUE
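If the names really are stored as space-separated strings, the same check can be sketched in base R without stringr. This assumes names never contain spaces themselves; the variable names name_list and result are illustrative only:

```r
l <- list("John Mary", "Paul John", "Mary Tom Lilly")
terminated_managers <- c("Mary", "Bill")

# Split each string on single spaces to get one character vector per person.
name_list <- strsplit(unlist(l), " ", fixed = TRUE)

# Flag a person when at least one of their managers is in the terminated set,
# regardless of which position that manager holds in terminated_managers.
result <- vapply(
  name_list,
  function(nms) any(nms %in% terminated_managers),
  logical(1)
)

result
# TRUE FALSE TRUE
```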