I have a dataset that looks like this:
df1 <- data.frame(
  date = c(20200101, 20200102, 20200103, 20200104, 20200105, 20200106),
  z_score = c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE),
  mad_score = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE),
  history_error = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  manual = c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE)
)
The result needs to look like this, with final_result TRUE only when every boolean column in that row is TRUE:
date final_result
1 20200101 FALSE
2 20200102 TRUE
3 20200103 TRUE
4 20200104 FALSE
5 20200105 FALSE
6 20200106 TRUE
I can do that by un-pivoting to long form and then combining the boolean columns per date:
library(dplyr)
library(tidyr)

df1_result <- df1 %>%
  pivot_longer(!date, names_to = "conditions", values_to = "T_or_F") %>%
  group_by(date) %>%
  summarise(T_or_F = as.logical(prod(T_or_F)))
I feel there must be a simpler way to do it. I searched the forum and found many one-liners for similar questions, but I can't adapt those elegant one-liners to my problem.
CodePudding user response:
Here's a one-liner that uses only base R (the \(x) lambda shorthand needs R 4.1 or later):
data.frame(df1[1], final_result = apply(df1[-1], 1, \(x) sum(x) == 4))
#> date final_result
#> 1 20200101 FALSE
#> 2 20200102 TRUE
#> 3 20200103 TRUE
#> 4 20200104 FALSE
#> 5 20200105 FALSE
#> 6 20200106 TRUE
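The one-liner above hardcodes the number of boolean columns (4). As a minimal sketch of the same base R idea without that constant, the logical columns can be selected by type and AND-ed together element-wise. The names flags and result are only illustrative and are not part of the original answer.

```r
# Same example data as in the question
df1 <- data.frame(
  date = c(20200101, 20200102, 20200103, 20200104, 20200105, 20200106),
  z_score = c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE),
  mad_score = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE),
  history_error = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  manual = c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE)
)

# Pick out the logical columns by type instead of assuming there are four
# (flags is a hypothetical helper name)
flags <- df1[sapply(df1, is.logical)]

# Reduce() applies `&` across the columns, giving one TRUE/FALSE per row
result <- data.frame(date = df1$date, final_result = Reduce(`&`, flags))
print(result)
```

Because the columns are found with is.logical, this should keep working if another boolean column is added later, whereas sum(x) == 4 would need updating.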
CodePudding user response:
In tidyverse, we can use if_all:
library(dplyr)
df1 %>%
  transmute(date, result = if_all(where(is.logical)))
-output
date result
1 20200101 FALSE
2 20200102 TRUE
3 20200103 TRUE
4 20200104 FALSE
5 20200105 FALSE
6 20200106 TRUE
CodePudding user response:
Not a one-liner, but we can make the calculation using rowSums, then select the desired columns. No need to pivot to long form.
library(tidyverse)
df1 %>%
  mutate(final_result = rowSums(across(where(is.logical))) == 4) %>%
  select(date, final_result)
Output
date final_result
1 20200101 FALSE
2 20200102 TRUE
3 20200103 TRUE
4 20200104 FALSE
5 20200105 FALSE
6 20200106 TRUE