I have a df
below, I need to calculate Pass
percent after excluding -
rows and the average values of pred1
and pred2
:
df <- data.frame(
name = c('A', 'B', 'C', 'D', 'E'),
status = c('Pass', 'Fail', '-', 'Pass', 'Pass'),
real = c(10, NA, 8, 9, 4),
pred1 = c(50, 20, NA, 14, 11),
pred2 = c(12, 12, 8, NA, 6)
)
df:
name status real pred1 pred2
1 A Pass 10 50 12
2 B Fail NA 20 12
3 C - 8 NA 8
4 D Pass 9 14 NA
5 E Pass 4 11 6
The expected result:
name status real pred1 pred2
1 A Pass 10 50 12
2 B Fail NA 20 12
3 C - 8 NA 8
4 D Pass 9 14 NA
5 E Pass 4 11 6
6 total 0.75 NA 23.75 9.5
I thought to bind to the result below to df
, but it's not concise and beautiful solutions:
pass_percent <- nrow(df %>% filter(status == 'Pass')) / nrow(df %>% filter(status != '-'))
avg_pred1 <- mean(df$pred1, na.rm = T)
avg_pred2 <- mean(df$pred2, na.rm = T)
How could I acheive that in a more concise way with R's pipe?
CodePudding user response:
What about tibble::add_row
:
df %>%
add_row(name = "total",
status = as.character(mean(df$status[df$status != "-"] == "Pass")),
real = mean(df$real),
pred1 = mean(df$pred1, na.rm = T),
pred2 = mean(df$pred2, na.rm = T))
name status real pred1 pred2
1 A Pass 10 50.00 12.0
2 B Fail NA 20.00 12.0
3 C - 8 NA 8.0
4 D Pass 9 14.00 NA
5 E Pass 4 11.00 6.0
6 total 0.75 NA 23.75 9.5
Explanation of as.character(mean(df$status[df$status != "-"] == "Pass"))
:
df$status[df$status != "-"]
is the vector ofdf$status
without the element equal to"-"
(so onlyPass
andFail
).df$status[df$status != "-"] == "Pass"
isTRUE
ifdf$status
is"Pass"
,FALSE
otherwise.mean(...)
is possible because TRUE and FALSE values are coerced to numeric when the mean is computed.as.character(...)
is needed becausedf$status
is a character variable.