I have the following data frame:
df <- data.frame(
  v1 = c(1, 2, NA, 2, 2, NA, NA, 1, NA, 1),
  v2 = c(1, 2, 1, 1, 2, NA, 1, 1, 1, 9),
  v3 = c(2, 2, NA, NA, 2, 9, NA, 2, NA, 2),
  stringsAsFactors = FALSE
)
I would like to calculate the percentage of the values 1, 2 and 9 in each column.
The totals should exclude the NA values, so each column's percentages sum to 100. The expected result looks like this:
df2 <- data.frame(
  v1 = c(50, 50, 0),
  v2 = c(66.67, 22.22, 11.11),
  v3 = c(0, 83.33, 16.67),  # v3 has no 1s, five 2s and one 9 among its six non-NA values
  row.names = c(1, 2, 9),
  stringsAsFactors = FALSE
)
Thanks!
CodePudding user response:
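Another option is to combine table() with proportions(). Converting each column to a factor with levels 1, 2 and 9 keeps values that have a zero count, and table() drops NA by default: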
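library(dplyr)

# proportions() requires R >= 4.0.1; prop.table() is the older equivalent.
# In dplyr >= 1.1.0, reframe() is the preferred verb when a summary returns more than one row.
df %>%
  summarise(across(
    everything(),
    ~ round(proportions(table(factor(.x, levels = c(1, 2, 9)))) * 100, 2)
  ))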
Output (rows 1, 2 and 3 correspond to the values 1, 2 and 9):
  v1    v2    v3
1 50 66.67  0.00
2 50 22.22 83.33
3  0 11.11 16.67
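The dplyr output above loses the value labels (the rows are numbered 1, 2, 3). As a rough sketch of the same table-and-proportions idea in base R, assuming a matrix result is acceptable, sapply() keeps the factor levels as row names. The names `vals` and `pct` are only illustrative, and prop.table() is used here so the code also runs on R versions older than 4.0.1:

```r
df <- data.frame(
  v1 = c(1, 2, NA, 2, 2, NA, NA, 1, NA, 1),
  v2 = c(1, 2, 1, 1, 2, NA, 1, 1, 1, 9),
  v3 = c(2, 2, NA, NA, 2, 9, NA, 2, NA, 2)
)

vals <- c(1, 2, 9)  # the values whose share we want in every column

pct <- sapply(df, function(x) {
  # table() ignores NA by default; the factor levels keep zero counts
  counts <- table(factor(x, levels = vals))
  round(100 * prop.table(counts), 2)
})

pct
#      v1    v2    v3
# 1    50 66.67  0.00
# 2    50 22.22 83.33
# 9     0 11.11 16.67
```

If a data frame is needed rather than a matrix, as.data.frame(pct) should convert it while keeping the row names.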
CodePudding user response:
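# Note: the \(x) lambda shorthand requires R >= 4.1.
vals = c(1, 2, 9)
# df == x is TRUE/FALSE where a value is present and NA where it is missing,
# so colMeans(..., na.rm = TRUE) gives the share of x among the non-NA entries of each column.
result = as.data.frame(t(sapply(vals, \(x) 100 * colMeans(df == x, na.rm = TRUE))))
row.names(result) = vals
result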
#   v1       v2       v3
# 1 50 66.66667  0.00000
# 2 50 22.22222 83.33333
# 9  0 11.11111 16.66667
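To see why the colMeans() approach excludes NA from the total, here is a small sketch on a single column (v3 from the question, value 2). The names `is_two` and `pct_two` are only illustrative:

```r
v3 <- c(2, 2, NA, NA, 2, 9, NA, 2, NA, 2)

# Comparing with == gives TRUE/FALSE for known values and NA for missing ones
is_two <- v3 == 2

# With na.rm = TRUE the NA entries are removed before averaging,
# so the denominator is the 6 non-NA values, not all 10
pct_two <- 100 * mean(is_two, na.rm = TRUE)

round(pct_two, 2)
# [1] 83.33
```

Wrapping the full result in round(result, 2) should give the two-decimal figures asked for in the question.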