How to calculate percentage of a specific value in a dataframe

Time:06-24

I have the following dataframe:

df <- data.frame(
    v1 = c(1,2,NA,2,2,NA,NA,1,NA,1),
    v2 = c(1,2,1,1,2,NA,1,1,1,9),
    v3 = c(2,2,NA,NA,2,9,NA,2,NA,2),
    stringsAsFactors = FALSE
  )

I would like to calculate the percentage of the values 1, 2 and 9 in each column.

The NA values should be excluded from the total, so the expected result looks like this:

df2 <- data.frame(
    v1 = c(50, 50, 0),
    v2 = c(66.67, 22.22, 11.11),
    v3 = c(0, 83.33, 16.67),  # v3 has six non-NA values: five 2s, one 9, no 1s
    row.names = c(1, 2, 9),
    stringsAsFactors = FALSE
  )

thanks!

CodePudding user response:

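One option is to combine table and proportions. Converting each column to a factor with levels 1, 2 and 9 keeps a zero count for values that do not occur, and table drops the NA values by default, so they are left out of the total.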

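library(dplyr)

# For every column: count 1, 2 and 9 (NA is dropped by table()),
# convert the counts to proportions, then to rounded percentages.
# Note: in dplyr >= 1.1.0, reframe() is preferred over summarise()
# when the result has more than one row.
df %>%
  summarise(across(everything(),
    ~ round(proportions(table(factor(.x, levels = c(1, 2, 9)))) * 100, 2)))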

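-output

  v1    v2    v3
1 50 66.67  0.00
2 50 22.22 83.33
3  0 11.11 16.67

The row labels 1, 2, 3 are plain row numbers here; they correspond to the values 1, 2 and 9, in that order.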
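
If the rows should be labelled with the values themselves (1, 2, 9), as in the expected result of the question, the same table/proportions idea can be sketched in base R. This is only a minimal sketch: pct_by_value is a hypothetical helper name made up for illustration, and proportions() assumes R >= 4.0.0.

```r
# Same data as in the question
df <- data.frame(
  v1 = c(1, 2, NA, 2, 2, NA, NA, 1, NA, 1),
  v2 = c(1, 2, 1, 1, 2, NA, 1, 1, 1, 9),
  v3 = c(2, 2, NA, NA, 2, 9, NA, 2, NA, 2)
)

vals <- c(1, 2, 9)

# pct_by_value is a hypothetical helper, not part of any package
pct_by_value <- function(x, levels) {
  # table() drops NA; levels that never occur get a count of 0
  counts <- table(factor(x, levels = levels))
  round(100 * proportions(counts), 2)
}

# sapply() binds the per-column results into a matrix whose
# row names are the factor levels ("1", "2", "9")
res <- as.data.frame(sapply(df, pct_by_value, levels = vals))
res
```

Because sapply takes the row names from the names of the table, no separate row.names assignment should be needed.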

CodePudding user response:

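vals = c(1, 2, 9)
# df == x is a logical matrix (NA where df is NA); with na.rm = TRUE,
# colMeans() returns the share of non-NA entries equal to x in each column.
# The \(x) lambda shorthand needs R >= 4.1; use function(x) on older versions.
result = as.data.frame(t(sapply(vals, \(x) 100 * colMeans(df == x, na.rm = TRUE))))
row.names(result) = vals
result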
#   v1       v2       v3
# 1 50 66.66667  0.00000
# 2 50 22.22222 83.33333
# 9  0 11.11111 16.66667
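
One difference between the two answers is worth keeping in mind, although it does not matter for the data in the question: the comparison approach divides by all non-NA values, while the factor approach also drops any value that is not among the listed levels. A small sketch with a made-up vector x (not from the question) that contains an extra value 5:

```r
# Hypothetical vector with a value (5) outside the set of interest
x <- c(1, 1, 2, 5, NA)
vals <- c(1, 2, 9)

# Comparison approach: the denominator is every non-NA value (4 here),
# so the percentages need not add up to 100
by_compare <- sapply(vals, function(v) 100 * mean(x == v, na.rm = TRUE))

# Factor approach: 5 is not a level, becomes NA and is dropped by table(),
# so the denominator is 3 and the percentages add up to 100
by_factor <- as.vector(100 * proportions(table(factor(x, levels = vals))))

by_compare
by_factor
```

Which denominator is appropriate depends on whether values outside 1, 2 and 9 should count towards the total.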