Home > Mobile >  Calculated column based in a subtotal
Calculated column based in a subtotal

Time:09-13

I have a dataframe with frequencies like this:

data <- data.frame(id = c("1","2","3","4","5","6","7","8"),
                   level = c("high", "high", "high", "med", "med", "med", "low", "low"),
                   n = c(24, 48, 57, 79, 2, 69, 37, 82))

What I'm trying to do is to create a new column to get the percentage of each observation but taking as reference the level subtotal. This is my solution, but trying to make it more efficient (the more values in level, the more lines to calculate subtotals):

high <- sum(data$n[data$level == "high"])
med <- sum(data$n[data$level == "med"])
low <- sum(data$n[data$level == "low"])

data$prop <- NA
data$prop[data$level == "high"] <- (data$n/high)*100
data$prop[data$level == "med"] <- (data$n/med)*100
data$prop[data$level == "low"] <- (data$n/low)*100

CodePudding user response:

We may use proportions after a grouping by 'level'

library(dplyr)
data %>% 
  group_by(level) %>% 
  mutate(prop = proportions(n)* 100) %>%
  ungroup

-output

# A tibble: 8 × 4
  id    level     n  prop
  <chr> <chr> <dbl> <dbl>
1 1     high     24 18.6 
2 2     high     48 37.2 
3 3     high     57 44.2 
4 4     med      79 52.7 
5 5     med       2  1.33
6 6     med      69 46   
7 7     low      37 31.1 
8 8     low      82 68.9 

Or with data.table

library(data.table)
setDT(data)[, prop := proportions(n) * 100, level]

CodePudding user response:

Just for laffs, here's an answer in base:

transform(data, percent = ave(n, level, FUN = function(x) 100 * prop.table(x)))

#   id level  n   percent
# 1  1  high 24 18.604651
# 2  2  high 48 37.209302
# 3  3  high 57 44.186047
# 4  4   med 79 52.666667
# 5  5   med  2  1.333333
# 6  6   med 69 46.000000
# 7  7   low 37 31.092437
# 8  8   low 82 68.907563
  • Related