Calculate positivity rate (positives/total observations)

Time:10-08

I am trying to calculate the positivity rate per person, i.e. (number of 1s per person / total number of observations per person). My data set looks similar to this:

person outcome
a 1
a 1
a 0
a 0
b 1
b 0
b 0
c 1
c 1

I am hoping to return something that looks like this:

person positiverate
a 0.50
b 0.33
c 1.00

I feel like this should be fairly simple to code, but I have been unable to figure it out so far.

CodePudding user response:

We can group by person and take the mean of outcome. Because outcome is coded 0/1, the group mean is the proportion of 1s.

library(dplyr)
df1 %>% 
    group_by(person) %>% 
    summarise(positiverate = mean(outcome))

-output

# A tibble: 3 × 2
  person positiverate
  <chr>         <dbl>
1 a             0.5  
2 b             0.333
3 c             1    

data

df1 <- structure(list(person = c("a", "a", "a", "a", "b", "b", "b", 
"c", "c"), outcome = c(1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L)), 
class = "data.frame", row.names = c(NA, 
-9L))
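
To make the ratio from the question explicit, the two counts can also be computed separately and then divided. This is only a sketch: it rebuilds the sample data so it runs on its own, and the names n_pos, n_obs and res are made up for illustration.

```r
library(dplyr)

# Same sample data as df1 above, rebuilt so the snippet is self-contained
df1 <- data.frame(
  person  = c("a", "a", "a", "a", "b", "b", "b", "c", "c"),
  outcome = c(1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L)
)

res <- df1 %>%
  group_by(person) %>%
  summarise(
    n_pos = sum(outcome == 1),     # number of 1s per person (illustrative name)
    n_obs = n(),                   # total observations per person (illustrative name)
    positiverate = n_pos / n_obs   # same value as mean(outcome) for 0/1 data
  )
```

For the sample data, positiverate comes out as 0.5, 1/3 and 1, the same as the mean-based version. If outcome can contain NA, both versions return NA for that person unless missing values are handled explicitly, for example with mean(outcome, na.rm = TRUE).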

CodePudding user response:

Base R:

# df is your data frame (named df1 in the answer above)
aggregate(. ~ person, df, mean)
# or, if you prefer positiverate as the column name
aggregate(cbind(positiverate = outcome) ~ person, df, mean)

data.table for faster data manipulation:

library(data.table)
setDT(df)[, .(positiverate = mean(outcome)), by = person]

CodePudding user response:

Try tapply():

tapply(X = df$outcome, INDEX = df$person, FUN=mean)
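
tapply() returns a named vector rather than a data frame. If the two-column layout from the question is wanted, one possible way to reshape it is sketched below. It assumes df holds the sample data from the question; rates and result are illustrative names.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  person  = c("a", "a", "a", "a", "b", "b", "b", "c", "c"),
  outcome = c(1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L)
)

# Per-person means; the names of the result are the person labels
rates <- tapply(X = df$outcome, INDEX = df$person, FUN = mean)

# Reshape into the person / positiverate layout, rounded to two decimals
result <- data.frame(
  person       = names(rates),
  positiverate = round(as.numeric(rates), 2)
)
```

Rounding is for display only; drop round() if the rates are used in further calculations.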