Calculate positivity rate (positives/total observations)

I am attempting to calculate the positivity rate per person, i.e. (number of 1s per person / total number of observations per person). My data set looks similar to this:

person	outcome
a	1
a	1
a	0
a	0
b	1
b	0
b	0
c	1
c	1

I am hoping to return something that looks like this:

person	positiverate
a	0.50
b	0.33
c	1.00

I feel like this should be fairly simple code, but I have not been able to figure it out so far.

CodePudding user response：

Because outcome is coded 0/1, its mean within each person is the positivity rate, so we can use a grouped mean:

library(dplyr)
df1 %>% 
    group_by(person) %>% 
    summarise(positiverate = mean(outcome))

-output

# A tibble: 3 × 2
  person positiverate
  <chr>         <dbl>
1 a             0.5  
2 b             0.333
3 c             1

data

df1 <- structure(list(person = c("a", "a", "a", "a", "b", "b", "b", 
"c", "c"), outcome = c(1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L)), 
class = "data.frame", row.names = c(NA, 
-9L))
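
As a quick check of why the grouped mean gives the positivity rate, here is a minimal base R sketch that computes positives divided by observations per person and compares it with the mean. The data frame is redefined so the block runs on its own; the variable names and the rounding to two decimals are assumptions based on the desired output in the question, not part of the answer above.

```r
# Example data from the question, redefined so the block is self-contained
df1 <- data.frame(
  person  = c("a", "a", "a", "a", "b", "b", "b", "c", "c"),
  outcome = c(1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Split the 0/1 outcomes into one vector per person
by_person <- split(df1$outcome, df1$person)

# Positives per person divided by observations per person
positives     <- sapply(by_person, sum)
totals        <- sapply(by_person, length)
rate_by_count <- positives / totals

# The same quantity computed as a per-person mean
rate_by_mean <- sapply(by_person, mean)

# Both approaches agree for 0/1 data
stopifnot(isTRUE(all.equal(rate_by_count, rate_by_mean)))

# Round to two decimals (assumption: matches the display in the question)
round(rate_by_mean, 2)
```

Note that this equivalence relies on outcome containing only 0 and 1; with other codings, the mean would no longer be a rate.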

CodePudding user response：

Base R:

aggregate(. ~ person, df1, mean)
# or, if you prefer positiverate as the column name
aggregate(cbind(positiverate = outcome) ~ person, df1, mean)

data.table, for faster data manipulation:

library(data.table)
# setDT() converts df1 to a data.table by reference
setDT(df1)[, .(positiverate = mean(outcome)), by = person]

CodePudding user response：

Try tapply(), which returns a named array with one rate per person rather than a data frame:

tapply(X = df1$outcome, INDEX = df1$person, FUN = mean)
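
If a two-column data frame like the one in the question is wanted, the tapply() result can be reshaped. The following is a minimal sketch, assuming the question's example data; the name `result` is chosen here for illustration only.

```r
# Example data from the question, redefined so the block is self-contained
df1 <- data.frame(
  person  = c("a", "a", "a", "a", "b", "b", "b", "c", "c"),
  outcome = c(1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Per-person mean of the 0/1 outcome; names of the result are the person labels
rates <- tapply(X = df1$outcome, INDEX = df1$person, FUN = mean)

# Reshape into a data frame with the column names used in the question
result <- data.frame(
  person           = names(rates),
  positiverate     = as.vector(rates),
  row.names        = NULL,
  stringsAsFactors = FALSE
)

result
```

The as.vector() call drops the names and dimensions so that positiverate is a plain numeric column.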