Calculating Proportion in R with Loop?

I have a similar dataset to this:

> dput(df)
structure(list(Surgeon = c("John Smith", "John Smith", "John Smith", 
"John Smith", "John Smith", "John Smith", "John Smith", "Martin Harris", 
"Martin Harris", "Martin Harris", "Kyle Short"), Blood.Order = c("ABC", 
"ABC", "DEF", "ABC", "IJK", "ABC", "DEF", "IJK", "ABC", "ABC", 
"DEF"), Status = c("Returned", "Wasted", "Returned", "Returned", 
"Wasted", "Wasted", "Wasted", "Returned", "Wasted", "Returned", 
"Wasted")), class = "data.frame", row.names = c(NA, -11L))

I want to calculate how much blood (Blood.Order) each surgeon wasted as a function of how many surgeries they performed.

For example, we see that John Smith performed 7 surgeries. Out of these 7 surgeries, he wasted blood 4 times. So this calculation should be 4/7=0.5714286.

I want to create a loop that does this for each surgeon (find out how much blood each surgeon wasted per how many surgeries total they performed).

A bar graph showing how much blood each surgeon wasted would be helpful, to see which surgeon(s) waste the most blood.

Thanks!

CodePudding user response:

We can do this without a loop: group by 'Surgeon' and take the mean of the logical vector (Status == "Wasted"). Because TRUE counts as 1 and FALSE as 0, that mean is the proportion of wasted orders.

library(dplyr)
out <- df %>% 
   group_by(Surgeon) %>% 
   summarise(Prop = mean(Status == "Wasted"))

-output

out
# A tibble: 3 × 2
  Surgeon        Prop
  <chr>         <dbl>
1 John Smith    0.571
2 Kyle Short    1    
3 Martin Harris 0.333
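
Since the question asked for a loop, here is a minimal base R sketch of the same calculation written out explicitly, for comparison. The names `surgeons` and `prop_loop` are made up for illustration, and the data frame is rebuilt from the question's dput() output without the Blood.Order column, which this calculation does not use.

```r
# Rebuild the example data from the question (Blood.Order left out on purpose)
df <- data.frame(
  Surgeon = c(rep("John Smith", 7), rep("Martin Harris", 3), "Kyle Short"),
  Status  = c("Returned", "Wasted", "Returned", "Returned", "Wasted",
              "Wasted", "Wasted", "Returned", "Wasted", "Returned", "Wasted")
)

# One slot per surgeon, named so results can be stored by surgeon name
surgeons  <- unique(df$Surgeon)
prop_loop <- setNames(numeric(length(surgeons)), surgeons)

for (s in surgeons) {
  wasted <- df$Status[df$Surgeon == s] == "Wasted"  # logical vector for this surgeon
  prop_loop[s] <- sum(wasted) / length(wasted)       # identical to mean(wasted)
}

prop_loop  # John Smith 4/7, Martin Harris 1/3, Kyle Short 1

# The same grouped mean without a loop, in base R
prop_tapply <- tapply(df$Status == "Wasted", df$Surgeon, mean)
```

The loop and the grouped mean give the same numbers; the grouped version is shorter and avoids managing the result vector by hand.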

If we need a bar plot

library(ggplot2)
ggplot(out, aes(x = Surgeon, y = Prop)) + geom_col()

Or using base R

barplot(proportions(table(df[-2]), 1)[,2])
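
The base R one-liner packs several steps together, so it may help to see them separately. This is only a sketch: the intermediate names `tab`, `row_props`, and `wasted` are made up for illustration, and the "Wasted" column is selected by name rather than by position. Note that proportions() needs R 4.0.0 or later; prop.table() does the same job in older versions.

```r
# Example data from the question
df <- data.frame(
  Surgeon = c(rep("John Smith", 7), rep("Martin Harris", 3), "Kyle Short"),
  Blood.Order = c("ABC", "ABC", "DEF", "ABC", "IJK", "ABC", "DEF",
                  "IJK", "ABC", "ABC", "DEF"),
  Status = c("Returned", "Wasted", "Returned", "Returned", "Wasted",
             "Wasted", "Wasted", "Returned", "Wasted", "Returned", "Wasted")
)

# Step 1: drop Blood.Order (column 2) and cross-tabulate Surgeon by Status
tab <- table(df[-2])

# Step 2: convert counts to row-wise proportions (margin 1 = rows),
# so each surgeon's row sums to 1
row_props <- proportions(tab, 1)

# Step 3: keep the "Wasted" column, giving one proportion per surgeon
wasted <- row_props[, "Wasted"]

# Step 4: plot it
barplot(wasted, ylab = "Proportion wasted")
```

Selecting the column by name avoids relying on "Wasted" sorting after "Returned" in the table.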