Compute Proportion Across Multiple Factors

I have a dataframe that looks like this:

structure(list(Cell.Class = c("Excitatory Neurons", "Inhibitory", 
"OPCs", "Medium Spiny Neurons", "Excitatory Neurons", "Inhibitory", 
"OPCs", "Medium Spiny Neurons", "Excitatory Neurons", "Inhibitory", 
"OPCs", "Medium Spiny Neurons"), Mean.Enrichment = c(2.8, 3, 
0.4, 0.42, 18, 2.1, 0.8, 2.8, 0.3, 1, 0, 0), Disorder = c("Bipolar Disorder", 
"Bipolar Disorder", "Bipolar Disorder", "Bipolar Disorder", "Schizophrenia", 
"Schizophrenia", "Schizophrenia", "Schizophrenia", "Major Depression", 
"Major Depression", "Major Depression", "Major Depression")), class = "data.frame", row.names = c(NA, 
-12L))

> enrichment.means
             Cell.Class Mean.Enrichment         Disorder
1    Excitatory Neurons            2.80 Bipolar Disorder
2            Inhibitory            3.00 Bipolar Disorder
3                  OPCs            0.40 Bipolar Disorder
4  Medium Spiny Neurons            0.42 Bipolar Disorder
5    Excitatory Neurons           18.00    Schizophrenia
6            Inhibitory            2.10    Schizophrenia
7                  OPCs            0.80    Schizophrenia
8  Medium Spiny Neurons            2.80    Schizophrenia
9    Excitatory Neurons            0.30 Major Depression
10           Inhibitory            1.00 Major Depression
11                 OPCs            0.00 Major Depression
12 Medium Spiny Neurons            0.00 Major Depression

I want to calculate the proportion of Mean.Enrichment that each Cell.Class accounts for within each Disorder. I tried the following:

  enrichment.means %>%
    group_by(Disorder) %>%
    mutate(Sum = sum(Mean.Enrichment)) %>%
    group_by(Disorder, .add = TRUE) %>%
    summarise(Proportion = Mean.Enrichment / Sum)

And this is the result:

   Disorder         Proportion
   <chr>                 <dbl>
 1 Bipolar Disorder     0.423 
 2 Bipolar Disorder     0.453 
 3 Bipolar Disorder     0.0604
 4 Bipolar Disorder     0.0634
 5 Major Depression     0.231 
 6 Major Depression     0.769 
 7 Major Depression     0     
 8 Major Depression     0     
 9 Schizophrenia        0.759 
10 Schizophrenia        0.0886
11 Schizophrenia        0.0338
12 Schizophrenia        0.118

But the result does not show which Cell.Class each proportion belongs to. How can I keep that column?

CodePudding user response:

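If I understand correctly, this may be what you want. summarise() keeps only the grouping columns and the columns it creates, so Cell.Class is dropped; mutate() adds the new columns and keeps the existing ones: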

enrichment.means %>%
  group_by(Disorder) %>%
  mutate(Sum = sum(Mean.Enrichment),
         Proportion = Mean.Enrichment / Sum)

If possible, please add the expected output to your question.
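
For comparison, the same per-group proportion can be computed without dplyr. This is a minimal base R sketch, assuming the data frame from the question (rebuilt here so the snippet runs on its own). ave() applies a function within each Disorder group and returns a vector aligned with the original rows, so Cell.Class stays attached. The group_totals check is only an illustration added here, not part of the original answers.

```r
# Data frame from the question, rebuilt so the example is self-contained
enrichment.means <- data.frame(
  Cell.Class = rep(c("Excitatory Neurons", "Inhibitory",
                     "OPCs", "Medium Spiny Neurons"), times = 3),
  Mean.Enrichment = c(2.8, 3, 0.4, 0.42, 18, 2.1, 0.8, 2.8, 0.3, 1, 0, 0),
  Disorder = rep(c("Bipolar Disorder", "Schizophrenia",
                   "Major Depression"), each = 4)
)

# Divide each value by the total of its Disorder group;
# the result has one entry per original row, in the original order
enrichment.means$Proportion <- ave(
  enrichment.means$Mean.Enrichment,
  enrichment.means$Disorder,
  FUN = function(x) x / sum(x)
)

# Sanity check: proportions within each Disorder should add up to 1
group_totals <- tapply(enrichment.means$Proportion,
                       enrichment.means$Disorder, sum)
print(enrichment.means)
print(group_totals)
```

Note that a Disorder whose Mean.Enrichment values are all zero would give NaN proportions (0 / 0), so such groups may need separate handling.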

CodePudding user response:

My solution: use across(everything()) inside summarise() to explicitly keep the other columns.

     enrichment.means %>%
     group_by(Disorder) %>%
     mutate(Sum = sum(Mean.Enrichment)) %>%
     summarise(across(everything()), Proportion = Mean.Enrichment / Sum)

   Disorder         Cell.Class           Mean.Enrichment   Sum Proportion
   <chr>            <chr>                          <dbl> <dbl>      <dbl>
 1 Bipolar Disorder Excitatory Neurons              2.8   6.62     0.423 
 2 Bipolar Disorder Inhibitory                      3     6.62     0.453 
 3 Bipolar Disorder OPCs                            0.4   6.62     0.0604
 4 Bipolar Disorder Medium Spiny Neurons            0.42  6.62     0.0634
 5 Major Depression Excitatory Neurons              0.3   1.3      0.231 
 6 Major Depression Inhibitory                      1     1.3      0.769 
 7 Major Depression OPCs                            0     1.3      0     
 8 Major Depression Medium Spiny Neurons            0     1.3      0     
 9 Schizophrenia    Excitatory Neurons             18    23.7      0.759 
10 Schizophrenia    Inhibitory                      2.1  23.7      0.0886
11 Schizophrenia    OPCs                            0.8  23.7      0.0338
12 Schizophrenia    Medium Spiny Neurons            2.8  23.7      0.118
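
A note for newer dplyr versions: as far as I know, from dplyr 1.1.0 onward both summarise() returning more than one row per group and across() without a function are deprecated and emit warnings. The sketch below shows what seems to be the closest equivalent, using reframe() and pick(). It assumes dplyr >= 1.1.0 is installed and rebuilds the data frame from the question; the name `result` is only for illustration.

```r
library(dplyr)  # assumption: dplyr >= 1.1.0, which provides reframe() and pick()

# Data frame from the question, rebuilt so the example is self-contained
enrichment.means <- data.frame(
  Cell.Class = rep(c("Excitatory Neurons", "Inhibitory",
                     "OPCs", "Medium Spiny Neurons"), times = 3),
  Mean.Enrichment = c(2.8, 3, 0.4, 0.42, 18, 2.1, 0.8, 2.8, 0.3, 1, 0, 0),
  Disorder = rep(c("Bipolar Disorder", "Schizophrenia",
                   "Major Depression"), each = 4)
)

result <- enrichment.means %>%
  group_by(Disorder) %>%
  mutate(Sum = sum(Mean.Enrichment)) %>%
  # pick(everything()) keeps the non-grouping columns; reframe() allows
  # several rows per group and returns an ungrouped data frame
  reframe(pick(everything()), Proportion = Mean.Enrichment / Sum)

print(result)
```

If only the proportion is needed, the intermediate Sum column can be skipped by writing `mutate(Proportion = Mean.Enrichment / sum(Mean.Enrichment))` after `group_by(Disorder)`.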