Suppose we start with the data frame below, generated by the code immediately beneath it:
> data
  To A B C
1  A 1 3 5
2  B 2 4 6
3  C 4 5 7
data <-
  data.frame(
    To = c("A","B","C"),
    A = c(1,2,4),
    B = c(3,4,5),
    C = c(5,6,7)
  )
Now we add column and row totals, which gives the revised data frame below, generated by the code shown immediately beneath it:
> data
   To A  B  C Sum
1   A 1  3  5   9
2   B 2  4  6  12
3   C 4  5  7  16
4 Sum 7 12 18  37
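library(tidyverse)  # needed here for %>%, bind_rows() and summarise_all()
data <- data %>%
  replace(is.na(.), 0) %>%  # treat missing values as 0
  bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))  # append the column-totals row
data <- cbind(data, Sum = rowSums(data[,-1]))  # append the row-totals column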
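For comparison, the same totals can be built without any packages. This is only a minimal base R sketch, not the code used above; the names `raw`, `num_cols`, `totals_row` and `with_totals` are made up for illustration.

```r
# Base R sketch: rebuild the example data and add both totals.
raw <- data.frame(
  To = c("A", "B", "C"),
  A  = c(1, 2, 4),
  B  = c(3, 4, 5),
  C  = c(5, 6, 7)
)

# Names of the numeric columns (everything except the label column "To").
num_cols <- names(raw)[sapply(raw, is.numeric)]

# One-row data frame holding the column sums, labelled "Sum".
totals_row <- data.frame(To = "Sum", t(colSums(raw[num_cols])))

# Append the totals row first, then compute row totals over the numeric columns.
with_totals <- rbind(raw, totals_row)
with_totals$Sum <- rowSums(with_totals[num_cols])

with_totals
```

Adding the totals row before the totals column means the bottom-right cell (the grand total) falls out of the same `rowSums()` call.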
Finally, we calculate each element of data as a proportion of its column total:
   To         A         B         C       Sum
1   A 0.1428571 0.2500000 0.2777778 0.2432432
2   B 0.2857143 0.3333333 0.3333333 0.3243243
3   C 0.5714286 0.4166667 0.3888889 0.4324324
4 Sum 1.0000000 1.0000000 1.0000000 1.0000000
library(tidyverse)
data %>% mutate(across(-c(To), ~ ./.[To == "Sum"]))
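As a sanity check on the column proportions, here is a small base R sketch that divides each column by its entry in the Sum row using `sweep()`. The matrix `m` simply retypes the totals table printed earlier; `m` and `col_share` are illustrative names, not part of the code above.

```r
# Numeric block of the totals table (values copied from the printed output).
m <- matrix(
  c(1,  3,  5,  9,
    2,  4,  6, 12,
    4,  5,  7, 16,
    7, 12, 18, 37),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "Sum"), c("A", "B", "C", "Sum"))
)

# MARGIN = 2 sweeps over columns: column j is divided by m["Sum", j].
col_share <- sweep(m, 2, m["Sum", ], "/")

round(col_share, 7)
```

Switching `MARGIN` to 1 and the statistic to `m[, "Sum"]` would divide each row by its row total instead, which is the direction the question asks about.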
Question: how can the above code be modified to calculate each element as a proportion of its row total instead of its column total? We would end up with the proportions below (hand-calculated, so please pardon any minor errors, and don't worry about rounding):
   To         A         B         C       Sum
1   A 0.1111111 0.3333333 0.5555555 1.0000000
2   B 0.1666666 0.3333333 0.5000000 1.0000000
3   C 0.2500000 0.3125000 0.4375000 1.0000000
4 Sum 0.5277771 0.9791666 1.4930555 3.0000000
CodePudding user response:
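Try this. (Note that here row 4 is also divided by its own row total, so it sums across to 1 rather than holding the column sums of the proportions.)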
library(tidyverse)
data <-
  data.frame(
    To = c("A","B","C"),
    A = c(1,2,4),
    B = c(3,4,5),
    C = c(5,6,7)
  )
data <- data %>%
  replace(is.na(.), 0) %>%
  bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))
data <- cbind(data, Sum = rowSums(data[,-1]))
data %>%
  rowwise() %>%
  mutate(across(A:Sum, ~ sum(.) / Sum))
#> # A tibble: 4 × 5
#> # Rowwise:
#>   To        A     B     C   Sum
#>   <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 A     0.111 0.333 0.556     1
#> 2 B     0.167 0.333 0.5       1
#> 3 C     0.25  0.312 0.438     1
#> 4 Sum   0.189 0.324 0.486     1
Created on 2022-05-04 by the reprex package (v2.0.1)
CodePudding user response:
A possible solution, starting from the original three-row data (before the totals row and column are added):
library(dplyr)
data %>%
  mutate(Sum = rowSums(across(A:C))) %>%
  mutate(across(A:Sum, ~ .x / Sum)) %>%
  bind_rows(data.frame(To = "Sum", t(colSums(.[-1]))))
#>    To         A         B         C Sum
#> 1   A 0.1111111 0.3333333 0.5555556   1
#> 2   B 0.1666667 0.3333333 0.5000000   1
#> 3   C 0.2500000 0.3125000 0.4375000   1
#> 4 Sum 0.5277778 0.9791667 1.4930556   3
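The same row-wise result can also be sketched in base R with `prop.table()`, again assuming the original three-row data as the starting point. The names `raw`, `m` and `row_share` are hypothetical and only used for this illustration.

```r
# Original three-row data, before any totals are added.
raw <- data.frame(
  To = c("A", "B", "C"),
  A  = c(1, 2, 4),
  B  = c(3, 4, 5),
  C  = c(5, 6, 7)
)

# Work on the numeric part as a matrix, keeping the labels as row names.
m <- as.matrix(raw[, c("A", "B", "C")])
rownames(m) <- raw$To

# margin = 1 divides every entry by its row total.
row_share <- prop.table(m, margin = 1)

# Add the row totals (all 1 by construction), then the column sums of the proportions.
row_share <- cbind(row_share, Sum = rowSums(row_share))
row_share <- rbind(row_share, Sum = colSums(row_share))

round(row_share, 7)
```

Computing the proportions first and the totals afterwards is what makes the last row hold the column sums of the proportions (about 0.53, 0.98, 1.49 and 3) instead of being rescaled to 1.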