Here is a sample of my data:
M <- read.table(text = " group value blue
B 12 Y
C 14 Y
A 12 Y
B 12 N
C 10 Y
A 7 Y
B 6 Y
", header=TRUE)
I want to sum value for each group, using group_by(group) or aggregate. Then, within each group, I want to sum only the values where blue is "Y" and express that as a percentage P of the group total. For example, both rows for A are "Y", so A's total is 19 and P = 19/19*100 = 100. For B, the "Y" rows sum to 18 out of a total of 30, so P = 18/30*100 = 60. Here is the outcome I expect:
group value P
A 19 100
B 30 60
C 24 100
CodePudding user response:
You could do:
library(tidyverse)
M %>%
  group_by(group) %>%
  summarize(P = 100 * sum(value[blue == "Y"]) / sum(value),
            value = sum(value)) %>%
  select(1, 3, 2)
#> # A tibble: 3 x 3
#> group value P
#> <chr> <int> <dbl>
#> 1 A 19 100
#> 2 B 30 60
#> 3 C 24 100
Created on 2023-01-01 with reprex v2.0.2
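Since the question also mentions aggregate, here is a rough base R sketch of the same calculation. The names total, yes, value_y and res are only illustrative, and the all.x = TRUE merge plus the NA-to-0 step is an assumption about how a group with no "Y" rows should be treated (P = 0).

```r
# Sample data from the question
M <- read.table(text = "group value blue
B 12 Y
C 14 Y
A 12 Y
B 12 N
C 10 Y
A 7 Y
B 6 Y
", header = TRUE)

# Sum of value per group
total <- aggregate(value ~ group, data = M, FUN = sum)

# Sum of value per group, restricted to rows where blue is "Y"
yes <- aggregate(value ~ group, data = M[M$blue == "Y", ], FUN = sum)
names(yes)[2] <- "value_y"

# Keep every group, even one without any "Y" row (assumed to give P = 0)
res <- merge(total, yes, by = "group", all.x = TRUE)
res$value_y[is.na(res$value_y)] <- 0

# Percentage of each group's total that comes from "Y" rows
res$P <- 100 * res$value_y / res$value
res <- res[, c("group", "value", "P")]
res
```

This avoids any package dependency, at the cost of two aggregate calls and a merge.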
CodePudding user response:
A dplyr solution:
library(dplyr)
M %>%
  count(group, blue, wt = value) %>%
  group_by(group) %>%
  # sum() keeps one row per group even if a group has no 'Y' rows
  summarise(N = sum(n), P = sum(n[blue == 'Y']) / N * 100)
# A tibble: 3 × 3
group N P
<chr> <int> <dbl>
1 A 19 100
2 B 30 60
3 C 24 100
CodePudding user response:
A 'data.table' solution, assuming there are no NAs in value. If there are, add na.rm = TRUE to the sum() calls.
library(data.table)
setDT(M)[, .(value = sum(value), P = 100 * sum(value[blue == "Y"]) / sum(value) ), keyby = .(group)]
# group value P
# 1: A 19 100
# 2: B 30 60
# 3: C 24 100
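For the NA case mentioned above, the na.rm = TRUE variant might look roughly like this. M2 and res2 are hypothetical names, and the NA in M2 is made up purely to illustrate; it is not part of the question's data.

```r
library(data.table)

# Hypothetical variant of the sample data with one missing value (B, blue == "N")
M2 <- data.table(
  group = c("B", "C", "A", "B", "C", "A", "B"),
  value = c(12, 14, 12, NA, 10, 7, 6),
  blue  = c("Y", "Y", "Y", "N", "Y", "Y", "Y")
)

# Same aggregation as above, but every sum() ignores missing values
res2 <- M2[, .(value = sum(value, na.rm = TRUE),
               P = 100 * sum(value[blue == "Y"], na.rm = TRUE) /
                 sum(value, na.rm = TRUE)),
           keyby = .(group)]
res2
```

Note that dropping NAs changes the denominator, so P for a group can shift compared with the complete data; whether that is acceptable depends on what a missing value means in your data.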