I have a long data frame that looks like this:
Product | Price | Decision | Sum |
---|---|---|---|
Food | 1 | yes | 39 |
Food | 1 | no | 234 |
Food | 2 | yes | 1312 |
Food | 2 | no | 3123 |
Clothes | 1 | yes | 323 |
Clothes | 1 | no | 232 |
Clothes | 3 | yes | 3 |
Clothes | 3 | no | 434 |
I want code that creates a new data frame, grouped by Product and Price, that calculates the share of Sum where Decision is yes:
(Sum for Decision = yes) / ((Sum for Decision = no) + (Sum for Decision = yes))
So for example, for Food with the Price 1:
39 / (234 + 39) = 0.1428571
In the real data set I have 6 different products, and each has prices from 0 to 99.
The new data frame should look like this:
Product | Price | Decision |
---|---|---|
Food | 1 | 0.1428571 |
Food | 2 | 0.295829 |
Clothes | 1 | 0.581982 |
Clothes | 3 | 0.006865 |
CodePudding user response:
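Here's a way using proportions():
library(dplyr)

df %>%
  group_by(Product, Price) %>%
  mutate(prop = proportions(Sum)) %>%  # each row's share of Sum within its Product/Price group
  filter(Decision == "yes")            # keep only the "yes" share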
  Product Price Decision   Sum    prop
  <chr>   <int> <chr>    <int>   <dbl>
1 Food        1 yes         39 0.143
2 Food        2 yes       1312 0.296
3 Clothes     1 yes        323 0.582
4 Clothes     3 yes          3 0.00686
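The output above still carries the original Decision and Sum columns. A minimal sketch of trimming it to the three-column shape asked for in the question, assuming the example data is rebuilt by hand as `df` (the name `result` is only illustrative):

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(
  Product  = c("Food", "Food", "Food", "Food",
               "Clothes", "Clothes", "Clothes", "Clothes"),
  Price    = c(1, 1, 2, 2, 1, 1, 3, 3),
  Decision = c("yes", "no", "yes", "no", "yes", "no", "yes", "no"),
  Sum      = c(39, 234, 1312, 3123, 323, 232, 3, 434)
)

result <- df %>%
  group_by(Product, Price) %>%
  mutate(prop = proportions(Sum)) %>%  # share of each row within its group
  ungroup() %>%
  filter(Decision == "yes") %>%        # keep the "yes" share only
  select(Product, Price, prop) %>%     # drop the old Decision and Sum columns
  rename(Decision = prop)              # name the ratio column as in the question

result
```

Note that `filter()` keeps only groups that actually contain a "yes" row, so a Product/Price combination with only "no" rows would not appear in `result`.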
CodePudding user response:
Does this work?
library(dplyr)

df %>%
  group_by(Product, Price) %>%
  summarise(Decision = Sum[Decision == 'yes'] / sum(Sum))
`summarise()` has grouped output by 'Product'. You can override using the `.groups` argument.
# A tibble: 4 × 3
# Groups:   Product [2]
  Product Price Decision
  <chr>   <dbl>    <dbl>
1 Clothes     1  0.582
2 Clothes     3  0.00686
3 Food        1  0.143
4 Food        2  0.296
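With 6 products and prices from 0 to 99, some Product/Price groups might have no "yes" row at all. In that case `Sum[Decision == 'yes']` is an empty vector, and the group may be dropped or trigger a warning depending on the dplyr version. A hedged sketch of a variant that returns 0 for such groups instead; the "Toys" row is a made-up example, not part of the question's data, and `shares` is an illustrative name:

```r
library(dplyr)

# Data from the question plus one hypothetical group with no "yes" row
df <- data.frame(
  Product  = c("Food", "Food", "Food", "Food",
               "Clothes", "Clothes", "Clothes", "Clothes", "Toys"),
  Price    = c(1, 1, 2, 2, 1, 1, 3, 3, 5),
  Decision = c("yes", "no", "yes", "no", "yes", "no", "yes", "no", "no"),
  Sum      = c(39, 234, 1312, 3123, 323, 232, 3, 434, 10)
)

shares <- df %>%
  group_by(Product, Price) %>%
  summarise(
    # sum() over an empty selection is 0, so every group yields exactly one value
    Decision = sum(Sum[Decision == "yes"]) / sum(Sum),
    .groups = "drop"  # return an ungrouped tibble and silence the grouping message
  )

shares
```

Summing the "yes" rows also covers the case where a group has more than one "yes" row. A group whose Sum values are all zero would still give NaN, since that is 0 / 0.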