I'm trying to calculate a sum of multinomial probabilities. Think of this as an election: suppose there are 11 voters and three candidates. Candidate A has a 0.5 probability of being chosen, B has 0.3 and C has 0.2. I am interested in calculating the probability that A wins.
So I am taking all the possible scenarios in which A wins under plurality vote (the option with the most votes wins), calculating the probability of each one, and then summing these values.
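As a rough sketch of the enumerate-and-sum approach described above, in base R only. The names p, n, outcomes, wins and p_a_wins are illustrative and not from the original post, and ties for first place are assumed not to count as a win for A:

```r
# Assumed inputs taken from the question: choice probabilities and voter count
p <- c(A = 0.5, B = 0.3, C = 0.2)
n <- 11

# Enumerate every way to split n votes among the three candidates
outcomes <- expand.grid(V1 = 0:n, V2 = 0:n)
outcomes$V3 <- n - outcomes$V1 - outcomes$V2
outcomes <- outcomes[outcomes$V3 >= 0, ]

# Keep the outcomes where A has strictly more votes than B and C
wins <- outcomes[outcomes$V1 > outcomes$V2 & outcomes$V1 > outcomes$V3, ]

# Multinomial probability of each winning outcome, one row at a time
wins$prob <- apply(wins[, c("V1", "V2", "V3")], 1, dmultinom, prob = p)

# Probability that A wins = sum over all winning outcomes
p_a_wins <- sum(wins$prob)
```

With 11 voters this gives 24 winning outcomes, and p_a_wins holds their total probability.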
My problem comes when I try to calculate the multinomial probability of each individual scenario in which A wins.
I have a data frame with all the possible outcomes in which A wins, and ideally I want a new column showing the probability of each outcome. Something like this:
V1 V2 V3 dmultinom(x = c(5, 3, 3), prob = options)
1 5 2 4 0.06237
2 5 3 3 0.06237
3 5 4 2 0.06237
4 6 0 5 0.06237
5 6 1 4 0.06237
6 6 2 3 0.06237
7 6 3 2 0.06237
8 6 4 1 0.06237
9 6 5 0 0.06237
10 7 0 4 0.06237
11 7 1 3 0.06237
12 7 2 2 0.06237
13 7 3 1 0.06237
14 7 4 0 0.06237
15 8 0 3 0.06237
16 8 1 2 0.06237
17 8 2 1 0.06237
18 8 3 0 0.06237
19 9 0 2 0.06237
20 9 1 1 0.06237
21 9 2 0 0.06237
22 10 0 1 0.06237
23 10 1 0 0.06237
24 11 0 0 0.06237
But with the right values, of course.
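For reference, here is a worked instance of the multinomial probability that dmultinom evaluates, using the numbers from the question. It would also explain why every row above shows 0.06237: the column header contains the fixed vector c(5, 3, 3), so the same outcome appears to have been evaluated for each row.

```latex
\[
P(V_1 = 5,\, V_2 = 3,\, V_3 = 3)
  = \frac{11!}{5!\,3!\,3!}\,(0.5)^{5}\,(0.3)^{3}\,(0.2)^{3}
  = 9240 \times 6.75 \times 10^{-6}
  = 0.06237
\]
```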
I tried to access the values of the rows using $, but with no success. I also tried to create a new column holding each row's values as a vector using dplyr, but couldn't do that either.
CodePudding user response:
Here's an option with data.table:
library(data.table)
# create an example data.frame with the first four outcomes
xx <- data.frame(V1 = c(5, 5, 5, 6),
                 V2 = c(2, 3, 4, 0),
                 V3 = c(4, 3, 2, 5))
# convert the data.frame to a data.table (by reference)
setDT(xx)
# put the data in long format: all V1 values, then V2, then V3
xx <- data.table::melt(xx,
                       measure.vars = names(xx))
# make a grouping variable that tags each original row (4 rows x 3 columns)
xx[, group := rep(1:4, 3)]
# apply dmultinom to the three counts in each group
xx[, probability := dmultinom(value, prob = c(0.5, 0.3, 0.2)), by = "group"]
# pivot data back to wider format; rows come back sorted by probability,
# and this step assumes no two outcomes share the same probability
yy <- data.table::dcast(xx[, !c("group")],
                        probability ~ variable,
                        value.var = "value")
> yy
   probability V1 V2 V3
1:  0.00231000  6  0  5
2:  0.03118500  5  2  4
3:  0.06237000  5  3  3
4:  0.07016625  5  4  2
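A shorter variant of the same idea, offered only as a sketch: grouping by row index lets dmultinom see one row at a time, so no reshaping is needed and the original row order is kept. The vector name p is illustrative, and the columns are assumed to be named V1, V2 and V3 as above:

```r
library(data.table)

# Same four example outcomes as above
xx <- data.table(V1 = c(5, 5, 5, 6),
                 V2 = c(2, 3, 4, 0),
                 V3 = c(4, 3, 2, 5))

# Assumed probability vector from the question (A, B, C)
p <- c(0.5, 0.3, 0.2)

# Each group is a single row, so c(V1, V2, V3) is that row's vote counts
xx[, probability := dmultinom(c(V1, V2, V3), prob = p), by = seq_len(nrow(xx))]
```

Summing the new column with sum(xx$probability) then gives the total probability of the listed outcomes.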
CodePudding user response:
Not the prettiest solution, but it can be accomplished with a for loop.
First, creating the empty column:
dat$multinom <- 0
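Next, iterate through the data frame, filling in the multinomial probability computed from V1, V2 and V3 in each row:
# `options` is assumed to hold the probability vector from the question, e.g. c(0.5, 0.3, 0.2)
for (i in seq_len(nrow(dat))) {
  dat$multinom[i] <- dmultinom(x = c(dat$V1[i], dat$V2[i], dat$V3[i]), prob = options)
}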
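The same column can also be filled without an explicit loop. Here is a minimal sketch using mapply, with a small stand-in for dat and a probability vector named p; both are assumptions for illustration, since the original post does not show how dat or options were created:

```r
# Stand-in data: the first four winning outcomes from the question
dat <- data.frame(V1 = c(5, 5, 5, 6),
                  V2 = c(2, 3, 4, 0),
                  V3 = c(4, 3, 2, 5))

# Assumed probability vector (A, B, C) from the question
p <- c(0.5, 0.3, 0.2)

# mapply walks the three columns in parallel, one row's counts per call
dat$multinom <- mapply(
  function(v1, v2, v3) dmultinom(x = c(v1, v2, v3), prob = p),
  dat$V1, dat$V2, dat$V3
)

# Total probability of the listed outcomes
total <- sum(dat$multinom)
```

Using a name such as p instead of options also avoids shadowing the base R function options().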