How to add multiple independent probabilities to determine the overall probability of a single outpu

Time:09-06

I apologize in advance for any confusing explanations, but I will try to be as clear as possible.

If there are multiple indicators that predict an outcome with a known accuracy, and they are all attempting to predict the same result, how do you properly add the probabilities?

For example, if John and David are taking a test, and historically John answers 80% of questions correctly, and David answers 75% of questions correctly, and both John and David select the same answer on a question, what is the probability that they are correct? Let's assume that John and David are completely independent of each other and that all questions are equally difficult.

I would think that the probability that they are correct is higher than 80%, so I don't think averaging makes sense.

CodePudding user response:

Thanks to Robert, who commented on this question, I was able to figure out that what I was looking for is a well-known problem solved by Bayes' Theorem, which is used to re-evaluate existing probabilities given new information. I won't go further into the intuition behind it, but 3Blue1Brown has a very good video on the topic.

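Applied to this problem, Bayes' Theorem gives: P(both correct | both agree) = (P(A)*P(B)) / (P(A)*P(B) + P(!A)*P(!B))

Where: P(A) is the probability that the first expert is right, P(!A) is 1 - P(A), P(B) is the probability that the second expert is right, and P(!B) is 1 - P(B)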

Using this equation in the scenario from the question: if John has an 80% chance of being right and David has a 75% chance of being right, and both agree, then the chance that they are both correct is (0.8*0.75) / (0.8*0.75 + 0.2*0.25) = 0.6 / 0.65, or about 92.3%.
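The arithmetic above can also be written as a small Python function. This is only a minimal sketch: the name `agreement_accuracy` is made up for illustration, and it assumes the two experts are independent and that there are only two possible answers, so two wrong experts necessarily pick the same answer.

```python
def agreement_accuracy(p1: float, p2: float) -> float:
    """Chance that two independent experts are right, given that they agree.

    Assumes a two-answer question, so "both wrong" also counts as agreement.
    """
    both_right = p1 * p2
    both_wrong = (1 - p1) * (1 - p2)
    # Agreement happens only when both are right or both are wrong.
    return both_right / (both_right + both_wrong)


print(round(agreement_accuracy(0.8, 0.75), 3))  # prints 0.923
```

Note that two experts who are each no better than a coin flip (0.5 and 0.5) give 0.5 here, so agreement only adds confidence when the experts are better than chance.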

To check this, I wrote a simple Python script that simulates this exact scenario TRIALS times and prints the result. In this code, two "experts" each have a set probability of being right, and their accuracy is tracked individually and together.

import random

TRIALS = 1000000

exp1_correct = 0
exp2_correct = 0
combined_correct = 0
consensus_count = 0

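for i in range(TRIALS):
    # Each expert is right with their own fixed probability.
    expert1 = random.random() < 0.8
    expert2 = random.random() < 0.75

    if expert1 and expert2:
        combined_correct += 1
    if expert1:
        exp1_correct += 1
    if expert2:
        exp2_correct += 1
    # The experts agree when both are right or both are wrong.
    if expert1 == expert2:
        consensus_count += 1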

print(f'Expert 1 had an accuracy of {exp1_correct / TRIALS}')
print(f'Expert 2 had an accuracy of {exp2_correct / TRIALS}')
print(f'Consensus had an accuracy of {combined_correct / consensus_count}')

Running this verifies that the equation above is correct. Hopefully this is helpful to someone that has the same question that I did!
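The question asks about multiple indicators, while the answer above covers two. Under the same assumptions (independent experts, only two possible answers, and every expert agreeing), the same reasoning extends to any number of experts: divide the chance that all are right by the chance that all are right or all are wrong. The sketch below is my own illustration rather than part of the original answer, and the name `unanimous_accuracy` is hypothetical.

```python
from math import prod  # available in Python 3.8+


def unanimous_accuracy(accuracies):
    """Chance that all experts are right, given that they all agree.

    Assumes independent experts and a two-answer question.
    """
    all_right = prod(accuracies)
    all_wrong = prod(1 - p for p in accuracies)
    return all_right / (all_right + all_wrong)


# Two experts reproduce the John and David case; a third 70% expert is an invented example.
print(round(unanimous_accuracy([0.8, 0.75]), 3))
print(round(unanimous_accuracy([0.8, 0.75, 0.7]), 3))
```

It does not cover the case where some experts disagree, which needs the full Bayes update rather than this shortcut.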
