This is my function:
import scala.util.Random

private def generateOneThousandRandomNumbers(listOfNumbers: List[String] = List.empty): List[String] = {
  if (listOfNumbers.size == 1000) {
    listOfNumbers
  } else {
    // Draw a value in [0, 10000000) and keep it as a string
    val nextNumber: String = Random.nextInt(10000000).toString
    if (listOfNumbers.contains(nextNumber)) {
      println("DUPLICATE NUMBER GENERATED: " + nextNumber)
    }
    // The duplicate is only logged; it is still appended to the list
    generateOneThousandRandomNumbers(listOfNumbers ++ List(nextNumber))
  }
}
And I have ten tests exactly like this:
"areUnique1" in {
  val x = generateOneThousandRandomNumbers()
  x.size shouldBe x.distinct.size
}
So by my calculations, a single test should produce a duplicate in only 1 of 10,000 runs, and ten tests should produce one in only 1 of 1,000 runs. However, I'm seeing duplicates on about 50% of runs and I'm not sure why.
CodePudding user response:
According to the Birthday Paradox, you only need ~23 people in a room before there is a 50% chance of 2 of them sharing a birthday, despite the fact there are 365 different possible birthdays.
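It's the same with your code: you have 10,000,000 different possible values, but collisions become likely once the number of values in your container is on the order of sqrt(10,000,000) ~= 3162. The 50% point is at roughly 1.18 * sqrt(10,000,000) ~= 3723 values.
You're only putting 1000 in your container, so the chance of a collision within a single test is well below 50%: approximately 1 - exp(-1000 * 999 / (2 * 10,000,000)) ~= 4.9%. That is still far higher than the 1/10,000 you estimated, because every pair of values can collide, not just one value against the rest. With ten independent tests, the chance that at least one of them produces a duplicate is about 1 - (1 - 0.049)^10 ~= 39%, which is roughly in line with what you're seeing.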
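To check those numbers yourself, here is a minimal sketch that computes the exact collision probability instead of the exponential approximation. The object name `BirthdayCollision` and the helper `collisionProbability` are made up for illustration; they are not part of your code or of any library. It assumes the values are drawn uniformly and independently, which is what `Random.nextInt` is meant to give you.

```scala
object BirthdayCollision {
  // Probability that `draws` independent uniform samples from `range`
  // possible values contain at least one duplicate.
  // (Hypothetical helper, written for this example only.)
  def collisionProbability(draws: Int, range: Long): Double = {
    // P(all distinct) = product over i = 0 .. draws-1 of (range - i) / range
    val allDistinct = (0 until draws).foldLeft(1.0) { (acc, i) =>
      acc * (range - i).toDouble / range.toDouble
    }
    1.0 - allDistinct
  }

  def main(args: Array[String]): Unit = {
    // One test: 1000 draws from 10,000,000 possible values
    val perTest = collisionProbability(1000, 10000000L)
    // Ten independent tests: chance that at least one sees a duplicate
    val tenTests = 1.0 - math.pow(1.0 - perTest, 10)
    println(f"one test:  ${perTest * 100}%.2f%%")
    println(f"ten tests: ${tenTests * 100}%.2f%%")
  }
}
```

Running `main` should print values close to the estimates above (a little under 5% for one test and a little under 40% for ten). Calling `collisionProbability(23, 365L)` reproduces the classic birthday result of just over 50%.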