I want to create a large array of random numbers drawn from the Gaussian distribution. I found dlarnv, but I am not sure how to use it in Swift. Specifically, the type signature Xcode shows is as follows:
dlarnv_(
__idist: UnsafeMutablePointer<__CLPK_integer>,
__iseed: UnsafeMutablePointer<__CLPK_integer>,
__n: UnsafeMutablePointer<__CLPK_integer>,
__x: UnsafeMutablePointer<__CLPK_doublereal>
)
How do I use this to fill an array with single precision floating point numbers? This is how far I have gotten:
n = 10000
var data: [Float]
data.reserveCapacity(n)
data
dlarnv_(
3, // for normal distribution
seed, // not sure how to seed
n,
data, // not sure how to pass a pointer
)
CodePudding user response:
If you want a lot of values at one time, and you want them in a standard normal distribution (µ=0, σ=1), it's very hard to beat dlarnv_. But if you want a bit more flexibility, you should also consider GKLinearCongruentialRandomSource, and a little math. The following is about 20% slower than dlarnv_ for fetching 10M values all at once, but it's 5x faster if you want one value at a time (including adjusting the mean and stddev).
import GameplayKit
let random = GKLinearCongruentialRandomSource()
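func randomNormalValue(average: Double, standardDeviation: Double) -> Double {
    let x1 = Double(random.nextUniform())
    let x2 = Double(random.nextUniform())
    // Box-Muller transform: two uniform values give one standard normal value
    let z1 = sqrt(-2 * log(x1)) * cos(2 * .pi * x2)
    // Scale by the standard deviation, then shift by the mean
    return z1 * standardDeviation + average
}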
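Since the question asks for single precision, here is a rough sketch of the same Box-Muller idea that fills a [Float] using only the Swift standard library and Foundation instead of GameplayKit. The function name randomNormalFloats and its parameters are made up for illustration, and I have not benchmarked it against the numbers above. It draws the first uniform from (0, 1] so the logarithm stays finite, and it keeps both outputs of each transform instead of throwing the sine half away.

```swift
import Foundation

/// Hypothetical helper: returns `count` Float samples from a normal
/// distribution with the given mean and standard deviation, using the
/// Box-Muller transform. Each pair of uniforms yields two normal values.
func randomNormalFloats<G: RandomNumberGenerator>(
    count: Int,
    mean: Float = 0,
    standardDeviation: Float = 1,
    using generator: inout G
) -> [Float] {
    var result: [Float] = []
    result.reserveCapacity(count)
    while result.count < count {
        // u1 lies in (0, 1], so log(u1) is always finite.
        let u1 = 1 - Double.random(in: 0..<1, using: &generator)
        let u2 = Double.random(in: 0..<1, using: &generator)
        let radius = sqrt(-2 * log(u1))
        let angle = 2 * Double.pi * u2
        result.append(mean + standardDeviation * Float(radius * cos(angle)))
        // Use the second value of the pair unless the array is already full.
        if result.count < count {
            result.append(mean + standardDeviation * Float(radius * sin(angle)))
        }
    }
    return result
}

var generator = SystemRandomNumberGenerator()
let samples = randomNormalFloats(count: 10_000, mean: 5, standardDeviation: 2, using: &generator)
let sampleMean = samples.reduce(0, +) / Float(samples.count)
print(samples.count, sampleMean)
```

Taking the generator as an inout parameter means you can swap in a seeded generator of your own if you need reproducible output.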
I'm not sure why GKGaussianDistribution isn't implemented this way (or one of the other solutions that are possibly even faster, but I haven't bothered to implement and test). I agree that it's slow.
But it's not that slow. It's about 75% slower than dlarnv_ for 10M random values. Which is slow, but still in the same order-of-magnitude. The issue is the random source itself. Most folks write this:
let random = GKRandomSource()
And that's definitely the safest answer. But that's a cryptographically-secure source of entropy. If you're doing anything where the number needs to be really random, that's what you should be using (and dlarnv_ doesn't do this, so it's unsafe in some cases).
But if you just need "kinda random, but no one is going to try to exploit the fact that it's not really random," the source you want is GKLinearCongruentialRandomSource. And that gets GKGaussianDistribution into the same ballpark as Accelerate (factor of 2, not orders of magnitude).
CodePudding user response:
Using dlarnv_ is much faster than using GameplayKit:
import Accelerate

var n: Int32 = 500 // Array size
var d: Int32 = 3 // 3 for Normal(0, 1)
var seed: [Int32] = [1, 1, 1, 1] // Ideally pick a random seed
var x: [Double] = Array<Double>(unsafeUninitializedCapacity: Int(n)) { buffer, count in
    // dlarnv_ writes n values directly into the array's uninitialized storage
    dlarnv_(&d, &seed, &n, buffer.baseAddress)
    count = Int(n)
}
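On the "pick a random seed" part: the LAPACK documentation for the xLARNV routines requires the seed to be four integers between 0 and 4095, with the last one odd. Below is a minimal sketch of one way to build such a seed, plus the single-precision sibling slarnv_ to get a [Float] as the question asked. The helper names makeLarnvSeed and standardNormalFloats are invented for this example, and the Accelerate part is wrapped in a canImport check because it only builds on Apple platforms.

```swift
import Foundation
#if canImport(Accelerate)
import Accelerate
#endif

/// Hypothetical helper: builds a seed that meets the LAPACK xLARNV rule of
/// four integers in 0...4095 with an odd last element.
func makeLarnvSeed() -> [Int32] {
    var seed = (0..<4).map { _ in Int32.random(in: 0...4095) }
    seed[3] |= 1  // force the last element to be odd; it stays within 0...4095
    return seed
}

let randomSeed = makeLarnvSeed()
print(randomSeed)

#if canImport(Accelerate)
/// Hypothetical helper: draws `count` single-precision N(0, 1) samples
/// with slarnv_, the Float counterpart of dlarnv_.
func standardNormalFloats(count: Int) -> [Float] {
    var n = Int32(count)
    var idist: Int32 = 3  // 3 selects the normal (0, 1) distribution
    var iseed = makeLarnvSeed()
    return [Float](unsafeUninitializedCapacity: count) { buffer, initializedCount in
        slarnv_(&idist, &iseed, &n, buffer.baseAddress)
        initializedCount = count
    }
}
print(standardNormalFloats(count: 5))
#endif
```

If you need a different mean or standard deviation, multiply each value by the standard deviation and add the mean afterwards.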