I hope this is the right forum for this kind of general question, please tell me if I should address this to another place.
I have a grid NxN on which I want to place 'randomically' M items, each of these items has a unique Id (integer). Anytime I do this placement with the same set of items (same Ids) I want the same distribution for each item.
Basically I need to build a placement function that takes as input the array of Ids and gives as output an array of coordinates but I have no clue how to start. Any suggestions?
CodePudding user response:
If you try to generate individual locations based on the individual ID values, you're likely to encounter collisions due to the birthday problem. You can deal with this using techniques developed for hash tables, but those get clunky and potentially expensive as M approaches N*N, so I've interpreted your question to mean that the same set of IDs will generate the same set of grid locations.
The solution presented below assumes that you have access to a PRNG which produces an identical stream of values when given the same seed value. The ID values (or their hashes if they're non-numeric) are pooled in a repeatable way to generate the seed. I went with XORing, which avoids possible overflow issues, but you can substitute your own favorite mapping from the ID set to a seed value. The PRNG is then used to shuffle a list of the integers from 0 to (N*N)-1, and the first M values are picked off. Those are mapped into 2-d coordinates using an old trick involving division to get the row index and modulus to get the column index. Since the M values are guaranteed to be unique, so are the corresponding grid indices. This assumes array indexing is zero-based, shift by adding 1 for languages which are one-based. It also assumes that you have access to a shuffle method. If not, it's quite easy to implement a Fisher-Yates shuffle.
Without further ado, here's a heavily annotated implementation in Ruby:
def grid_locs(ids, n)
fail "More IDs than grid locations!" if ids.length > n*n
# Generate a seed by XORing all of the ids with 0xFFFFFFFF
id_based_seed = ids.inject(0xFFFFFFFF) { |memo, x| memo ^ x }
# Use this to seed the built-in PRNG
srand(id_based_seed)
# Create an array containing the numbers 0 up to N*N - 1
unique_values = Array.new(n*n) { |i| i }
# Shuffle the array, pick off the first M values, convert them
# to 2-d indices using the ancient division & modulo trick, and
# return the results as a set of M unique grid locations
return unique_values.shuffle[0...ids.size].map do |index|
[index / n, index % n]
end
end
# Helper to print the 2-d grid
def print_2d(ary)
ary.each { |row| p row }
puts
end
# # I initially generated random IDs, but then stored them for reproducibility
# id = Array.new(5) { rand(1..0xFFFFFFFF) }
ids = [4146364446, 2782840243, 4137630311, 3993445117, 3133691672]
# Generate and print a grid of asterisks
two_d_table = Array.new(4) { |i| Array.new(4) { '*' } }
puts "Before:"
print_2d two_d_table
# Generate and print grid locations using the method above
locations = grid_locs(ids, two_d_table.size)
puts "Locations: #{locations.inspect}"
puts
# Place ID index at each of the generated locations and print the results
locations.each.with_index { |loc, i| two_d_table[loc[0]][loc[1]] = i }
puts "After:"
print_2d two_d_table
which produces:
Before:
["*", "*", "*", "*"]
["*", "*", "*", "*"]
["*", "*", "*", "*"]
["*", "*", "*", "*"]
Locations: [[0, 2], [3, 2], [3, 1], [1, 2], [2, 1]]
After:
["*", "*", 0, "*"]
["*", "*", 3, "*"]
["*", 4, "*", "*"]
["*", 2, 1, "*"]
The locations are identical given the same set of indices, but will change based on different numbers of indices or different index values.