Finding Unique Arrays In a List-CodePudding

I'm building a game where users select from a list of symbols and later identify what those selections were.

Example:

Symbols: [A, B, C, D, E, F, G]

User selects [B, F, G] and this representative array [0, 1, 0, 0, 0, 1, 1] gets saved in the database

When a user correctly replicates a selection, I'd like to be able to award bonus points if the selection is unique.

By unique I mean if the selection is sufficiently different from all previous selections. What I've considered so far is to compare a user's selection array to all previous ones and check if the "variance" is at least some X value

Example previous selections:

[1, 1, 0, 0, 0, 0, 0]
[0, 1, 0, 0, 0, 0, 0]

User's selection: [0, 1, 0, 0, 0, 1, 1]

compare user's selection array to each previous and find how many indices differ

1)
       [0, 1, 0, 0, 0, 1, 1]
       [1, 1, 0, 0, 0, 0, 0] ---- variance = 3

2)
       [0, 1, 0, 0, 0, 1, 1]
       [0, 1, 0, 0, 0, 0, 0] ---- variance = 2

For a min variance (X) of 3, this user doesn't get any bonus points But for 2, he does

Is there a better way to think around and implement this?

EDITS

To clarify. I'm going to set the Minimum variance to 3. So after computing the variance between a user's rule and all other existing rules, if the minimum variance found is greater than 3, the user gets awarded the bonus points, otherwise not.

CodePudding user response：

You have a set of size 7 --> [A, B, C, D, E, F, G] and you can only have 2^7 = 128 possible previous values for representative arrays.

As soon as you have an efficient query to select distinct representative arrays from the database your algorithm should be running very fast, even it is not the best time complexity algorithm. It might not worth making your algorithm much more complex, but here is a good way to find the variance after you do new selection & previous selection : Count number of 1's in binary representation

In my view you should focus more on retrieving the previous distinct array representations from the database. You can do this by creating a uniq hash of your representation and store it with previous representation. For this problem, you have a great function to do it, just compute decimal value of your representation as below :

0 = [0, 0, 0, 0, 0, 0, 0] = 0 * 2^0   0 * 2^1   0 * 2^2   0 * 2^3   0 * 2^4   0 * 2^5   0 * 2^6   
1 = [1, 0, 0, 0, 0, 0, 0] = 1 * 2^0   0 * 2^1   0 * 2^2   0 * 2^3   0 * 2^4   0 * 2^5   0 * 2^6 
    .
    .
    .
    
127 = [1, 1, 1, 1, 1, 1, 1] = 1 * 2^0   1 * 2^1   1 * 2^2   1 * 2^3   1 * 2^4   1 * 2^5   1 * 2^6

When you select distinct values, it will guarantee you will do no more than 128 comparisons

CodePudding user response：

I think that you are looking for XOR ^ operator and then you can count number of "ones" to find the variance. You don't need to store array in the database, it can be just a simple INT type column.

In your example:

1)
0100011 -> 35   
1100000 -> 96

Then in javascript you could do:

a = parseInt("0100011", 2);   // it will give 35 (store only 35 as INT)
b = parseInt("1100000", 2);   // it will give 96 (store only 96 as INT)

// later when you want to calculate variance
varDec = a ^ b;                       // varDec = 67  
varBin = varDec.toString(2)           // varBin = "1000011"

Then you should count number of "ones" in your varBin variable. There are few answers on SO regarding this (i.e. How to find the number of 1's in a binary representation of a number? )

variance = varBin.split('1').length-1;    // variance = 3

CodePudding user response：

I may be missing something here, but this sounds like a fairly simple problem. We need a function to calculate the number of indices at which the elements in two arrays don't match, call it variance, and we need a function that runs that for a current selection against a list of previous ones and takes the minimum value, call it minVariance. This snippet has fairly simple versions of both:

const variance = (xs) => (ys) => 
  xs .reduce ((c, x, i) => x == ys [i] ? c : c   1, 0)

const minVariance = (curr) => (prevs) => 
  Math .min (... prevs .map (variance (curr)))

const prevs = [
  [1, 1, 0, 0, 0, 0, 0],
  [0, 1, 0, 0, 0, 0, 0],
]
const curr = [0, 1, 0, 0, 0, 1, 1]

console .log (minVariance (curr) (prevs))

As others have pointed out, there are techniques that might help you save further on storage cost, but in the end you will need to compare bitmaps or integers, or convert to the arrays you were already planning on. Our minVariance can be stay the same for those, changing only the simple variance.