How to generate a numpy array with random values that are all different from each other

Time:11-14

I've just started to learn about Python libraries and today I got stuck at a numpy exercise.

Let's say I want to generate a 5x5 random array whose values are all different from each other. Further, its values have to range from 0 to 100.

I have already looked this up here but found no suitable solution to my problem. These are the posts I consulted before turning to you:

Numpy: Get random set of rows from 2D array
Numpy Random 2D Array

Here is my attempt:

import numpy as np

M = np.random.randint(1, 101, size=25)

for x in M:
    for y in M:
        if x in M == y in M:
            M = np.random.randint(1, 101, size=25)
            print(M)

By doing this, all I get is a value error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Thus, my second attempt has been the following:

M = np.random.randint(1, 101, size=25)
a = M[x]
b = M[y]

for a.any in M:
    for b.any in M:
        if a.any in M == b.any in M:
            M = np.random.randint(1, 101, size=25)
            print(M)

Once again, I got an error: AttributeError: 'numpy.int64' object attribute 'any' is read-only.

Could you please let me know what I am doing wrong? I've spent days going over this but nothing else comes to mind :(

Thank you so much!

CodePudding user response:

Not sure if this will be ok for all your needs, but it will work for your example:

np.random.choice(np.arange(100, dtype=np.int32), size=(5, 5), replace=False)
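For reference, here is a minimal sketch of the same idea with NumPy's newer Generator interface (np.random.default_rng). It assumes 100 itself should be a possible value, so the pool is np.arange(101). The seed and the names rng and M are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the run repeatable

# Draw 25 distinct integers from 0..100 (inclusive) and shape them as 5x5.
M = rng.choice(np.arange(101), size=(5, 5), replace=False)

print(M)
print(np.unique(M).size == M.size)  # True: every value appears exactly once
```

Because the values are sampled without replacement, no retry loop is needed.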

CodePudding user response:

You can use

np.random.random((5,5))

to generate an array of random floats in the half-open interval [0, 1), with shape (5,5).

Then just multiply by 100 to get floats between 0 and 100 (100 itself excluded):

100*np.random.random((5,5))

Your code gives an error because of this line:

if x in M == y in M:

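That expression doesn't do what you intend. Python chains comparison operators, so it is evaluated as (x in M) and (M == y) and (y in M). M == y compares an array with a number and returns a boolean array, and using that array as a single truth value is why you get ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().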

You already defined x and y as elements of M in the for loops so just write

if x == y:
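One caveat with testing x == y inside those two loops: both loops walk over the same array, so every element is eventually compared with itself and the test is bound to be True at some point. A sketch of one way around this, using a hypothetical helper has_duplicates that compares positions rather than values:

```python
import numpy as np

def has_duplicates(values):
    """Return True if any two different positions hold the same value."""
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):  # j > i, so an element is never compared with itself
            if values[i] == values[j]:
                return True
    return False

M = np.random.randint(1, 101, size=25)
while has_duplicates(M):  # redraw until all 25 values differ
    M = np.random.randint(1, 101, size=25)

print(M.reshape(5, 5))
```

This keeps the redraw-until-unique approach from the question, so it may need many attempts before it succeeds.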

CodePudding user response:

While I think choice without replacement is the easy way to go, here's a way implementing your approach. I use np.unique to test whether all the numbers are different.

In [158]: M = np.random.randint(1,101, size=25)
In [159]: x = np.unique(M)
In [160]: x.shape
Out[160]: (23,)
In [161]: cnt = 1
     ...: while x.shape[0]<25:
     ...:     M = np.random.randint(1,101,size=25)
     ...:     x = np.unique(M)
     ...:     cnt += 1
     ...: 
In [162]: cnt
Out[162]: 34
In [163]: M
Out[163]: 
array([ 70,  76,  27,  98,  81,  92,  97,  38,  49,   7,   2,  55,  85,
        89,  32,  51,  20, 100,   9,  91,  53,   3,  11,  63,  21])

Note that I had to generate 34 random sets before I got a unique one. That is a lot of work!
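To get a feel for why so many attempts are needed, here is a rough calculation, assuming each of the 25 draws is independent and uniform over the 100 possible values. The names n_values, n_draws and p_unique are just for this sketch.

```python
import numpy as np

n_values = 100  # randint(1, 101) can return 100 different values
n_draws = 25

# Draw number k (starting at 0) must miss the k values already drawn.
k = np.arange(n_draws)
p_unique = np.prod((n_values - k) / n_values)

print(f"P(all {n_draws} distinct) = {p_unique:.3f}")
print(f"expected number of attempts = {1 / p_unique:.1f}")
```

The probability comes out at only a few percent, so needing a few dozen attempts is not unusual.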

unique works by sorting the array, and then looking for adjacent duplicates.

A numpy whole-array approach to testing whether there are any duplicates is to compare every element with every other element using broadcasting, then count the matches.

For this unique set:

In [165]: (M[:,None]==M).sum(axis=0)
Out[165]: 
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

but for another randint array:

In [166]: M1 = np.random.randint(1,101, size=25)
In [168]: (M1[:,None]==M1).sum(axis=0)
Out[168]: 
array([1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1,  2, 2, 1])

This M[:,None]==M is roughly the equivalent of

for x in M:
    for y in M:
        test x==y

I'm using sum to count how many True values there are in each column; a count above 1 means that value is repeated. any would just tell you whether there is a True at all.
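As a small illustration of turning those counts into a single yes/no answer, here is a sketch on a made-up four-element array (the names matches and counts are only for this example):

```python
import numpy as np

M = np.array([3, 7, 7, 1])      # small example with one repeated value

matches = M[:, None] == M       # (4, 4) table: matches[i, j] is M[i] == M[j]
counts = matches.sum(axis=0)    # how many times each element's value occurs
print(counts)                   # [1 2 2 1]
print((counts > 1).any())       # True: at least one value is repeated
```

Each element always matches itself, so a count of 1 means the value is unique and anything higher signals a duplicate.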
