I've just started to learn about Python libraries and today I got stuck at a numpy exercise.
Let's say I want to generate a 5x5 random array whose values are all different from each other. Further, its values have to range from 0 to 100.
I have already looked this up here but found no suitable solution to my problem. Please see the posts I consulted before turning to you:
Numpy: Get random set of rows from 2D array
Numpy Random 2D Array
Here is my attempt at this:
import numpy as np
M = np.random.randint(1, 101, size=25)
for x in M:
    for y in M:
        if x in M == y in M:
            M = np.random.randint(1, 101, size=25)
print(M)
By doing this, all I get is a value error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Thus, my second attempt has been the following:
M = np.random.randint(1, 101, size=25)
a = M[x]
b = M[y]
for a.any in M:
    for b.any in M:
        if a.any in M == b.any in M:
            M = np.random.randint(1, 101, size=25)
print(M)
Once again, I got an error: AttributeError: 'numpy.int64' object attribute 'any' is read-only.
Could you please let me know what I am doing wrong? I've spent days going over this but nothing else comes to mind :(
Thank you so much!
CodePudding user response:
Not sure if this will be ok for all your needs, but it will work for your example:
np.random.choice(np.arange(100, dtype=np.int32), size=(5, 5), replace=False)
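A variant of the same idea using NumPy's newer Generator interface, as a minimal sketch. It assumes that "0 to 100" means integers from 0 through 100 inclusive, hence np.arange(101); the seed is arbitrary and only there to make the run repeatable.

```python
import numpy as np

# Assumption: values are integers in 0..100 inclusive, so the pool has 101 candidates.
rng = np.random.default_rng(0)  # arbitrary seed for a repeatable run
M = rng.choice(np.arange(101), size=(5, 5), replace=False)

print(M)
print(np.unique(M).size)  # 25, since sampling is without replacement
```

If the upper bound should be exclusive, np.arange(100) as in the line above gives 0 through 99 instead.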
CodePudding user response:
You can use
np.random.random((5,5))
to generate an array of random floats in the half-open interval [0, 1), with shape (5,5).
Then just multiply by 100 to get floats between 0 and 100:
100*np.random.random((5,5))
Your code gives an error because of this line:
if x in M == y in M:
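That syntax doesn't do what you intend. Python reads it as a chained comparison, x in M and M == y and y in M, and M == y is a comparison between an array and a number, which yields a boolean array. Using that array as a single condition is why you get ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().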
You already defined x and y as elements of M in the for loops, so just write
if x == y:
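The error can be reproduced in isolation. The sketch below uses a small hand-written array (the values are made up for illustration) to show that M == y yields a boolean array, and that asking Python for a single truth value of that array raises the ValueError.

```python
import numpy as np

M = np.array([3, 7, 7, 1])  # made-up values for illustration
y = 7

mask = M == y        # elementwise comparison: one boolean per element of M
print(mask)

try:
    bool(mask)       # what `if` and `and` do implicitly with their operands
except ValueError as err:
    message = str(err)
    print(message)

print(mask.any())    # True: at least one element equals y
```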
CodePudding user response:
While I think choice without replacement is the easy way to go, here's a way of implementing your approach. I use np.unique to test whether all the numbers are different.
In [158]: M = np.random.randint(1,101, size=25)
In [159]: x = np.unique(M)
In [160]: x.shape
Out[160]: (23,)
In [161]: cnt = 1
     ...: while x.shape[0]<25:
     ...:     M = np.random.randint(1,101,size=25)
     ...:     x = np.unique(M)
     ...:     cnt += 1
     ...:
In [162]: cnt
Out[162]: 34
In [163]: M
Out[163]:
array([ 70, 76, 27, 98, 81, 92, 97, 38, 49, 7, 2, 55, 85,
89, 32, 51, 20, 100, 9, 91, 53, 3, 11, 63, 21])
Note that I had to generate 34 random sets before I got a unique one. That is a lot of work!
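The same retry loop as a standalone script, as a sketch. The function name draw_until_unique and its parameters are hypothetical, introduced only for this example; it also returns the number of draws so the cost of retrying is visible.

```python
import numpy as np

def draw_until_unique(rng, n=25, low=1, high=101):
    """Redraw n integers from [low, high) until all are distinct.

    Hypothetical helper; returns the array and the number of draws made.
    """
    cnt = 1
    M = rng.integers(low, high, size=n)
    while np.unique(M).shape[0] < n:
        M = rng.integers(low, high, size=n)
        cnt += 1
    return M, cnt

rng = np.random.default_rng()
M, cnt = draw_until_unique(rng)
print(cnt)               # varies from run to run
print(M.reshape(5, 5))   # 5x5 array of distinct values
```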
unique works by sorting the array, and then looking for adjacent duplicates.
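That sort-then-compare idea can be sketched directly. This illustrates the principle only, not NumPy's actual implementation; the function name has_duplicates and the arrays are made up for the example.

```python
import numpy as np

def has_duplicates(a):
    """Sort a 1-D array, then check whether any neighbours are equal."""
    s = np.sort(a)
    return bool(np.any(s[1:] == s[:-1]))

print(has_duplicates(np.array([5, 2, 9, 2])))  # True: 2 appears twice
print(has_duplicates(np.array([5, 2, 9, 1])))  # False: all values differ
```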
A numpy whole-array approach to testing whether there are any duplicates is:
For this unique set:
In [165]: (M[:,None]==M).sum(axis=0)
Out[165]:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
but for another randint array:
In [166]: M1 = np.random.randint(1,101, size=25)
In [168]: (M1[:,None]==M1).sum(axis=0)
Out[168]:
array([1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1])
This M[:,None]==M is roughly the equivalent of
for x in M:
    for y in M:
        test x==y
I'm using sum to count how many True values there are in each column. any just identifies if there is a True.
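Putting the broadcasted comparison together, as a sketch with small made-up arrays (the function name match_counts is hypothetical). Each element always matches itself, so a column count above 1 is what signals a duplicate.

```python
import numpy as np

def match_counts(M):
    """For each element of 1-D M, count how many elements of M equal it."""
    # M[:, None] has shape (n, 1); comparing with M (shape (n,)) broadcasts to (n, n).
    return (M[:, None] == M).sum(axis=0)

unique_vals = np.array([4, 8, 15, 16])
repeated_vals = np.array([4, 8, 4, 16])

print(match_counts(unique_vals))    # [1 1 1 1]
print(match_counts(repeated_vals))  # [2 1 2 1]
print((match_counts(repeated_vals) > 1).any())  # True: a duplicate exists
```

The comparison builds an n-by-n boolean array, so it suits small arrays like the 25 values here better than very large ones.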