Inputs:
- A pandas DataFrame with 10 rows and 500 columns, filled with random integers between 0 and 10000 (inclusive)
- A list of 10 random integers between 0 and 10000
The output I am looking for is:
- A pandas DataFrame with 10 rows and 500 columns of Booleans, where each value is True if the element in row x is above the x-th element of the list, and False otherwise
I was able to solve this in Excel using the following functions:
- =RANDARRAY(10,1,0,10000,TRUE)
- =IF(RANDARRAY(10,500,0,10000,TRUE)>A1,TRUE,FALSE)
Is there an elegant way to produce this result in Python? I am still a beginner learning Python.
Thank you for the help
Update: Using MSS's solution, this is my final code. Could you please tell me if there are any mistakes in my code?
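import random
import numpy as np
import pandas as pd
# np.random.randint excludes the upper bound, so 10001 is needed to include 10000
df = pd.DataFrame(np.random.randint(0, 10001, size=(10, 500)))
print(df.head())
# random.randint includes both ends; the name `list` would shadow the built-in type
thresholds = [random.randint(0, 10000) for _ in range(10)]
print(thresholds)
a = df.to_numpy()
b = np.array(thresholds)
# b[:, None] has shape (10, 1), so row x is compared with thresholds[x]
res = pd.DataFrame(a > b[:, None], index=df.index, columns=df.columns)
print(res)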
Thank you for the help
CodePudding user response:
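You can do this with NumPy broadcasting: reshape the list into a column so that each row of the DataFrame is compared against its own number.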
a = df.to_numpy() # array of shape (10, 500)
b = np.array(your_list) # your_list contains 10 random numbers >= 0 and <= 10000
res = pd.DataFrame(a > b[:, None], index=df.index, columns=df.columns)
Let's illustrate this with a smaller DataFrame of 3 rows and 4 columns and a list of 3 numbers. All numbers are between 1 and 9.
inter = np.array([[1,2,3,5],[4,5,6,1],[7,8,9,5]])
df = pd.DataFrame(inter)
your_list = [3,6,7]
The output obtained after applying above code is:
       0      1      2      3
0  False  False  False   True
1  False  False  False  False
2  False   True   True  False
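Each row is compared against its own number from the list (row 0 against 3, row 1 against 6, row 2 against 7), so the solution is correct.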
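As a possible alternative, the same comparison can probably be written without leaving pandas by using DataFrame.gt with axis=0. The sketch below reuses the small 3x4 example and checks that both routes agree. The name thresholds is only an illustrative stand-in for your_list.

```python
import numpy as np
import pandas as pd

# Same small example as above: 3 rows, 4 columns, one number per row.
df = pd.DataFrame(np.array([[1, 2, 3, 5], [4, 5, 6, 1], [7, 8, 9, 5]]))
thresholds = [3, 6, 7]  # illustrative name, plays the role of your_list

# NumPy route: b[:, None] turns shape (3,) into (3, 1), so broadcasting
# compares every element of row x with thresholds[x].
a = df.to_numpy()
b = np.array(thresholds)
res_numpy = pd.DataFrame(a > b[:, None], index=df.index, columns=df.columns)

# pandas route: axis=0 aligns the list with the rows instead of the columns.
res_pandas = df.gt(thresholds, axis=0)

print(b[:, None].shape)               # shape of the reshaped thresholds
print(res_pandas.equals(res_numpy))   # whether both routes give the same frame
```

Without axis=0, pandas would try to line the list up with the columns, which is why the axis argument matters here. Note that the comparison is strict, so an element equal to its threshold gives False, as in the Excel formula.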