I have a df with 15 columns: df.columns:
0 class
1 name
2 location
3 income
4 edu_level
--
14 marital_status
After some transformations I got a numpy.ndarray named loads with shape (15, 3):
0.52 0.33 0.09
0.20 0.53 0.23
0.60 0.28 0.23
0.13 0.45 0.41
0.49 0.9
so on so on so on
So, 3 columns with 15 values each.
What I need to do:
I want to get the df column names for the values in the first column of loads that are greater than 0.50.
For this example, the df columns that correspond to values higher than 0.5 in the first column of loads should be:
0 class
2 location
The same for the second column of loads, which should return:
1 name
3 income
4 edu_level
and the same logic for the 3rd column of loads.
I managed to get the NumPy array loads the way I need it, but I am having a hard time with this last part. I know I can simply pick the columns manually, but that will be a hard task when df has more than 15 features.
Can anyone help me, please?
CodePudding user response:
Given your threshold, you can create a boolean array and use it to filter df.columns:
threshold = .5
for j in range(loads.shape[1]):
    # loads[:, j] > threshold is a boolean mask with one entry per df column
    print(df.columns[loads[:, j] > threshold])
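For anyone who wants to try the idea end to end, here is a minimal self-contained sketch. It assumes a cut-down df with only the first four columns from the question and the four complete rows of loads shown there. The helper name columns_above is hypothetical and not part of pandas or NumPy. It collects the names into lists instead of printing them, which may be handier when df has many features.

```python
import numpy as np
import pandas as pd

# Cut-down stand-in for the question's data: only the first four
# columns of df and the four complete rows of loads shown above.
df = pd.DataFrame(columns=["class", "name", "location", "income"])
loads = np.array([
    [0.52, 0.33, 0.09],
    [0.20, 0.53, 0.23],
    [0.60, 0.28, 0.23],
    [0.13, 0.45, 0.41],
])

def columns_above(columns, loads, threshold=0.5):
    """Hypothetical helper: one list of column names per column of loads."""
    mask = loads > threshold  # boolean array with the same shape as loads
    # Row i of loads corresponds to columns[i], so each mask column
    # can index the pandas Index directly.
    return [list(columns[mask[:, j]]) for j in range(loads.shape[1])]

selected = columns_above(df.columns, loads)
for j, names in enumerate(selected):
    print(j, names)
```

This relies on loads having exactly one row per column of df, in the same order; if the transformation reorders or drops features, the mask would no longer line up with df.columns.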