python - How do I get a header (or column name) from data?

I want to get the column names where the value is under 10. I can access the data using loc or iloc, but I couldn't find a function that returns the column name.

Example: for MANGO, the value on 220609 is 7 and on 220610 is 2, so I tried something like this:

if df.iloc[0,1:]<10==True:

and from that I want to get 220609 and 220610.

I don't know how to write code that gets the matching column names for each item.

What should I do? Thank you!

CodePudding user response：

This is the general idea. Despite the usual cautions against iterating over rows, doing so is reasonable here, since the result has a different length for each row.

import pandas as pd

data = [
    ['APPLE', 10, 10, 8, 5],
    ['BANANA', 3, 10, 2, 0],
    ['KIWI', 10, 4, 10, 2],
    ['MELON', 10, 10, 3, 10],
    ['MANGO', 7, 2, 10, 10],
]

df = pd.DataFrame(data, columns=['FRUIT', 220609, 220610, 220611, 220612])
df.set_index('FRUIT', inplace=True)
print(df)

# For each fruit, select the column labels where the value is below 10.
# .tolist() turns the resulting Index into a plain list for printing.
for fruit, row in df.iterrows():
    print(fruit, df.columns[row < 10].tolist())

Output:

        220609  220610  220611  220612
FRUIT                                 
APPLE       10      10       8       5
BANANA       3      10       2       0
KIWI        10       4      10       2
MELON       10      10       3      10
MANGO        7       2      10      10
APPLE [220611, 220612]
BANANA [220609, 220611, 220612]
KIWI [220610, 220612]
MELON [220611]
MANGO [220609, 220610]
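
If the results are needed later rather than just printed, the same loop can collect them into a dictionary keyed by fruit. This is only a minimal sketch built on the data above; the name `low_dates` is an illustrative choice, not something from the question.

```python
import pandas as pd

data = [
    ['APPLE', 10, 10, 8, 5],
    ['BANANA', 3, 10, 2, 0],
    ['KIWI', 10, 4, 10, 2],
    ['MELON', 10, 10, 3, 10],
    ['MANGO', 7, 2, 10, 10],
]
df = pd.DataFrame(data, columns=['FRUIT', 220609, 220610, 220611, 220612])
df = df.set_index('FRUIT')

# Map each fruit to the list of column labels whose value is below 10.
# low_dates is a hypothetical name chosen for this sketch.
low_dates = {
    fruit: df.columns[row < 10].tolist()
    for fruit, row in df.iterrows()
}

print(low_dates['MANGO'])
```

A lookup such as `low_dates['MANGO']` then gives the dates for one fruit without scanning the frame again.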

CodePudding user response：

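This approach applies a boolean mask to the columns Index.

Using your example, and assuming dfx holds the data with FRUIT as an ordinary first column (not the index) and the dates as string column labels: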

dfx.columns[1:][dfx.iloc[0,1:].lt(10)].to_list()

Result (for the first row, APPLE)

['220611', '220612']
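
The one-liner above picks the row by position. To look a row up by fruit name instead, something along these lines might work. It is a sketch under the same assumed layout (FRUIT as a regular column, string date labels); the helper `dates_below` and its `limit` parameter are hypothetical names introduced here.

```python
import pandas as pd

# Assumed layout: FRUIT is an ordinary column, dates are string labels.
dfx = pd.DataFrame(
    [
        ['APPLE', 10, 10, 8, 5],
        ['MANGO', 7, 2, 10, 10],
    ],
    columns=['FRUIT', '220609', '220610', '220611', '220612'],
)


def dates_below(frame, fruit, limit=10):
    """Return the date columns where the given fruit's value is below limit."""
    # Select the first row matching the fruit, keeping only the date columns.
    values = frame.loc[frame['FRUIT'] == fruit, frame.columns[1:]].iloc[0]
    # Use the boolean mask to filter the row's own labels (the date columns).
    return values.index[(values < limit).to_numpy()].tolist()


print(dates_below(dfx, 'MANGO'))
```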

CodePudding user response：

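You can use df.melt together with a boolean condition. Values that fail the condition become NaN (which is why the remaining values are shown as floats) and are then dropped:

ddf = df[df<10].melt(ignore_index=False).dropna().sort_values(by='FRUIT')
print(ddf)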

       variable  value
FRUIT
APPLE    220611    8.0
APPLE    220612    5.0
BANANA   220609    3.0
BANANA   220611    2.0
BANANA   220612    0.0
KIWI     220610    4.0
KIWI     220612    2.0
MANGO    220609    7.0
MANGO    220610    2.0
MELON    220611    3.0

After that, to pick out the dates for a particular fruit, select it from the melted result (stored here as ddf) with .loc:

ddf.loc[["APPLE"]]["variable"].to_list()

[220611, 220612]
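
To get the lists for every fruit at once instead of one .loc call per fruit, the melted frame can be grouped on its index. This is a rough sketch that rebuilds the example data so it runs on its own; `dates_by_fruit` is an illustrative name, and `ignore_index=False` assumes a pandas version that supports that melt parameter.

```python
import pandas as pd

data = [
    ['APPLE', 10, 10, 8, 5],
    ['BANANA', 3, 10, 2, 0],
    ['KIWI', 10, 4, 10, 2],
    ['MELON', 10, 10, 3, 10],
    ['MANGO', 7, 2, 10, 10],
]
df = pd.DataFrame(data, columns=['FRUIT', 220609, 220610, 220611, 220612])
df = df.set_index('FRUIT')

# Mask values >= 10 to NaN, reshape to long form, and drop the masked rows.
ddf = df[df < 10].melt(ignore_index=False).dropna()

# Group on the FRUIT index and collect each fruit's column labels into a list.
dates_by_fruit = ddf.groupby(level='FRUIT')['variable'].agg(list)

print(dates_by_fruit['MANGO'])
```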