I want to get the column names where a row's value is under 10. I can access the data with loc or iloc, but I couldn't find a function that returns the column names.
For example, if MANGO = 7 on date 220609 and MANGO = 2 on date 220610, then I need something like this:
if df.iloc[0,1:]<10==True:
and from that I want to get 220609 and 220610.
I don't know how to write code that returns the matching column names for each item.
What should I do? Thank you!
CodePudding user response:
This is the general idea. Despite the usual cautions against iterating rows, in this case you MUST do that, since your results will have different lengths for the different rows.
import pandas as pd

data = [
    ['APPLE', 10, 10, 8, 5],
    ['BANANA', 3, 10, 2, 0],
    ['KIWI', 10, 4, 10, 2],
    ['MELON', 10, 10, 3, 10],
    ['MANGO', 7, 2, 10, 10],
]
df = pd.DataFrame(data, columns=['FRUIT', 220609, 220610, 220611, 220612])
df.set_index('FRUIT', inplace=True)
print(df)

# For each fruit, keep only the column labels whose value is below 10
for fruit, row in df.iterrows():
    print(fruit, df.columns[row < 10].tolist())
Output:
        220609  220610  220611  220612
FRUIT
APPLE       10      10       8       5
BANANA       3      10       2       0
KIWI        10       4      10       2
MELON       10      10       3      10
MANGO        7       2      10      10
APPLE [220611, 220612]
BANANA [220609, 220611, 220612]
KIWI [220610, 220612]
MELON [220611]
MANGO [220609, 220610]
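If you need the results for later use rather than just printed, the same loop can collect them into a dictionary. This is only a sketch built on the example data above; the helper name columns_below and the variable low_dates are made up for illustration.

```python
import pandas as pd

data = [
    ["APPLE", 10, 10, 8, 5],
    ["BANANA", 3, 10, 2, 0],
    ["KIWI", 10, 4, 10, 2],
    ["MELON", 10, 10, 3, 10],
    ["MANGO", 7, 2, 10, 10],
]
df = pd.DataFrame(
    data, columns=["FRUIT", 220609, 220610, 220611, 220612]
).set_index("FRUIT")


def columns_below(frame, threshold):
    """Hypothetical helper: map each row label to the column labels
    whose value in that row is strictly below `threshold`."""
    return {
        label: frame.columns[(row < threshold).to_numpy()].tolist()
        for label, row in frame.iterrows()
    }


low_dates = columns_below(df, 10)
print(low_dates["MANGO"])  # [220609, 220610]
```

A dictionary fits here because each fruit can have a different number of matching dates.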
CodePudding user response:
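This approach uses a boolean mask to index into the columns index.
Using your example (here dfx holds the data with FRUIT as the first column and the dates as string column labels; row 0 is APPLE)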
dfx.columns[1:][dfx.iloc[0,1:].lt(10)].to_list()
Result
['220611', '220612']
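For a version you can run as-is, here is a rough sketch that builds a small dfx first. It assumes FRUIT is a regular first column and the date labels are strings, which is what the result above suggests; the function name low_columns_for_row is hypothetical.

```python
import pandas as pd

# Assumed layout: FRUIT is an ordinary column, dates are string labels
dfx = pd.DataFrame(
    [["APPLE", 10, 10, 8, 5], ["MANGO", 7, 2, 10, 10]],
    columns=["FRUIT", "220609", "220610", "220611", "220612"],
)


def low_columns_for_row(frame, position, threshold=10):
    """Hypothetical helper: column labels (skipping the first column)
    whose value in the row at `position` is below `threshold`."""
    values = frame.iloc[position, 1:]                     # skip the FRUIT column
    mask = values.lt(threshold).to_numpy(dtype=bool)      # True where value < threshold
    return frame.columns[1:][mask].to_list()


print(low_columns_for_row(dfx, 0))  # ['220611', '220612']
print(low_columns_for_row(dfx, 1))  # ['220609', '220610']
```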
CodePudding user response:
You can use df.melt with a condition (df is the frame indexed by FRUIT):
ddf = df[df<10].melt(ignore_index=False).dropna().sort_values(by='FRUIT')
print(ddf)
        variable  value
FRUIT
APPLE     220611    8.0
APPLE     220612    5.0
BANANA    220609    3.0
BANANA    220611    2.0
BANANA    220612    0.0
KIWI      220610    4.0
KIWI      220612    2.0
MANGO     220609    7.0
MANGO     220610    2.0
MELON     220611    3.0
After that, if you want the dates for a particular fruit, you can use .loc on the melted result (stored in ddf here):
ddf.loc[["APPLE"]]["variable"].to_list()
[220611, 220612]
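If you want the dates for every fruit at once instead of looking them up one by one, grouping the melted frame by its index is one possible way. This is a sketch on the same example data, assuming a pandas version where melt accepts ignore_index; the names melted and dates_by_fruit are just for illustration.

```python
import pandas as pd

data = [
    ["APPLE", 10, 10, 8, 5],
    ["BANANA", 3, 10, 2, 0],
    ["KIWI", 10, 4, 10, 2],
    ["MELON", 10, 10, 3, 10],
    ["MANGO", 7, 2, 10, 10],
]
df = pd.DataFrame(
    data, columns=["FRUIT", 220609, 220610, 220611, 220612]
).set_index("FRUIT")

# Mask values >= 10 as NaN, reshape to long form, and drop the masked rows
melted = df[df < 10].melt(ignore_index=False).dropna()

# Collect the remaining column labels into one list per fruit
dates_by_fruit = melted.groupby(level=0)["variable"].agg(list)

print(dates_by_fruit)
```

The result is a Series indexed by fruit, so dates_by_fruit["MANGO"] gives the list of dates for MANGO.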