I have similar df like below,
df = pd.DataFrame({'DRINKS':['WHISKEY','VODKA','WATER'],
'STRONG':[5,5,0],
'SOUR':[5,4,0]})
And I want to transform it to this one (Read 5s from the dataframe and when it matches, create a column with whatever name(I named it Cat1) and get the column name(STRONG) where the value was 5, then move on tho the next column and do the same operation until there no columns with rows with a value 5. The final outcome should be like below:
df = pd.DataFrame({'DRINKS':['WHISKEY','VODKA','WATER'],
'Cat1':["STRONG","STRONG",np.nan],
'Cat2':["SOUR",np.nan,np.nan]})
I tried to do it with
df['Cat1']=(df == 5).idxmax(axis=1)
but it gives me only 1 column name for Whiskey.
Any help will be appreciated
CodePudding user response:
try:
df['Cat1'] = np.where(df[df.columns[1]].eq(5), df.columns[1], np.nan) #or df['Cat1'] = np.where(df["STRONG"].eq(5), "STRONG", np.nan)
df['Cat2'] = np.where(df[df.columns[2]].eq(5), df.columns[2], np.nan) #or df['Cat2'] = np.where(df["SOUR"].eq(5), "SOUR", np.nan)
DRINKS STRONG SOUR Cat1 Cat2
0 WHISKEY 5 5 STRONG SOUR
1 VODKA 5 4 STRONG NaN
2 WATER 0 0 NaN NaN
df = df.drop(columns=['STRONG', 'SOUR'])
DRINKS Cat1 Cat2
0 WHISKEY STRONG SOUR
1 VODKA STRONG NaN
2 WATER NaN NaN
CodePudding user response:
You could map each column value of 5 to the column header. The core part of that would be:
df.iloc[:,1:].apply(lambda x: x.map({5:x.name}))
Which delivers:
STRONG SOUR
0 STRONG SOUR
1 STRONG NaN
2 NaN NaN
Then you could put it all together with a column rename:
dfo = (
pd.concat([df['DRINKS'],df.iloc[:,1:].apply(lambda x: x.map({5:x.name}))
.rename(columns=lambda x: 'Cat_' x)], axis=1)
)
print(dfo)
Result
DRINKS Cat_STRONG Cat_SOUR
0 WHISKEY STRONG SOUR
1 VODKA STRONG NaN
2 WATER NaN NaN