I have a dataframe df created as,
df = pd.DataFrame([
[0, 1, 0],
[1, 0, 0],
[0, 1, 0]],
columns=['Mon','Tue','Wed'])
df:
Mon Tue Wed
0 0 1 0
1 1 0 0
2 0 1 0
For this dataframe, I want to replace the element '1'- to the respective column name and '0' to be replaced with- "", and finally combine all of them.
1 and 0 can also be characters- 'X' and '-'.
Desired Output:
Day
0 Tue
1 Mon
2 Tue
CodePudding user response:
without duplicate values (i.e. only one 1
per row)
Use idxmax
on axis=1
:
df.idxmax(axis=1)
NB. if you have '-'/'X' as values, use df.eq('X').idxmax(axis=1)
output:
0 Tue
1 Mon
2 Tue
As dataframe:
df.idxmax(axis=1).rename('Day').to_frame()
output:
Day
0 Tue
1 Mon
2 Tue
potential duplicates
You can use melt
:
df = pd.DataFrame([
[0, 1, 1],
[1, 0, 0],
[0, 1, 0]],
columns=['Mon','Tue','Wed'])
# below use 'value == "X"' if '-'/'X' values
df.melt(var_name='Day', ignore_index=False).query('value == 1')[['Day']]
output:
Day
1 Mon
0 Tue
2 Tue
0 Wed
CodePudding user response:
Use DataFrame.dot
with columns names separated by comma if possible duplicated 1
per rows:
df1 = df.dot(df.columns ',').str[:-1].to_frame('Day')
print (df1)
Day
0 Tue
1 Mon
2 Tue
If multiple values:
df = pd.DataFrame([
[0, 1, 1],
[1, 0, 0],
[0, 1, 0]],
columns=['Mon','Tue','Wed'])
print (df)
Mon Tue Wed
0 0 1 1
1 1 0 0
2 0 1 0
df1 = df.dot(df.columns ',').str[:-1].to_frame('Day')
print (df1)
Day
0 Tue,Wed
1 Mon
2 Tue