I'm trying to find the most frequent value in each row of a DataFrame. I found a way to do that, but the result has two columns instead of one.
What do I want to do?
Let's say I have this DataFrame:
In [88]: df
Out[88]:
   a  b  c
0  2  3  3
1  1  1  2
2  7  7  8
and I want this
In [89]: df.mode(axis=1)
Out[89]:
   0
0  3
1  1
2  7
I'm trying to apply this to my own DataFrame, but it's not working properly.
My DataFrame looks like this:
In [45]: data.head()
   a  b  c  d  e  f
0  1  1  1  1  1  1
1  0  0  0  0  0  0
2  0  0  0  0  1  0
3  0  0  0  0  0  0
4  1  1  1  1  1  1
In [46]: data.shape
Out[46]: (5665, 6)
I'm getting this output:
In [47]: data.mode(axis=1)
Out[47]:
     0   1
0  1.0 NaN
1  0.0 NaN
2  0.0 NaN
3  0.0 NaN
4  1.0 NaN
Note: If I apply mode to just a few rows, data.head().mode(axis=1),
it works fine, but it doesn't work for the full DataFrame.
CodePudding user response:
Is this what you are trying to do?
df['Mode'] = df.mode(axis='columns', numeric_only=True)[0]  # [0] keeps only the first mode if a row has ties
df
CodePudding user response:
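A row can have more than one mode. For example, the values [0, 0, 0, 1, 1, 1] have two modes, 0 and 1, because both appear equally often. When at least one row has such a tie, df.mode(axis=1) adds a second column to hold the extra mode and fills it with NaN for every row that has only one mode. That is most likely why data.head() gives a single column while the full DataFrame gives two: none of the first five rows is tied, but some later row is. The NaN padding is also why the values show as 1.0 and 0.0 rather than as integers. If you only want one of the most common values for each row, you can simply keep the first column of the output and drop the rest.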
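Here is a minimal sketch of that approach. The sample data is made up for illustration (the last row is deliberately a tie), and the names modes, tied_rows and the "mode" column are just placeholders, not anything from the question.

```python
import pandas as pd

# Hypothetical sample data: the last row is a tie (three 0s and three 1s).
data = pd.DataFrame(
    [[1, 1, 1, 1, 1, 1],
     [0, 0, 0, 0, 1, 0],
     [0, 0, 0, 1, 1, 1]],
    columns=list("abcdef"),
)

# Row-wise modes. The tie in the last row forces a second column,
# which is NaN for every row that has a single mode.
modes = data.mode(axis=1)

# Index labels of the rows that have more than one mode.
tied_rows = modes.index[modes.notna().sum(axis=1) > 1]

# Keep only the first mode per row (the smallest one, since mode() sorts
# its results) and convert back from float to int.
data["mode"] = modes[0].astype(int)

print(modes)
print(list(tied_rows))
print(data)
```

Checking tied_rows on the real DataFrame should show which rows cause the extra column. Taking the first column always picks the smaller value on a tie, so if ties need different handling (for example, preferring 1 over 0), that choice has to be made explicitly.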