I have a DataFrame with two columns of interest, Name and Signal. I want to fill the NaN values in the Signal column, but the fill value should depend on the Name: each missing value should be imputed with the most frequent Signal value for its Name. For example:
Timestamp   Name  Signal
2021-01-01  A     On
2021-01-02  A     nan
2021-01-03  A     On
2021-01-01  B     Off
2021-01-02  B     Off
2021-01-03  B     nan
For Name A, the NaN in the Signal column should be imputed with "On", since that is the most frequent value for A. For Name B it should be filled with "Off", because that is the most frequent value for B.
How can I achieve it?
CodePudding user response:
# Fill NaN in Signal with the most frequent Signal value within each Name group
df['Signal'] = df.groupby('Name')['Signal'].transform(lambda s: s.fillna(s.value_counts().index[0]))
Output:
>>> df
Timestamp Name Signal
0 2021-01-01 A On
1 2021-01-02 A On
2 2021-01-03 A On
3 2021-01-01 B Off
4 2021-01-02 B Off
5 2021-01-03 B Off
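The one-liner above raises an IndexError if some Name has no non-missing Signal at all, because value_counts() is then empty. A minimal sketch that guards against this is shown below. It assumes the missing entries are real NaN values (not the string "nan"), rebuilds the sample data from the question, and uses a helper name, fill_with_group_mode, that is made up for illustration. It also swaps value_counts() for Series.mode(), which may pick a different value than value_counts() when two values are tied.

```python
import numpy as np
import pandas as pd

# Sample data from the question; missing signals are assumed to be real NaN
df = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"] * 2),
    "Name": ["A", "A", "A", "B", "B", "B"],
    "Signal": ["On", np.nan, "On", "Off", "Off", np.nan],
})


def fill_with_group_mode(s):
    """Hypothetical helper: fill NaN in one group's Series with its mode."""
    modes = s.mode()  # ignores NaN by default; several values are returned on ties
    if modes.empty:
        # The whole group is NaN, so there is nothing to impute from
        return s
    # On ties, mode() returns the values sorted, so the first one is used
    return s.fillna(modes.iloc[0])


# transform keeps the original index, so the result can be assigned back directly
df["Signal"] = df.groupby("Name")["Signal"].transform(fill_with_group_mode)
print(df)
```

Groups that contain only NaN are left unchanged here. Whether that is the right fallback (as opposed to, say, a global most frequent value) depends on the data.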