Impute NaN for categorical data depending on its "Type" column

Time:11-26

I have a DataFrame with the columns Name and Signal. I want to fill the NaN values in the Signal column, but the fill value should depend on the row's Name: each NaN should be imputed with the most frequent Signal value for that Name. For example:

Timestamp   Name  Signal
2021-01-01  A     On
2021-01-02  A     NaN
2021-01-03  A     On
2021-01-01  B     Off
2021-01-02  B     Off
2021-01-03  B     NaN

For Name A, the NaN in the Signal column should be imputed with "On", since that is the most frequent value for A. For Name B, it should be filled with "Off", because that is the most frequent value for B.

How can I achieve it?

CodePudding user response:

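Group by Name and fill each group's missing Signal values with that group's most frequent value:

df['Signal'] = df.groupby('Name')['Signal'].transform(lambda s: s.fillna(s.value_counts().index[0]))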

Output:

>>> df
    Timestamp Name Signal
0  2021-01-01    A     On
1  2021-01-02    A     On
2  2021-01-03    A     On
3  2021-01-01    B    Off
4  2021-01-02    B    Off
5  2021-01-03    B    Off
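
For reference, here is a self-contained sketch of the same idea on the example data from the question. It assumes the missing entries are real NaN values rather than the string "nan". The helper name fill_with_group_mode is hypothetical, not part of pandas. It leaves a group unchanged when every Signal value in it is missing, a case where taking the first entry of an empty value_counts() would raise an IndexError.

```python
import numpy as np
import pandas as pd

# Example data from the question (missing values assumed to be real NaN).
df = pd.DataFrame({
    "Timestamp": ["2021-01-01", "2021-01-02", "2021-01-03"] * 2,
    "Name": ["A", "A", "A", "B", "B", "B"],
    "Signal": ["On", np.nan, "On", "Off", "Off", np.nan],
})


def fill_with_group_mode(s: pd.Series) -> pd.Series:
    """Fill NaN in one group's values with that group's most frequent value."""
    modes = s.mode()  # NaN is ignored by default; ties come back sorted
    if modes.empty:   # every value in the group is missing: nothing to fill with
        return s
    return s.fillna(modes.iloc[0])


# transform keeps the original row order and index, so the result
# can be assigned straight back to the column.
df["Signal"] = df.groupby("Name")["Signal"].transform(fill_with_group_mode)
print(df)
```

When two values are tied for most frequent within a group, this sketch picks the one that sorts first, which may or may not be the choice you want.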