Pandas: Fillna in the column with the value of the same group

Time:06-24

I need to fill null values in the column with not null value of the same group.

Example

Desired Outcome

I tried using transform with mode, but it didn't do the job.

test['col2'] = test['col2'].transform(lambda x: x.fillna(x.mode()))
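A likely reason this attempt has no visible effect, sketched on made-up data (the Series `s` below is an assumption, not the data from the question): there is no groupby, so the mode is taken over the whole column, and `Series.fillna` given another Series aligns on index labels. `mode()` returns a Series indexed from 0, so missing values at other labels find nothing to fill from.

```python
import numpy as np
import pandas as pd

# Hypothetical column with missing values at labels 1 and 3.
s = pd.Series(["a", np.nan, "a", np.nan])

modes = s.mode()            # Series holding the mode(s), indexed from 0
attempt = s.fillna(modes)   # aligns on index; labels 1 and 3 have no match

# Filling with a scalar instead replaces every missing value.
fixed = s.fillna(modes.iloc[0])
```

Here `attempt` still contains both missing values, while `fixed` does not.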

CodePudding user response:

Use GroupBy.transform with mode, take the first value if one exists (otherwise None), then pass the result to Series.fillna:

s = df.groupby('col1')['col2'].transform(lambda x: next(iter(x.mode()), None))
df['col2'] = df['col2'].fillna(s)
print(df)
  col1   col2
0  gr1  test1
1  gr2  test2
2  gr1  test1
3  gr1  test1
4  gr2  test2
5  gr3  test3
6  gr2  test2
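For anyone who wants to run this end to end, here is a minimal self-contained sketch. The input frame is an assumption, reconstructed to be consistent with the printed result above, since the original example data is not shown. The helper name `first_mode` is also just illustrative.

```python
import numpy as np
import pandas as pd

# Assumed input: col2 has gaps in groups gr1 and gr2.
df = pd.DataFrame({
    "col1": ["gr1", "gr2", "gr1", "gr1", "gr2", "gr3", "gr2"],
    "col2": ["test1", "test2", "test1", np.nan, np.nan, "test3", np.nan],
})

def first_mode(values):
    """Return the first mode of a group, or None if the group is all-missing."""
    modes = values.mode()
    return modes.iloc[0] if not modes.empty else None

# One fill value per row, taken from that row's col1 group.
group_mode = df.groupby("col1")["col2"].transform(first_mode)

# Only the missing entries of col2 are replaced.
filled = df["col2"].fillna(group_mode)
```

Existing non-null values are left untouched because `fillna` only writes where `col2` is missing.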

CodePudding user response:

I would use .assign and .apply to go through each row and then find the mode:

import pandas
import numpy

df = pandas.DataFrame({
    'col1':['gr1', 'gr2', 'gr1', 'gr1', 'gr2', 'gr3', 'gr2', numpy.nan], 
    'col2':['test1', 'test2', 'test', numpy.nan, numpy.nan, 'test3', numpy.nan, numpy.nan],
})

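def fill_value(x):
    # Only rows whose col2 is missing need a replacement.
    if pandas.isna(x['col2']):
        # Most frequent non-null col2 value among rows with the same col1.
        mode = df.loc[df['col1'] == x['col1'], 'col2'].mode()
        # mode is empty when the group has no non-null values (or col1 is NaN).
        return mode.iloc[0] if not mode.empty else numpy.nan
    return x['col2']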
    
df = df.assign(col2=df.apply(fill_value, axis=1))

output:

  col1   col2
0  gr1  test1
1  gr2  test2
2  gr1   test
3  gr1   test
4  gr2  test2
5  gr3  test3
6  gr2  test2
7  NaN    NaN
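A side note on the missing-value check in a row-wise function like this one: an identity test against `numpy.nan` only matches that exact object, so `pandas.isna` is the more robust choice. A small sketch (the variable names are illustrative only):

```python
import numpy as np
import pandas as pd

other_nan = float("nan")  # a NaN that is not the numpy.nan object

identity_check = other_nan is np.nan   # False: different object
isna_check = pd.isna(other_nan)        # True: recognized as missing
none_check = pd.isna(None)             # True: None also counts as missing
```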