I need to fill the null values in a column with a non-null value from the same group.
I tried using transform with mode, but it didn't do the job:
test['col2'] = test['col2'].transform(lambda x: x.fillna(x.mode()))
CodePudding user response:
Use GroupBy.transform with mode, select the first value if it exists (else None), and finally pass the result to Series.fillna:
s = df.groupby('col1')['col2'].transform(lambda x: next(iter(x.mode()), None))
df['col2'] = df['col2'].fillna(s)
print(df)
  col1   col2
0  gr1  test1
1  gr2  test2
2  gr1  test1
3  gr1  test1
4  gr2  test2
5  gr3  test3
6  gr2  test2
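For anyone who wants to run this end to end, here is a minimal self-contained sketch. The question does not show its input, so the sample data below is an assumption chosen to reproduce the output above, and first_mode is a hypothetical helper name that spells out the lambda.

```python
import numpy as np
import pandas as pd

# Assumed sample data: the question does not show its input frame.
df = pd.DataFrame({
    "col1": ["gr1", "gr2", "gr1", "gr1", "gr2", "gr3", "gr2"],
    "col2": ["test1", "test2", "test1", np.nan, np.nan, "test3", np.nan],
})


def first_mode(values):
    """Return the first mode of a group, or NaN if the group has no non-null value."""
    modes = values.mode()  # mode() ignores NaN by default
    return modes.iloc[0] if not modes.empty else np.nan


# transform broadcasts each group's mode back to the original row index,
# so the result lines up row by row with df['col2'].
group_mode = df.groupby("col1")["col2"].transform(first_mode)
df["col2"] = df["col2"].fillna(group_mode)
print(df)
```

Grouping matters here because fillna aligns a Series argument on the index: the result of transform shares the index of df, whereas the bare x.mode() in the question is indexed 0, 1, ... and only fills rows whose labels happen to match.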
CodePudding user response:
I would use .assign and .apply to go through each row and find the mode of its group:
import pandas
import numpy

df = pandas.DataFrame({
    'col1': ['gr1', 'gr2', 'gr1', 'gr1', 'gr2', 'gr3', 'gr2', numpy.nan],
    'col2': ['test1', 'test2', 'test', numpy.nan, numpy.nan, 'test3', numpy.nan, numpy.nan],
})

def fill_value(x):
    # pandas.isna also catches None and NaN values that are not the numpy.nan object itself
    if pandas.isna(x['col2']):
        # mode of col2 among the rows that share this row's group
        mode = df.loc[df['col1'] == x['col1'], 'col2'].mode()
        default = numpy.nan
        return mode.iloc[0] if not mode.empty else default
    else:
        return x['col2']

df = df.assign(col2=df.apply(fill_value, axis=1))
Output:
  col1   col2
0  gr1  test1
1  gr2  test2
2  gr1   test
3  gr1   test
4  gr2  test2
5  gr3  test3
6  gr2  test2
7  NaN    NaN
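The row-wise version recomputes the group's mode for every missing row. As a hedged alternative (a different technique from the one above, not a correction of it), the mode can be computed once per group and looked up with Series.map. The names first_mode and mode_by_group are made up for this sketch; the data is the same as above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": ["gr1", "gr2", "gr1", "gr1", "gr2", "gr3", "gr2", np.nan],
    "col2": ["test1", "test2", "test", np.nan, np.nan, "test3", np.nan, np.nan],
})


def first_mode(values):
    """Return the first mode of a group, or NaN if the group has no non-null value."""
    modes = values.mode()  # ties are returned sorted, so 'test' comes before 'test1'
    return modes.iloc[0] if not modes.empty else np.nan


# One mode per group; rows whose col1 is NaN are left out of the groupby by default.
mode_by_group = df.groupby("col1")["col2"].agg(first_mode)

# Look up each row's group mode (NaN where the group is unknown) and fill the gaps.
df["col2"] = df["col2"].fillna(df["col1"].map(mode_by_group))
print(df)
```

On this data it should give the same result as the apply version, including the last row, which stays NaN because its group is missing.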