I need to fill the null values in City
column based on Country
column. If the country is same then it should fill the NaN
value with the frequent city name corresponding to the country.
Input:
ID City Country
0 New York USA
1 NaN USA
2 London UK
3 Mumbai IND
4 Sydney AUS
5 NaN AUS
6 Sydney AUS
7 Brisbane AUS
Output:
ID City Country
0 New York USA
1 New York USA
2 London UK
3 Mumbai IND
4 Sydney AUS
5 Sydney AUS
6 Sydney AUS
7 Brisbane AUS
CodePudding user response:
Idea is replace possible empty strings to NaN
s and then replace values of group by first non NaN
s values:
df['City'] = (df.groupby('Country')['City']
.transform('first'))
Or forward and back filling missing values:
df['City'] = (df.groupby('Country')['City']
.transform(lambda x: x.ffill().bfill()))