Let's say I have a table of house cleaning services like this:
| Customer | House Address | Date |
| ------- | ------------- | -------- |
| Sam | London | 10/01/22 |
| Lina | Manchester | 12/01/22 |
| Sam | Null | 15/01/22 |
We know that Sam's house address should be London (assume that the customer ID is the same).
How can I fill the third row based on the first row?
Data:
{'Customer': ['Sam', 'Lina', 'Sam'],
'House Address': ['London', 'Manchester', nan],
'Date': ['10/01/22', '12/01/22', '15/01/22']}
CodePudding user response:
You could `groupby` "Customer" and `transform` `first` for "House Address" (`first` skips NaN values, so only London will be selected for Sam). This returns a Series with the same index as the original df, filled with each customer's first non-null address. Then pass this to `fillna` to fill the NaN values in "House Address":
df['House Address'] = df['House Address'].fillna(df.groupby('Customer')['House Address'].transform('first'))
Output:
Customer House Address Date
0 Sam London 10/01/22
1 Lina Manchester 12/01/22
2 Sam London 15/01/22
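
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes the `nan` in the posted data is `numpy.nan`; the intermediate name `first_address` is only for illustration and is not part of the original one-liner.

```python
import numpy as np
import pandas as pd

# Rebuild the example table from the question (nan assumed to be numpy.nan).
df = pd.DataFrame({
    'Customer': ['Sam', 'Lina', 'Sam'],
    'House Address': ['London', 'Manchester', np.nan],
    'Date': ['10/01/22', '12/01/22', '15/01/22'],
})

# First non-null address per customer, broadcast back to every row of that customer.
first_address = df.groupby('Customer')['House Address'].transform('first')

# Fill only the missing addresses; rows that already have a value are left unchanged.
df['House Address'] = df['House Address'].fillna(first_address)

print(df)
```

Note that a customer with no known address in any row has nothing to fill from, so those rows would stay NaN.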