I have the following dataframe:
id phone email
10352897
10352897 10225967
10352897 [email protected]
10352897 10225967 [email protected]
10225967
10225967 [email protected]
[email protected]
23578910
23578910 38256789
23578910 [email protected]
23578910 38256789 [email protected]
38256789
38256789 [email protected]
[email protected]
I would like that when it coincides in at least one of the three columns, the record joins and prioritizes the filled fields over the empty ones, in the end in this example I would expect the following output:
id phone email
10352897 10225967 [email protected]
23578910 38256789 [email protected]
How would you go about doing this?
CodePudding user response:
Use pandas.core.groupby.DataFrameGroupBy.aggregate
as follows
df_new = df.groupby('id').agg(lambda x: x.value_counts().index[0]).reset_index()
[Out]:
id phone email
0 10352897 10225967 [email protected]
1 23578910 38256789 [email protected]