Table :-
I/P
user_id | channel
123. | online
123. | Null
123 |
Group by user_id and max of channel
O/P
user_id | channel
123. | online
in SQL it is done using this query
select user_id,max(channel) from table_name group by user_id
the same functionality I need in pandas Dataframe
CodePudding user response:
Gedas Miksenas is right... But in your data there is blank & null too, so might through error, it's better to first convert the channel column to str then do groupby... like this...
df = pd.DataFrame({"user_id":[123,123,123],
"channel":["online",np.nan,""]})
df['channel'] = df['channel'].astype(str)
df1 = df.groupby('user_id')['channel'].max().reset_index()
# Output
user_id channel
0 123 online