Existing DataFrame:
Id col_1 col_2 col_3 col_4
A 3 6 6 2
A 3 6 6 5
A 3 6 6 4
B 2 4 4 6
B 2 4 4 6
Expected DataFrame:
Id col_1 col_2 col_3
A 3 6 6
B 2 4 4
I am trying to get the first appearance of the value in selected columns for each Id.
I know that new_df = df.groupby('Id')['col_1'].first().reset_index()
returns the first value for a single column, but is there a way to get the first value for multiple (required) columns at once?
CodePudding user response:
For aggregation, add a list of column names after groupby:
cols = ['col_1','col_2','col_3']
new_df = df.groupby('Id', as_index=False, sort=False)[cols].first()
print(new_df)
Id col_1 col_2 col_3
0 A 3 6 6
1 B 2 4 4
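For anyone who wants to try this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample data in the question; the construction step is an assumption, since the question does not show how df was created.

```python
import pandas as pd

# Sample data copied from the question (the construction itself is assumed).
df = pd.DataFrame({
    "Id": ["A", "A", "A", "B", "B"],
    "col_1": [3, 3, 3, 2, 2],
    "col_2": [6, 6, 6, 4, 4],
    "col_3": [6, 6, 6, 4, 4],
    "col_4": [2, 5, 4, 6, 6],
})

# Columns whose first value per Id is required.
cols = ["col_1", "col_2", "col_3"]

# as_index=False keeps Id as a regular column; sort=False keeps the
# groups in order of first appearance instead of sorting the keys.
new_df = df.groupby("Id", as_index=False, sort=False)[cols].first()
print(new_df)
```

Passing a list after groupby returns a DataFrame rather than a Series, so no reset_index() call is needed here.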
Or a solution without groupby, using DataFrame.drop_duplicates and selecting columns by name (the Id column is added to the list):
cols = ['Id','col_1','col_2','col_3']
new_df = df.drop_duplicates('Id')[cols]
print(new_df)
Id col_1 col_2 col_3
0 A 3 6 6
3 B 2 4 4
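The two approaches give the same values on the sample data, but they can differ when missing values are present: GroupBy.first() takes the first non-null value per column, while drop_duplicates keeps the first row as-is. A small sketch of the difference follows; the frame df_nan and its values are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first row for Id "A" has a missing col_1.
df_nan = pd.DataFrame({
    "Id": ["A", "A", "B"],
    "col_1": [np.nan, 3.0, 2.0],
    "col_2": [6.0, 7.0, 4.0],
})

cols = ["col_1", "col_2"]

# groupby + first: first non-null value of each column within the group.
by_group = df_nan.groupby("Id", as_index=False, sort=False)[cols].first()

# drop_duplicates: the first row of each Id, NaN included.
# reset_index(drop=True) renumbers the rows 0, 1, ... like the groupby result.
by_row = df_nan.drop_duplicates("Id")[["Id"] + cols].reset_index(drop=True)

print(by_group)
print(by_row)
```

If the first row per Id is what is wanted regardless of missing values, drop_duplicates is the safer choice of the two.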