pandas groupby, then aggregate by a 2nd column and find corresponding value in a 3rd column

Time:07-30

I have a table with 3 main columns. I would like to first group the data by Company ID, then get the Highest Post Valuation per Company ID, and its corresponding Deal Date.

Question: How do I also get the Deal Date that corresponds to each company's highest Post Valuation?

The data:

     Company ID  Post Valuation   Deal Date
60    119616-85             NaN  2022-03-01
80    160988-50            6.77  2022-02-10
85    108827-47             NaN  2022-02-01
89    154876-33            1.40  2022-01-27
104   435509-92            6.16  2022-01-05
107   186777-73           17.26  2022-01-03
111   232001-47             NaN  2022-01-01
113   160988-50             NaN  2021-12-31
119   114196-78             NaN  2021-12-15
128   481375-00            2.82  2021-12-01
130   128348-20             NaN  2021-11-25
131   166855-60          658.36  2021-11-25
150   113503-87             NaN  2021-10-20
156   178448-68           21.75  2021-10-07
170   479007-64             NaN  2021-09-13
182   128479-51             NaN  2021-09-01
185   113503-87             NaN  2021-08-31
186   128348-20             NaN  2021-08-30
191   108643-42            8.02  2021-08-13
192   186272-74             NaN  2021-08-12

The attempt

df_X.sort_values('Post Valuation', ascending=True).groupby('Company ID', as_index=False)['Post Valuation'].first()

CodePudding user response:

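Sort and drop duplicates. sort_values puts NaN last by default, so pass na_position='first'; otherwise keep='last' would pick a NaN row for any company that has both a valuation and a missing one (e.g. 160988-50):

result = df.sort_values('Post Valuation', na_position='first').drop_duplicates(subset='Company ID', keep='last')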
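
To check the idea on a few rows copied from the question, here is a minimal sketch. It assumes Deal Date has been parsed as datetimes and that missing valuations should be sorted first so they never win over a real value. The names sample, by_sort, valued and by_idxmax are illustrative only and do not come from the original post. The second variant uses groupby with idxmax as an alternative to sorting; it drops companies that have no valuation at all.

```python
import pandas as pd

nan = float("nan")

# A few rows copied from the question (original index labels kept).
sample = pd.DataFrame(
    {
        "Company ID": ["160988-50", "154876-33", "160988-50", "128348-20", "128348-20"],
        "Post Valuation": [6.77, 1.40, nan, nan, nan],
        "Deal Date": pd.to_datetime(
            ["2022-02-10", "2022-01-27", "2021-12-31", "2021-11-25", "2021-08-30"]
        ),
    },
    index=[80, 89, 113, 130, 186],
)

# Variant 1: sort ascending with NaN first, then keep the last row per company.
# The last row of each company is then its highest non-missing valuation
# (or a NaN row if the company has no valuation at all).
by_sort = (
    sample.sort_values("Post Valuation", na_position="first")
          .drop_duplicates(subset="Company ID", keep="last")
)

# Variant 2: drop missing valuations, then look up the row label of the
# maximum per company and select those full rows (Deal Date included).
valued = sample.dropna(subset=["Post Valuation"])
by_idxmax = valued.loc[valued.groupby("Company ID")["Post Valuation"].idxmax()]

print(by_sort)
print(by_idxmax)
```

Both variants return whole rows, so the Deal Date comes along with the selected valuation without a separate merge.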