I have a pandas dataframe with several columns. The data type of 2 columns (including 'report_end_d' and 'report_start_d') is date. I want to basically get the difference between that two columns and find the maximum length. (That I can do) Also, I want to find the values for that maximum value. I mean dates for report_end_d, report_start_d columns. If anyone can give a suggestion that would be a great help! Thanks in advance!
# here are my dataframe
report_sales report_end_d report_start_d
0 342 2021-09-04 00:00:00 2021-06-13 00:00:0
1 231 2021-08-29 00:00:00 2021-05-23 00:00:00
2 124 2021-09-04 00:00:00 2021-07-11 00:00:00
3 56 2021-09-04 00:00:00 2021-07-25 00:00:00
4 76 2021-08-28 00:00:00 2021-05-22 00:00:00
dss['length'] = (dss['report_end_d'] - dss['report_start_d'])
report_sales report_end_d report_start_d length
0 342 2021-09-04 00:00:00 2021-06-13 00:00:00 83 days
1 231 2021-08-29 00:00:00 2021-05-23 00:00:00 98 days
2 124 2021-09-04 00:00:00 2021-07-11 00:00:00 55 days
3 56 2021-09-04 00:00:00 2021-07-25 00:00:00 41 days
4 76 2021-08-28 00:00:00 2021-05-22 00:00:00 98 days
So I basically need either index 1 values (report_end_d as - 2021-08-29 00:00:00 report_start_d as- 2021-05-23 00:00:00) or index 4 values (report_end_d as -2021-08-28 00:00:00 report_start_d as- 2021-05-22 00:00:00)
Thanks in advance!
CodePudding user response:
largest = dss['length'].max()
dss[['report_end_d','report_start_d']] [dss['length'] == largest]