How do you get the minimum of a group and the value in the corresponding row of another column in pandas?

Time:07-20

How do I get the minimum date of each group in pandas, together with the value from a different column in the row where that minimum date occurs?

import pandas as pd

data = {'date_value': ['2014-06-22', '2014-06-24', '2014-06-25', '2014-06-25', '2014-07-02', '2014-07-06', '2014-07-05', '2014-07-27'],
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C'],
    'sum_col': [1, 2, 3, 5, 4, 3, 2, 1]}
df = pd.DataFrame(data, columns=['date_value', 'type', 'sum_col'])
df['date_value'] = pd.to_datetime(df['date_value'])
# My attempt: 'min' is taken for each column independently within a group
# (a list of columns needs double brackets in recent pandas versions)
df[['min_date','new_sum_col']] = df.groupby('type')[['date_value','sum_col']].transform('min')

The result I get is:

  date_value type  sum_col   min_date  new_sum_col
0 2014-06-22    A        1 2014-06-22            1
1 2014-06-24    A        2 2014-06-22            1
2 2014-06-25    A        3 2014-06-22            1
3 2014-06-25    B        5 2014-06-25            2
4 2014-07-02    B        4 2014-06-25            2
5 2014-07-06    B        3 2014-06-25            2
6 2014-07-05    B        2 2014-06-25            2
7 2014-07-27    C        1 2014-07-27            1

The result I want is:

   date_value type  sum_col    min_date  new_sum_col
0  2014-06-22    A        1  2014-06-22            1
1  2014-06-24    A        2  2014-06-22            1
2  2014-06-25    A        3  2014-06-22            1
3  2014-06-25    B        5  2014-06-25            5
4  2014-07-02    B        4  2014-06-25            5
5  2014-07-06    B        3  2014-06-25            5
6  2014-07-05    B        2  2014-06-25            5
7  2014-07-27    C        1  2014-07-27            1

Any help would be appreciated.

Thank you,

Sam

CodePudding user response:

You can sort by date_value and then use transform('first'), which gives both column values from the row with the minimum date in each group:

df[['min_date', 'new_sum_col']] = df.sort_values('date_value', ascending=True).groupby('type')[['date_value', 'sum_col']].transform('first')
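
For anyone who wants to try this end to end, here is a minimal sketch of the sort-then-first idea on the data from the question. The names ordered and first_rows are only illustrative, and kind='stable' is my own addition so that rows with tied dates keep their original order. Note that 'first' takes the first non-null value of each column, so this assumes sum_col has no missing values.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    'date_value': pd.to_datetime(['2014-06-22', '2014-06-24', '2014-06-25', '2014-06-25',
                                  '2014-07-02', '2014-07-06', '2014-07-05', '2014-07-27']),
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C'],
    'sum_col': [1, 2, 3, 5, 4, 3, 2, 1],
})

# Sort so the earliest date comes first within each group.
# A stable sort (an assumption added here) keeps tied dates in their original order.
ordered = df.sort_values('date_value', kind='stable')

# Broadcast the first row's values of each group to every row of that group.
# The result keeps the original index labels, so the assignments below align by index.
first_rows = ordered.groupby('type')[['date_value', 'sum_col']].transform('first')

df['min_date'] = first_rows['date_value']
df['new_sum_col'] = first_rows['sum_col']

print(df)
```

Assigning the two columns one at a time is just a cautious choice; it makes the index alignment explicit and keeps each column's dtype.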

CodePudding user response:

Alternative method using idxmin:

df[['min_date', 'new_sum_col']] = df.loc[df.groupby('type')['date_value'].transform('idxmin'), ['date_value', 'sum_col']].values
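
A variation on the same idxmin idea, as a rough sketch: compute idxmin once per group, map it back onto the rows through the type column, and look up each column separately. The names min_idx and row_labels are only illustrative. Looking the columns up one at a time is my own choice here, because calling .values on a datetime column and an integer column together gives an object array, which may leave both new columns with object dtype.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    'date_value': pd.to_datetime(['2014-06-22', '2014-06-24', '2014-06-25', '2014-06-25',
                                  '2014-07-02', '2014-07-06', '2014-07-05', '2014-07-27']),
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C'],
    'sum_col': [1, 2, 3, 5, 4, 3, 2, 1],
})

# Index label of the row holding the earliest date in each group (one label per type).
min_idx = df.groupby('type')['date_value'].idxmin()

# Broadcast that label to every row of the same group.
row_labels = df['type'].map(min_idx)

# Look up each column separately so the datetime and integer dtypes are preserved.
# to_numpy() drops the looked-up index so values are assigned positionally.
df['min_date'] = df.loc[row_labels, 'date_value'].to_numpy()
df['new_sum_col'] = df.loc[row_labels, 'sum_col'].to_numpy()

print(df)
```

This relies on the index labels of df being unique, since idxmin returns labels and df.loc looks rows up by label.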