Home > Blockchain >  Python Pandas finds cumulative Min per group
Python Pandas finds cumulative Min per group

Time:09-17

I tried to find the Desired_Output column, which is defined as follow: For each Name and Subj group, find the Min of all previous Score.

Name    Date    Subj    Score   Desired_Output
A   2022-05-11  1200    70.88   69.60
A   2022-03-20  1200    69.96   69.60
A   2022-02-23  1200    69.60   69.63
A   2022-01-26  1200    69.63   70.22
A   2022-01-05  1200    70.35   70.22
A   2021-12-08  1200    70.22   70.69
A   2021-11-17  1000    56.73   null
A   2021-11-10  1200    70.69   null
B   2022-05-07  1600    96.16   96.53
B   2022-04-24  1600    94.53   null
B   2022-03-20  2000    124.60  null
B   2022-02-27  1800    109.16  null
B   2022-02-03  1400    82.54   null

Here is the dataset:

pd.DataFrame({
 'Name': ['A','A','A','A','A','A','A','A','B','B','B','B','B'],
 'Date': ['2022-05-11','2022-03-20','2022-02-23','2022-01-26','2022-01-05','2021-12-08','2021-11-17','2021-11-10','2022-05-07','2022-04-24','2022-03-20','2022-02-27','2022-02-03'],
 'Subj': [1200,1200,1200,1200,1200,1200,1000,1200,1600,1600,2000,1800,1400],
 'Score': [70.88,69.96,69.6,69.63,70.35,70.22,56.73,70.69,96.16,94.53,124.6,109.16,82.54]})

I don't know how to achieve that in Pandas, especially without looping the DataFrame.

CodePudding user response:

Assuming the dates are sorted in reverse order, you can use a reversed cummin shift per group:

df['Desired'] = (df[::-1]
                 .groupby(['Name', 'Subj'])['Score']
                 .apply(lambda s: s.cummin().shift())
                 )

Output:

   Name        Date  Subj   Score  Desired
0     A  2022-05-11  1200   70.88    69.60
1     A  2022-03-20  1200   69.96    69.60
2     A  2022-02-23  1200   69.60    69.63
3     A  2022-01-26  1200   69.63    70.22
4     A  2022-01-05  1200   70.35    70.22
5     A  2021-12-08  1200   70.22    70.69
6     A  2021-11-17  1000   56.73      NaN
7     A  2021-11-10  1200   70.69      NaN
8     B  2022-05-07  1600   96.16    94.53
9     B  2022-04-24  1600   94.53      NaN
10    B  2022-03-20  2000  124.60      NaN
11    B  2022-02-27  1800  109.16      NaN
12    B  2022-02-03  1400   82.54      NaN
  • Related