How to add a row cell value based on other rows' cell values in Pandas?

I have a pandas DataFrame like the one below:

Id	Date	Aid
0	2022-01-01	one
1	2022-01-01	two
2	2022-01-05	one
3	2022-01-06	three
4	2022-01-02	one
5	2022-01-01	one

I would like to add a column "Counter" which, for each row, contains the number of rows that have the same "Aid" and a "Date" less than or equal to that row's "Date" (the row itself included).

So, for row 4, 'Counter' = 3 (rows 0, 4, 5).

The result should be:

Id	Date	Aid	Counter
0	2022-01-01	one	2
1	2022-01-01	two	1
2	2022-01-05	one	4
3	2022-01-06	three	1
4	2022-01-02	one	3
5	2022-01-01	one	2

Is there a way to do this count in pandas?

The original DataFrame contains millions of rows, so efficiency is very important.

CodePudding user response:

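Rank the dates within each "Aid" group using method='max'. With that method, tied dates all receive the highest rank in the tie, which equals the number of rows in the group whose date is less than or equal to the row's date: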

pd.to_datetime(df['Date']).groupby(df['Aid']).rank(method='max').astype('int')

output:

0    2
1    1
2    4
3    1
4    3
5    2
Name: Date, dtype: int32
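
To see why the rank gives the requested count, here is a minimal sketch that rebuilds the sample data and compares the rank-based result with a slow row-by-row count. The helper name brute_force_counter is hypothetical and only for checking, and treating "Id" as an ordinary column is an assumption about the original data. It also assumes no missing dates.

```python
import pandas as pd

# Sample data from the question ("Id" kept as a regular column: an assumption).
df = pd.DataFrame({
    "Id": [0, 1, 2, 3, 4, 5],
    "Date": ["2022-01-01", "2022-01-01", "2022-01-05",
             "2022-01-06", "2022-01-02", "2022-01-01"],
    "Aid": ["one", "two", "one", "three", "one", "one"],
})

def brute_force_counter(frame):
    """Hypothetical reference check: count matching rows one row at a time."""
    dates = pd.to_datetime(frame["Date"])
    counts = []
    for i in frame.index:
        same_aid = frame["Aid"] == frame.at[i, "Aid"]
        not_later = dates <= dates.at[i]
        counts.append(int((same_aid & not_later).sum()))
    return pd.Series(counts, index=frame.index)

# Vectorized version from the answer: max-rank of each date within its Aid group.
ranked = pd.to_datetime(df["Date"]).groupby(df["Aid"]).rank(method="max").astype("int64")

slow = brute_force_counter(df)
print(ranked.tolist())  # [2, 1, 4, 1, 3, 2]
print(slow.tolist())    # same values, computed the slow way
```

The loop is quadratic and only meant to confirm the logic on a small sample; the rank-based line is the one intended for large frames.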

To store the output as a "Counter" column:

df.assign(Counter=pd.to_datetime(df['Date']).groupby(df['Aid']).rank(method='max').astype('int'))

result:

Id  Date        Aid     Counter
0   2022-01-01  one     2
1   2022-01-01  two     1
2   2022-01-05  one     4
3   2022-01-06  three   1
4   2022-01-02  one     3
5   2022-01-01  one     2
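
Note that df.assign returns a new DataFrame and leaves df unchanged, so the result has to be kept in a variable. A minimal end-to-end sketch, assuming the sample data above with "Id" as a regular column; the helper name add_counter is hypothetical:

```python
import pandas as pd

def add_counter(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy of frame with a Counter column.

    Assumes every row has a valid Date; a missing date would produce a NaN
    rank, and the integer cast would then fail.
    """
    dates = pd.to_datetime(frame["Date"])
    counter = dates.groupby(frame["Aid"]).rank(method="max").astype("int64")
    return frame.assign(Counter=counter)

df = pd.DataFrame({
    "Id": [0, 1, 2, 3, 4, 5],
    "Date": ["2022-01-01", "2022-01-01", "2022-01-05",
             "2022-01-06", "2022-01-02", "2022-01-01"],
    "Aid": ["one", "two", "one", "three", "one", "one"],
})

result = add_counter(df)  # df itself is not modified
print(result)
```

If modifying the frame in place is preferred, assigning the same expression to df['Counter'] works as well.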