Populate a column with the history of previous variables (Python)

I am currently taking a course in Python and for a project I am looking to do something specific without knowing how to go about it.

I have a dataset that contains information about tennis matches. For each match I have, among other things, the date and the pair of players who played it.

I would like to create a new column 'Match History' which, for each row, checks the previous rows to see whether a match has already taken place between the same two players.

I have tried to find a way to do this without writing nested loops, which would not be optimal, but I am stuck.

Is there a function, or a library in particular that would make this task easier?

Thanks in advance!

CodePudding user response:

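You can use groupby with cumcount, which numbers the rows within each group starting from 0. If the value is greater than 0, it is not the first match between those players, and the value itself tells you how many earlier matches they have played. Note that this assumes the rows are already sorted by date and that a given pair always appears in the same column order: a match stored as (A, B) and one stored as (B, A) would be counted as different groups.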

df['Match History'] = df.groupby(['player1','player2']).cumcount()
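
If the same two players can appear in either column order, one possible variation is to group on an order-independent key instead. The following is only a sketch: the sample data is made up, and the column names 'date', 'player1' and 'player2', as well as the extra 'Played Before' column, are assumptions rather than anything taken from the original dataset.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-01-10", "2021-02-05", "2021-03-01", "2021-04-12", "2021-05-20"]
    ),
    "player1": ["Nadal", "Federer", "Djokovic", "Nadal", "Federer"],
    "player2": ["Federer", "Nadal", "Nadal", "Federer", "Djokovic"],
})

# Sort chronologically so that "previous rows" really are earlier matches.
df = df.sort_values("date").reset_index(drop=True)

# Build a key that is the same for (A, B) and (B, A) by sorting the two names.
pair_key = pd.Series(
    [" vs ".join(sorted(pair)) for pair in zip(df["player1"], df["player2"])],
    index=df.index,
)

# cumcount numbers the rows within each pair starting from 0,
# i.e. it gives the number of earlier matches between the two players.
df["Match History"] = df.groupby(pair_key).cumcount()

# Optional boolean flag: has this pair met before?
df["Played Before"] = df["Match History"] > 0

print(df)
```

With this sample, the third Nadal/Federer meeting gets a 'Match History' of 2 even though the players are listed in a different order in one of the earlier rows.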