add specific competition to each row in pandas dataframe

Time:05-22

I have a dataframe of soccer games, but the competition (tournament) for each game is in a separate row above the games it applies to:

match           date        country
Italy Serie A   NaN         NaN
team1 vs team2  2022-05-30  Italy
team6 vs team2  2022-05-29  Italy
Germany Bun.    NaN         NaN
team3 vs team4  2022-06-01  Germany
team9 vs team10 2022-06-01  Germany

I want to add the competition to each game's row, like below:

competition     match           date        country
Italy Serie A   team1 vs team2  2022-05-30  Italy
Italy Serie A   team6 vs team2  2022-05-29  Italy
Germany Bun.    team3 vs team4  2022-06-01  Germany
Germany Bun.    team9 vs team10 2022-06-01  Germany

CodePudding user response:

Use a column that is NaN only on the competition rows (e.g. date) to identify those rows. Keep the match values on those rows, forward-fill them down onto the games that follow, then drop the competition rows:

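# True on the rows that hold only a competition name
mask = df['date'].isna()

df2 = (df
 # keep 'match' only on competition rows, then forward-fill onto the games below
 .assign(competition=df['match'].where(mask).ffill())
 # drop the competition rows themselves
 .loc[~mask]
)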

output:

             match        date  country    competition
1   team1 vs team2  2022-05-30    Italy  Italy Serie A
2   team6 vs team2  2022-05-29    Italy  Italy Serie A
4   team3 vs team4  2022-06-01  Germany   Germany Bun.
5  team9 vs team10  2022-06-01  Germany   Germany Bun.
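
If you want to try this end to end, here is a minimal self-contained sketch. It assumes the frame looks like the sample in the question (rebuilt by hand here, with dates left as strings), and it places competition as the first column to match the layout asked for. The names `competition` and `result` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; competition rows have NaN date/country.
df = pd.DataFrame({
    "match": ["Italy Serie A", "team1 vs team2", "team6 vs team2",
              "Germany Bun.", "team3 vs team4", "team9 vs team10"],
    "date": [np.nan, "2022-05-30", "2022-05-29",
             np.nan, "2022-06-01", "2022-06-01"],
    "country": [np.nan, "Italy", "Italy", np.nan, "Germany", "Germany"],
})

# True on the rows that hold only a competition name.
mask = df["date"].isna()

# Keep 'match' only on competition rows, then forward-fill onto the games below.
competition = df["match"].where(mask).ffill()

# Drop the competition rows and put the new column first.
result = df.loc[~mask].copy()
result.insert(0, "competition", competition[~mask])
result = result.reset_index(drop=True)

print(result)
```

This relies on every game row having a non-null date. If some games could have a missing date, pick a different column (or condition) to build the mask.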