I have a dataframe of soccer games, but the competition (tournament) for each group of games is stored in a separate row above those games:
match            date        country
Italy Serie A    NaN         NaN
team1 vs team2   2022-05-30  Italy
team6 vs team2   2022-05-29  Italy
Germany Bun.     NaN         NaN
team3 vs team4   2022-06-01  Germany
team9 vs team10  2022-06-01  Germany
I want to add the competition to each game's row as its own column, like below:
competition    match            date        country
Italy Serie A  team1 vs team2   2022-05-30  Italy
Italy Serie A  team6 vs team2   2022-05-29  Italy
Germany Bun.   team3 vs team4   2022-06-01  Germany
Germany Bun.   team9 vs team10  2022-06-01  Germany
CodePudding user response:
You can use a column that is NaN only on the competition rows (e.g. date) to identify those rows. Keep the match values on those rows only, forward-fill (ffill) them
into a new competition column, then drop the competition rows:
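# True on the rows that hold a competition name instead of a game
mask = df['date'].isna()
df2 = (df
   # keep 'match' only on competition rows, then forward-fill it downwards
   .assign(competition=df['match'].where(mask).ffill())
   # drop the competition rows, keeping only the games
   .loc[~mask]
)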
Output (the new competition column is appended last):
             match        date  country    competition
1   team1 vs team2  2022-05-30    Italy  Italy Serie A
2   team6 vs team2  2022-05-29    Italy  Italy Serie A
4   team3 vs team4  2022-06-01  Germany   Germany Bun.
5  team9 vs team10  2022-06-01  Germany   Germany Bun.
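For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same where/ffill approach. The sample frame is rebuilt by hand from the question, with dates kept as plain strings, so treat the construction as an assumption about what the real data looks like. The column reordering and reset_index at the end are optional extras, not part of the answer above; they are only there to match the column order asked for in the question.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: dates are plain strings).
# Competition rows have NaN in both 'date' and 'country'.
df = pd.DataFrame({
    "match": ["Italy Serie A", "team1 vs team2", "team6 vs team2",
              "Germany Bun.", "team3 vs team4", "team9 vs team10"],
    "date": [np.nan, "2022-05-30", "2022-05-29",
             np.nan, "2022-06-01", "2022-06-01"],
    "country": [np.nan, "Italy", "Italy", np.nan, "Germany", "Germany"],
})

# True on the rows that hold a competition name instead of a game.
mask = df["date"].isna()

# Keep 'match' only on competition rows (NaN elsewhere), then forward-fill
# so every game row inherits the competition listed above it.
competition = df["match"].where(mask).ffill()

df2 = (
    df.assign(competition=competition)
      # Drop the competition rows and put 'competition' first.
      .loc[~mask, ["competition", "match", "date", "country"]]
      # Optional: renumber the rows from 0.
      .reset_index(drop=True)
)

print(df2)
```

This relies on every competition row appearing directly above its games and on date never being missing for a real game; if a game could lack a date, a stricter mask such as df[['date', 'country']].isna().all(axis=1) may be safer.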