Home > database >  How to combine every 2 rows into 1 row
How to combine every 2 rows into 1 row

Time:03-26

I am trying to fix this NBA game detail table. Every two rows are the same game(GAME_ID) in the table. I want the two rows to be in one row so that the home team and the away team's data will be on the same row.

NBA game details

CodePudding user response:

You can use set_index and unstack, then flatten headers.

Here is come code I did for baseball games, I had to create my own Game No:

# source datafrome
year = '2020'
df = pd.read_excel('https://www.sportsbookreviewsonline.com/scoresoddsarchives/mlb/mlb odds 2011.xlsx')
df = df.dropna(how="all").copy()

df["Date"] = df["Date"].astype(int)  # To handle excel float type in some years
df["Date"] = pd.to_datetime(
    f"{year}"   df["Date"].astype(str).str.rjust(4, "0"), format="%Y%m%d"
)

cols = ["Date", "VH", "Team", "Final", "Close"]
df = df[cols]

df = df[df["VH"].isin(["V", "H"])]

# Flatten teams lines in to games
df["Game No"] = (df["VH"] == "V").cumsum()
df = df.set_index(["Game No", "Date", "VH"]).unstack()
df.columns = df.columns.map("_".join)

df = df.reset_index()

Output:

      Game No       Date Team_H Team_V  Final_H  Final_V  Close_H  Close_V
0           1 2020-03-31    WAS    ATL        0        2      119     -139
1           2 2020-03-31    CIN    MIL        7        6     -110     -110
2           3 2020-03-31    STL    SDG        3        5     -178      158
3           4 2020-03-31    LOS    SFO        2        1     -114     -106
4           5 2020-03-31    NYY    DET        6        3     -152      132
...       ...        ...    ...    ...      ...      ...      ...      ...
2462     2463 2020-10-22    TEX    STL        7       16     -183      163
2463     2464 2020-10-23    TEX    STL        4        0     -171      151
2464     2465 2020-10-24    TEX    STL        4        2     -130      110
2465     2466 2020-10-27    STL    TEX       10        9     -114     -106
2466     2467 2020-10-28    STL    TEX        6        2     -130      110

[2467 rows x 8 columns]
  • Related