I have one main DataFrame titled nflLineups.
I am looking to merge 3 more DataFrames: dfPass, dfRush, dfReceive with the first DF, nflLineups.
So far nothing I have tried has worked. Tried appending, concatenation, and merging -- merge how='left', how='outer', on = 'Name', etc.
My goal is to have one large output that merges the data on Name but maintains all of the columns and their respective values.
The main output should have the following columns: Name, Team, Position, passYrds, rushYrds, recYrds. I would just like the stat data (pass, rush, rec) to fill in their respective rows next to the player's name in nflLineups. Not every player has data in every category so those values should be left blank (n/a).
I see that there are some merging examples on Stack but have yet to find code that I can use successfully. Spent the last 2 days messing with this and could use some help if possible. Still learning how to merge data and consider myself a relatively new with Python.
Any help would be greatly appreciated.
Here is my code so far:
import pandas as pd
nflLineups = pd.DataFrame([{'Name': 'Teddy', 'Team': 'DEN', 'Position': 'QB'},
{'Name': 'Melvin', 'Team': 'DEN', 'Position': 'RB'},
{'Name': 'Courtland', 'Team': 'DEN', 'Position': 'WR'},
{'Name': 'Tim', 'Team': 'DEN', 'Position': 'WR'},
{'Name': 'Kendal', 'Team': 'DEN', 'Position': 'WR'},
{'Name': 'Noah', 'Team': 'DEN', 'Position': 'TE'},
{'Name': 'Case', 'Team': 'CLE', 'Position': 'QB'},
{'Name': 'D Ernest', 'Team': 'CLE', 'Position': 'RB'},
{'Name': 'Odell', 'Team': 'CLE', 'Position': 'WR'},
{'Name': 'Jarvis', 'Team': 'CLE', 'Position': 'WR'},
{'Name': 'Donovan', 'Team': 'CLE', 'Position': 'WR'},
{'Name': 'Austin', 'Team': 'CLE', 'Position': 'TE'},])
dfPass = pd.DataFrame([{'Name': 'Teddy', 'Team': 'DEN', 'Position': 'QB', 'passYrds': 1500},
{'Name': 'Case', 'Team': 'CLE', 'Position': 'QB', 'passYrds': 1350}])
dfRun = pd.DataFrame([{'Name': 'Teddy', 'Team': 'DEN', 'Position': 'QB', 'rushYrds': 45},
{'Name': 'D Ernest', 'Team': 'CLE', 'Position': 'RB', 'rushYrds': 350}])
dfReceive = pd.DataFrame([{'Name': 'D Ernest', 'Team': 'CLE', 'Position': 'RB', 'recYrds': 68},
{'Name': 'Jarvis', 'Team': 'CLE', 'Position': 'WR', 'recYrds': 250}])
CodePudding user response:
Or with merge:
df_main = nflLineups.merge(dfPass, how='left', on=['Name', 'Team', 'Position']).merge(dfRun, how='left', on=['Name', 'Team', 'Position']).merge(dfReceive, how='left', on=['Name', 'Team', 'Position'])
Output:
Name Team Position passYrds rushYrds recYrds
0 Teddy DEN QB 1500.0 45.0 NaN
1 Melvin DEN RB NaN NaN NaN
2 Courtland DEN WR NaN NaN NaN
3 Tim DEN WR NaN NaN NaN
4 Kendal DEN WR NaN NaN NaN
5 Noah DEN TE NaN NaN NaN
6 Case CLE QB 1350.0 NaN NaN
7 D Ernest CLE RB NaN 350.0 68.0
8 Odell CLE WR NaN NaN NaN
9 Jarvis CLE WR NaN NaN 250.0
10 Donovan CLE WR NaN NaN NaN
11 Austin CLE TE NaN NaN NaN
CodePudding user response:
IIUC, one way using pandas.DataFrame.groupby.first
after pandas.concat
.
Note that I assumed Team
and Position
are same for each Name
.
df = pd.concat([nflLineups, dfPass, dfRun, dfReceive])
df = df.groupby("Name", sort=False).first()
Output:
Team Position passYrds rushYrds recYrds
Name
Teddy DEN QB 1500.0 45.0 NaN
Melvin DEN RB NaN NaN NaN
Courtland DEN WR NaN NaN NaN
Tim DEN WR NaN NaN NaN
Kendal DEN WR NaN NaN NaN
Noah DEN TE NaN NaN NaN
Case CLE QB 1350.0 NaN NaN
D Ernest CLE RB NaN 350.0 68.0
Odell CLE WR NaN NaN NaN
Jarvis CLE WR NaN NaN 250.0
Donovan CLE WR NaN NaN NaN
Austin CLE TE NaN NaN NaN