I pulled the key-value pairs below from an API: summary["awayBattingTotals"], summary["homeBattingTotals"], summary["teamInfo"]
Response:
({'namefield': 'Totals',
'ab': '33',
'r': '2',
'h': '7',
'hr': '1',
'rbi': '2',
'bb': '0',
'k': '8',
'lob': '13',
'avg': '',
'ops': '',
'obp': '',
'slg': '',
'name': 'Totals',
'position': '',
'note': '',
'substitution': False,
'battingOrder': '',
'personId': 0},
{'namefield': 'Totals',
'ab': '34',
'r': '4',
'h': '9',
'hr': '2',
'rbi': '4',
'bb': '1',
'k': '7',
'lob': '13',
'avg': '',
'ops': '',
'obp': '',
'slg': '',
'name': 'Totals',
'position': '',
'note': '',
'substitution': False,
'battingOrder': '',
'personId': 0},
{'away': {'id': 145,
'abbreviation': 'CWS',
'teamName': 'White Sox',
'shortName': 'Chi White Sox'},
'home': {'id': 118,
'abbreviation': 'KC',
'teamName': 'Royals',
'shortName': 'Kansas City'}})
How can I write this into a DataFrame using pandas? I tried using
pd.DataFrame.from_dict(summary)
but this gives me the error below:
ValueError: All arrays must be of the same length
CodePudding user response:
You are on the right path. This should work:
import pandas as pd
# Named `records` rather than `dict`, which would shadow the built-in type.
records = [{'namefield': 'Totals',
'ab': '33',
'r': '2',
'h': '7',
'hr': '1',
'rbi': '2',
'bb': '0',
'k': '8',
'lob': '13',
'avg': '',
'ops': '',
'obp': '',
'slg': '',
'name': 'Totals',
'position': '',
'note': '',
'substitution': False,
'battingOrder': '',
'personId': 0},
{'namefield': 'Totals',
'ab': '34',
'r': '4',
'h': '9',
'hr': '2',
'rbi': '4',
'bb': '1',
'k': '7',
'lob': '13',
'avg': '',
'ops': '',
'obp': '',
'slg': '',
'name': 'Totals',
'position': '',
'note': '',
'substitution': False,
'battingOrder': '',
'personId': 0},
{'away': {'id': 145,
'abbreviation': 'CWS',
'teamName': 'White Sox',
'shortName': 'Chi White Sox'},
'home': {'id': 118,
'abbreviation': 'KC',
'teamName': 'Royals',
'shortName': 'Kansas City'}}]
# A list of dictionaries becomes one row per dictionary.
pd.DataFrame(records)
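If the values are still in the summary dictionary, the same list can be built from its keys rather than typed out. This is a minimal sketch, assuming summary has the shape shown in the question; the summary below is a trimmed stand-in, not the real response. Note that the teamInfo row shares no keys with the batting rows, so it ends up with NaN in the batting columns and nested dictionaries in the away and home columns.

```python
import pandas as pd

# Stand-in for the API response; only a few batting fields are kept for brevity.
summary = {
    "awayBattingTotals": {"namefield": "Totals", "ab": "33", "r": "2", "h": "7"},
    "homeBattingTotals": {"namefield": "Totals", "ab": "34", "r": "4", "h": "9"},
    "teamInfo": {
        "away": {"id": 145, "abbreviation": "CWS", "teamName": "White Sox"},
        "home": {"id": 118, "abbreviation": "KC", "teamName": "Royals"},
    },
}

# One list entry per key; pandas turns each dictionary into a row.
keys = ["awayBattingTotals", "homeBattingTotals", "teamInfo"]
df = pd.DataFrame([summary[key] for key in keys], index=keys)

# Columns are the union of all keys; missing values are filled with NaN.
print(df)
```

Passing a list avoids the ValueError, but the mixed row is usually a sign that the team info belongs in a separate table.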
CodePudding user response:
The data contained in summary["teamInfo"] has a different structure from summary["awayBattingTotals"] and summary["homeBattingTotals"]. You can separate them into two groups:
import pandas as pd

batting_totals = [
{'namefield': 'Totals',
'ab': '33',
'r': '2',
'h': '7',
'hr': '1',
'rbi': '2',
'bb': '0',
'k': '8',
'lob': '13',
'avg': '',
'ops': '',
'obp': '',
'slg': '',
'name': 'Totals',
'position': '',
'note': '',
'substitution': False,
'battingOrder': '',
'personId': 0},
{'namefield': 'Totals',
'ab': '34',
'r': '4',
'h': '9',
'hr': '2',
'rbi': '4',
'bb': '1',
'k': '7',
'lob': '13',
'avg': '',
'ops': '',
'obp': '',
'slg': '',
'name': 'Totals',
'position': '',
'note': '',
'substitution': False,
'battingOrder': '',
'personId': 0}
]
# A list of dictionaries goes straight to the DataFrame constructor.
batting_df = pd.DataFrame(batting_totals)
output 1:
   namefield  ab  r  h  hr  rbi  bb  k  lob  avg  ops  obp  slg    name  position  note  substitution  battingOrder  personId
0     Totals  33  2  7   1    2   0  8   13                      Totals                         False                       0
1     Totals  34  4  9   2    4   1  7   13                      Totals                         False                       0
and
team_info = {'away': {'id': 145,
'abbreviation': 'CWS',
'teamName': 'White Sox',
'shortName': 'Chi White Sox'},
'home': {'id': 118,
'abbreviation': 'KC',
'teamName': 'Royals',
'shortName': 'Kansas City'}
}
team_info_df = pd.DataFrame.from_dict(team_info)
output 2:
                       away         home
id                      145          118
abbreviation            CWS           KC
teamName          White Sox       Royals
shortName     Chi White Sox  Kansas City
From these two DataFrames, you will probably need to find a way to map each row in batting_df to its team in team_info_df.
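One possible way to do that mapping is sketched below. It assumes the first batting row is the away team and the second is the home team, which is the order the totals were listed in the question. The side column is a name introduced only for this example, and the data is trimmed to a few fields.

```python
import pandas as pd

# Trimmed-down versions of the two data groups.
batting_totals = [
    {"namefield": "Totals", "ab": "33", "r": "2", "h": "7"},  # away
    {"namefield": "Totals", "ab": "34", "r": "4", "h": "9"},  # home
]
team_info = {
    "away": {"id": 145, "abbreviation": "CWS", "teamName": "White Sox"},
    "home": {"id": 118, "abbreviation": "KC", "teamName": "Royals"},
}

batting_df = pd.DataFrame(batting_totals)
batting_df["side"] = ["away", "home"]  # assumption: rows are in away, home order

# orient="index" makes each team a row instead of a column.
team_info_df = pd.DataFrame.from_dict(team_info, orient="index")
team_info_df = team_info_df.rename_axis("side").reset_index()

# Join the two tables on the shared "side" label.
merged = team_info_df.merge(batting_df, on="side")
print(merged[["side", "abbreviation", "ab", "r", "h"]])
```

Each row of merged then carries both the team identity and its batting totals. The counts are still strings, as returned by the API, so convert them with astype(int) before doing arithmetic.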