How to parse a nested dictionary in pandas/python - baseball API

Time:08-11

Hi, I have the API response below, which is part of a very long JSON string. I'm trying to parse the "away batting totals" (the awayBattingTotals entry) so that I can pull the info into a data frame. I'm using the function statsapi.boxscore_data(662647) to pull the full JSON response.

'awayBattingTotals': {'namefield': 'Totals', 'ab': '33', 'r': '2', 'h': '7', 'hr': '1', 'rbi': '2', 'bb': '0', 'k': '8', 'lob': '13',

This is taken from the full JSON response, which is displayed below.

The full JSON string is very long, so below is only a snippet. I'm trying to use the code below to pull the info, but I've been unsuccessful.


statsapi.boxscore_data(662647)

summary = statsapi.boxscore(662647)

result = summary[awayBattingfields]["Totals"]

print(result)

Below is a snippet from the response:

  {'namefield': '9 Lopez, N  SS',
   'ab': '3',
   'r': '0',
   'h': '1',
   'doubles': '0',
   'triples': '0',
   'hr': '0',
   'rbi': '0',
   'sb': '0',
   'bb': '0',
   'k': '0',
   'lob': '2',
   'avg': '.248',
   'ops': '.599',
   'personId': 670032,
   'battingOrder': '900',
   'substitution': False,
   'note': '',
   'name': 'Lopez, N',
   'position': 'SS',
   'obp': '.305',
   'slg': '.294'}],
 'awayBattingTotals': {'namefield': 'Totals',
  'ab': '33',
  'r': '2',
  'h': '7',
  'hr': '1',
  'rbi': '2',
  'bb': '0',
  'k': '8',
  'lob': '13',
  'avg': '',
  'ops': '',
  'obp': '',
  'slg': '',
  'name': 'Totals',
  'position': '',
  'note': '',
  'substitution': False,
  'battingOrder': '',
  'personId': 0},
 'homeBattingTotals': {'namefield': 'Totals',
  'ab': '34',
  'r': '4',
  'h': '9',
  'hr': '2',
  'rbi': '4',
  'bb': '1',
  'k': '7',
  'lob': '13',
  'avg': '',
  'ops': '',
  'obp': '',
  'slg': '',
  'name': 'Totals',
  'position': '',
  'note': '',
  'substitution': False,
  'battingOrder': '',
  'personId': 0},
 'awayBattingNotes': {0: 'a-Struck out for Zavala in the 8th.'},
 'homeBattingNotes': {},
 'awayPitchers': [{'namefield': 'White Sox Pitchers',
   'ip': 'IP',
   'h': 'H',
   'r': 'R',
   'er': 'ER',
   'bb': 'BB',
   'k': 'K',
   'hr': 'HR',
   'era': 'ERA',
   'p': 'P',
   's': 'S',
   'name': 'White Sox Pitchers',
   'personId': 0,
   'note': ''},
  {'namefield': 'Lynn  (L, 2-5)',
   'ip': '6.0',
   'h': '7',
   'r': '4',
   'er': '4',
   'bb': '1',
   'k': '5',
   'hr': '2',
   'p': '90',
   's': '59',
   'era': '5.88',
   'name': 'Lynn',
   'personId': 458681,
   'note': '(L, 2-5)'},
  {'namefield': 'Kelly, J',
   'ip': '1.0',
   'h': '0',
   'r': '0',
   'er': '0',
   'bb': '0',
   'k': '1',
   'hr': '0',
   'p': '10',
   's': '7',
   'era': '5.18',
   'name': 'Kelly, J',
   'personId': 523260,
   'note': ''},
  {'namefield': 'Foster',
   'ip': '1.0',
   'h': '2',
   'r': '0',
   'er': '0',
   'bb': '0',
   'k': '1',
   'hr': '0',
   'p': '13',
   's': '10',
   'era': '4.40',
   'name': 'Foster',
   'personId': 641582,
   'note': ''}],
 'homePitchers': [{'namefield': 'Royals Pitchers',
   'ip': 'IP',
   'h': 'H',
   'r': 'R',
   'er': 'ER',
   'bb': 'BB',
   'k': 'K',
   'hr': 'HR',
   'era': 'ERA',
   'p': 'P',
   's': 'S',
   'name': 'White Sox Pitchers',
   'personId': 0,
   'note': ''},
  {'namefield': 'Singer  (W, 5-4)',
   'ip': '7.1',
   'h': '5',
   'r': '1',
   'er': '1',
   'bb': '0',
   'k': '6',
   'hr': '1',
   'p': '99',
   's': '71',
   'era': '3.49',
   'name': 'Singer',
   'personId': 663903,
   'note': '(W, 5-4)'},
  {'namefield': 'Barlow, S  (H, 5)',
   'ip': '0.2',
   'h': '0',
   'r': '0',
   'er': '0',
   'bb': '0',
   'k': '1',
   'hr': '0',
   'p': '11',
   's': '8',
   'era': '2.19',
   'name': 'Barlow, S',
   'personId': 605130,
   'note': '(H, 5)'},
  {'namefield': 'Coleman  (H, 10)',
   'ip': '0.1',
   'h': '2',
   'r': '1',
   'er': '1',
   'bb': '0',
   'k': '0',
   'hr': '0',
   'p': '13',
   's': '8',
   'era': '2.98',
   'name': 'Coleman',
   'personId': 669395,
   'note': '(H, 10)'},
  {'namefield': 'Cuas  (S, 1)',
   'ip': '0.2',
   'h': '0',
   'r': '0',
   'er': '0',
   'bb': '0',
   'k': '1',
   'hr': '0',
   'p': '9',
   's': '6',
   'era': '3.09',
   'name': 'Cuas',
   'personId': 621016,
   'note': '(S, 1)'}],
 'awayPitchingTotals': {'namefield': 'Totals',
  'ip': '8.0',
  'h': '9',
  'r': '4',
  'er': '4',
  'bb': '1',
  'k': '7',
  'hr': '2',
  'p': '',
  's': '',
  'era': '',
  'name': 'Totals',
  'personId': 0,
  'note': ''},
 'homePitchingTotals': {'namefield': 'Totals',
  'ip': '9.0',
  'h': '7',
  'r': '2',
  'er': '2',
  'bb': '0',
  'k': '8',
  'hr': '1',
  'p': '',
  's': '',
  'era': '',
  'name': 'Totals',
  'personId': 0,
  'note': ''},
 'gameBoxInfo': [{'label': 'HBP',
   'value': 'Harrison, J (by Singer); Garcia, Le (by Coleman).'},
  {'label': 'Pitches-strikes',
   'value': 'Lynn 90-59; Kelly, J 10-7; Foster 13-10; Singer 99-71; Barlow, S 11-8; Coleman 13-8; Cuas 9-6.'},
  {'label': 'Groundouts-flyouts',
   'value': 'Lynn 4-5; Kelly, J 1-1; Foster 0-2; Singer 6-5; Barlow, S 0-0; Coleman 0-0; Cuas 1-0.'},
  {'label': 'Batters faced',
   'value': 'Lynn 27; Kelly, J 3; Foster 5; Singer 28; Barlow, S 2; Coleman 4; Cuas 2.'},
  {'label': 'Inherited runners-scored', 'value': 'Barlow, S 2-0; Cuas 2-0.'},
  {'label': 'Umpires',
   'value': 'HP: Jerry Meals. 1B: Clint Vondrak. 2B: Malachi Moore. 3B: Vic Carapazza. '},
  {'label': 'Weather', 'value': '84 degrees, Partly Cloudy.'},
  {'label': 'Wind', 'value': '4 mph, L To R.'},
  {'label': 'First pitch', 'value': '3:10 PM.'},
  {'label': 'T', 'value': '2:36.'},
  {'label': 'Venue', 'value': 'Kauffman Stadium.'},
  {'label': 'August 9, 2022'}]}

CodePudding user response:

You need to write

summary["awayBattingTotals"]["Totals"]

not

summary[awayBattingTotals]["Totals"]

In the first version, you are looking up the key that is the string "awayBattingTotals". In the second version, you are looking up whatever key is stored in the variable awayBattingTotals. Since that variable has not been defined, it fails.
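The difference can be seen with a small stand-in dictionary. The sketch below assumes a hand-copied fragment of the response from the question rather than a live API call, and the name summary is only reused for illustration. It also checks one further detail: in the snippet shown in the question, the value under "awayBattingTotals" is itself a dictionary keyed by stat names such as 'ab' and 'h', so a further ["Totals"] lookup on it would be expected to fail as well.

```python
# Hand-copied fragment of the response shown in the question (not a live API call).
summary = {
    "awayBattingTotals": {"namefield": "Totals", "ab": "33", "r": "2", "h": "7"},
}

# Quoted: looks up the string key "awayBattingTotals".
totals = summary["awayBattingTotals"]
print(totals["ab"])

# Unquoted: Python treats awayBattingTotals as a variable name, which is undefined here.
try:
    summary[awayBattingTotals]
    unquoted_error = None
except NameError as exc:
    unquoted_error = exc
print(type(unquoted_error).__name__, unquoted_error)

# "Totals" is a value in this fragment, not a key, so indexing by it fails too.
try:
    totals["Totals"]
    missing_key_error = None
except KeyError as exc:
    missing_key_error = exc
print(type(missing_key_error).__name__, missing_key_error)
```

So, at least for the fragment above, quoting the key and stopping at summary["awayBattingTotals"] is what returns the totals dictionary.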

CodePudding user response:

What are you trying to get as your output? It isn't clear what you are trying to do here.

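import pandas as pd
import statsapi

# boxscore_data() returns the box score as a nested dictionary
summary = statsapi.boxscore_data(662647)
result = summary["awayBattingTotals"]

print(result)
# Wrap the single dictionary in a list so it becomes one row
df = pd.DataFrame([result])
print(df)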

Output:

 namefield  ab  r  h hr  ... position note substitution battingOrder personId
0    Totals  33  2  7  1  ...                      False                     0

[1 rows x 19 columns]
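One thing the output above does not make obvious is that the counting stats arrive as strings ('33', not 33), as the snippet in the question shows. Below is a minimal sketch of converting them before building a data frame. It assumes a hand-copied subset of the totals dictionary instead of a live API call, and the helper name to_int_where_possible is hypothetical.

```python
# Hand-copied subset of summary["awayBattingTotals"] from the question.
away_totals = {"namefield": "Totals", "ab": "33", "r": "2", "h": "7",
               "hr": "1", "rbi": "2", "bb": "0", "k": "8", "lob": "13",
               "avg": "", "personId": 0}


def to_int_where_possible(record):
    """Return a copy with digit-only strings converted to int; other values unchanged."""
    return {
        key: int(value) if isinstance(value, str) and value.isdigit() else value
        for key, value in record.items()
    }


clean_totals = to_int_where_possible(away_totals)
print(clean_totals)
```

Passing [clean_totals] to pd.DataFrame in place of [result] should then give integer columns for these stats, while empty fields such as 'avg' stay as empty strings.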

Or the batters?

result_batters = summary["awayBatters"]

print(result_batters)
df = pd.DataFrame(result_batters)
print(df)

Output:

            namefield  ab  r  h  ... position   obp   slg battingOrder
0   White Sox Batters  AB  R  H  ...            OBP   SLG             
1       1 Pollock  LF   4  0  1  ...       LF  .286  .353          100
2        2 Robert  CF   4  0  1  ...       CF  .334  .453          200
3       3 Jiménez  DH   4  0  1  ...       DH  .318  .455          300
4      4 Abreu, J  1B   4  1  1  ...       1B  .378  .468          400
5        5 Vaughn  RF   4  0  1  ...       RF  .348  .464          500
6       6 Moncada  3B   3  0  0  ...       3B  .258  .311          600
7    7 Garcia, Le  SS   3  0  0  ...       SS  .240  .279          700
8   8 Harrison, J  2B   3  1  1  ...       2B  .312  .385          800
9         9 Zavala  C   2  0  1  ...        C  .309  .395          900
10       a-Sheets  PH   1  0  0  ...       PH  .285  .385          901
11         Grandal  C   1  0  0  ...        C  .288  .242          902

[12 rows x 22 columns]
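In the batters output above, row 0 is a header row ('White Sox Batters', 'AB', 'R', 'H', ...) rather than a player, and the stats are strings. A rough sketch of one way to clean that up follows. It assumes a hand-copied subset of the rows printed above rather than a live API call, and only three of the stat columns.

```python
import pandas as pd

# Hand-copied subset of the rows printed above (not a live API call).
result_batters = [
    {"namefield": "White Sox Batters", "ab": "AB", "r": "R", "h": "H"},
    {"namefield": "1 Pollock  LF", "ab": "4", "r": "0", "h": "1"},
    {"namefield": "4 Abreu, J  1B", "ab": "4", "r": "1", "h": "1"},
]

df = pd.DataFrame(result_batters)

# Row 0 repeats the column labels, so drop it and renumber the remaining rows.
players = df.iloc[1:].reset_index(drop=True)

# The counts are strings; convert the stat columns to numbers.
stat_cols = ["ab", "r", "h"]
players[stat_cols] = players[stat_cols].apply(pd.to_numeric)

print(players)
print(players["h"].sum())
```

With the real response, the same steps would start from summary["awayBatters"], and stat_cols would need to list whichever columns you want as numbers.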