Can someone help with an error when converting a JSON file to a data frame, please?
I'm trying to convert a JSON text file to a data frame but get the "All arrays must be of the same length" error. I have tried wrapping 'data' in double brackets [[ ]], but it still doesn't work. The text file is at https://stackoverflowtez.filecloudonline.com/ui/core/index.html?mode=single&path=/SHARED/!2CkRNC5x55IO6kGJjEwTViZ4mGmwG/9aINFGD2QxaELHFL&shareto=#expl-tabl.
df_matchDetails = pd.DataFrame(responsematchDetails.json()['data'])
ValueError: All arrays must be of the same length
A portion of the JSON is pasted below; the whole output is in convert.txt at the link.
"success": true, "pager": {"current_page": 1, "max_page": 1, "results_per_page": 600, "total_results": 380}, "metadata": {"request_limit": "3600", "request_remaining": "3597", "request_reset_message": "Request limit is refreshed every hour."}, "data": [{"id": 1308266, "homeID": 218, "awayID": 59, "season": "2021/2022", "status": "complete", "roundID": 72035, "game_week": 1, "revised_game_week": -1, "homeGoals": ["22", "73"], "awayGoals": [], "homeGoalCount": 2, "awayGoalCount": 0, "totalGoalCount": 2, "team_a_corners": 2, "team_b_corners": 5, "totalCornerCount": 7, "team_a_offsides": 1, "team_b_offsides": 1, "team_a_yellow_cards": 0, "team_b_yellow_cards": 0, "team_a_red_cards": 0, "team_b_red_cards": 0, "team_a_shotsOnTarget": 4, "team_b_shotsOnTarget": 5, "team_a_shotsOffTarget": 5, "team_b_shotsOffTarget": 18, "team_a_shots": 9, "team_b_shots": 23, "team_a_fouls": 12, "team_b_fouls": 8, "team_a_possession": 35, "team_b_possession": 65, "refereeID": 393, "coach_a_ID": 5354, "coach_b_ID": 33237, "stadium_name": "Brentford Community Stadium (Brentford, Middlesex)", "stadium_location": "", "team_a_cards_num": 0, "team_b_cards_num": 0, "odds_ft_1": 3.9, "odds_ft_x": 3.4, "odds_ft_2": 2.05, "odds_ft_over05": 1.11, "odds_ft_over15": 1.43, "odds_ft_over25": 2.2, "odds_ft_over35": 3.75, "odds_ft_over45": 7.25, "odds_ft_under05": 8.25, "odds_ft_under15": 3, "odds_ft_under25": 1.77, "odds_ft_under35": 1.31, "odds_ft_under45": 1.12, "odds_btts_yes": 1.95, "odds_btts_no": 2, "odds_team_a_cs_yes": 4.85, "odds_team_a_cs_no": 1.22, "odds_team_b_cs_yes": 2.7, "odds_team_b_cs_no": 1.53, "odds_doublechance_1x": 1.95, "odds_doublechance_12": 1.2, "odds_doublechance_x2": 1.22, "odds_1st_half_result_1": 4.33, "odds_1st_half_result_x": 2.25, "odds_1st_half_result_2": 2.5,
Thanks
CodePudding user response:
The problem comes from the nested parts of your JSON file. You can use the json_normalize() function from pandas to solve your issue:
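import pandas as pd
import json

# Load the whole JSON document saved from the API response
with open("convert.txt", "r") as f:
    data = json.load(f)

# Build one row per record in the top-level "data" list
df = pd.json_normalize(data, record_path=["data"])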
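If you want to check the behaviour without the full file, here is a minimal sketch on a cut-down payload. The dictionary below is an assumption: it keeps only a handful of fields from the JSON in the question, and the second record (id 1308267 and its values) is made up purely for illustration. It shows that list-valued fields such as homeGoals and awayGoals end up as list objects in a single column each, so rows with different numbers of goals do not clash.

```python
import pandas as pd

# Hypothetical, trimmed-down version of the API payload from the question.
# The first record uses values from the post; the second record is invented.
payload = {
    "success": True,
    "pager": {"current_page": 1, "max_page": 1},
    "data": [
        {"id": 1308266, "homeID": 218, "awayID": 59,
         "homeGoals": ["22", "73"], "awayGoals": [], "homeGoalCount": 2},
        {"id": 1308267, "homeID": 100, "awayID": 200,
         "homeGoals": [], "awayGoals": ["45"], "homeGoalCount": 0},
    ],
}

# One row per element of the top-level "data" list.
df = pd.json_normalize(payload, record_path=["data"])

print(df.shape)
print(df[["id", "homeGoals", "awayGoals"]])
```

The same call should work on the parsed response directly, i.e. passing responsematchDetails.json() instead of payload, assuming the response has the same top-level layout as the pasted portion.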