I have this api link where I want to create a dataframe. so far, I have performed the following steps:
response = requests.get("https://example.com/api")
data = response.json()
df = df.DataFrame(data)
df.head()
took | total | results
3 | 985 | {'AA': '3', 'AC': '1', 'BI': [], 'CY': 'DE', '...}
3 | 985 | {'AA': '8', 'AC': '1', 'BI': [], 'CY': 'DE', '...}
3 | 985 | {'AA': '8', 'AC': '2', 'BI': ['280000'], 'CY':'NL', '...}
However, I'm interested in the content of results. I want it to be like this:
'AA' | 'AC' | 'BI' | 'CY'
'3' | '1' | null | 'DE'
'8' | '1' | null | 'DE'
'8' | '2' | '280000' | 'NL'
Any heads up on how I can proceed?
Thanks in advance.
CodePudding user response:
You might have some luck looking into json_normalize but given the df you already have, you could read the results column in as records, and then fix the BI column.
import pandas as pd
import numpy as np
df = pd.DataFrame({'took': {0: 3, 1: 3, 2: 3},
'total': {0: 985, 1: 985, 2: 985},
'results': {0: {'AA': '3', 'AC': '1', 'BI': [], 'CY': 'DE'},
1: {'AA': '8', 'AC': '1', 'BI': [], 'CY': 'DE'},
2: {'AA': '8', 'AC': '2', 'BI': ['280000'], 'CY':'NL'}}})
df = pd.DataFrame.from_records(df.results.values)
df['BI'] = df['BI'].apply(lambda x: x[0] if x else np.nan)
Output
AA AC BI CY
0 3 1 NaN DE
1 8 1 NaN DE
2 8 2 280000 NL
CodePudding user response:
I'm glad to say that after lots of research, I got to a solution.
response = requests.get("https://example.com/api")
data = response.json()
Exploring the structure of the returned JSON object by looking at
data.keys()
yielded
dict_keys(['took', 'total', 'results'])
Since I am interested in "results", I created my df on the "return" key
df = pd.DataFrame(data["results"])
df.head()
This returned my expected dataframe