Home > Software engineering >  creating suitable dataframe from api
creating suitable dataframe from api

Time:11-25

I have this api link where I want to create a dataframe. so far, I have performed the following steps:

response = requests.get("https://example.com/api")

data = response.json()

df = df.DataFrame(data)

df.head()

took   | total   | results
3      |  985    | {'AA': '3', 'AC': '1', 'BI': [], 'CY': 'DE', '...}
3      |  985    | {'AA': '8', 'AC': '1', 'BI': [], 'CY': 'DE', '...}
3      |  985    | {'AA': '8', 'AC': '2', 'BI': ['280000'], 'CY':'NL', '...}

However, I'm interested in the content of results. I want it to be like this:

'AA' | 'AC' | 'BI' | 'CY'
'3'  | '1'  | null | 'DE'
'8'  | '1'  | null | 'DE'
'8'  | '2'  | '280000' | 'NL'

Any heads up on how I can proceed?

Thanks in advance.

CodePudding user response:

You might have some luck looking into json_normalize but given the df you already have, you could read the results column in as records, and then fix the BI column.

import pandas as pd
import numpy as np

df = pd.DataFrame({'took': {0: 3, 1: 3, 2: 3},
 'total': {0: 985, 1: 985, 2: 985},
 'results': {0: {'AA': '3', 'AC': '1', 'BI': [], 'CY': 'DE'},
  1: {'AA': '8', 'AC': '1', 'BI': [], 'CY': 'DE'},
  2: {'AA': '8', 'AC': '2', 'BI': ['280000'], 'CY':'NL'}}})

df = pd.DataFrame.from_records(df.results.values)
df['BI'] = df['BI'].apply(lambda x: x[0] if x else np.nan)

Output

  AA AC      BI  CY
0  3  1     NaN  DE
1  8  1     NaN  DE
2  8  2  280000  NL

CodePudding user response:

I'm glad to say that after lots of research, I got to a solution.

response = requests.get("https://example.com/api")
data = response.json()

Exploring the structure of the returned JSON object by looking at

data.keys()

yielded

dict_keys(['took', 'total', 'results'])

Since I am interested in "results", I created my df on the "return" key

df = pd.DataFrame(data["results"])
df.head()

This returned my expected dataframe

  • Related