I'm fetching data from an API and trying to store it in CSV. The API returns an array of objects like this:
[
{name:'Alice',age:22, address:'123 Downtown'},
{name:'Bob',age:25, contact:'1111'}
]
To parse and store, this is what I'm currently doing:
while not allFetched:
response = callApi()
x = []
for item in response['data']:
df = pd.DataFrame.from_dict([item])
x.append(df)
new = pd.concat(x, ignore_index=True)
new.to_csv("Data.csv", mode='a', index=False, header=fileExists)
This adds the header once (if not fileExists) but messes up the column because the headers mismatch. I want to create the column header if it doesn't exist and have Nan in missing values as follows:
name,age,address,contact
Alice,22,123 Downtown,Nan
Bob,25,Nan,1111
CodePudding user response:
You can try this:
import pandas as pd
data = [
{'name':'Alice','age':22, 'address':'123 Downtown'},
{'name':'Bob','age':25, 'contact':'1111'}
]
df = pd.json_normalize(data)
df = df.fillna('Nan')
print(df)
df.to_csv("Data.csv", mode='a', index=False)
Terminal Output:
name age address contact
0 Alice 22 123 Downtown Nan
1 Bob 25 Nan 1111
CSV Output:
name,age,address,contact
Alice,22,123 Downtown,Nan
Bob,25,Nan,1111
CodePudding user response:
What you are getting is a json response. You can parse json with such structure using pd.json_normalize()
method:
response = callApi()
dta = pd.json_normalize(response['data'])
Result:
| | name | age | address | contact |
|---:|:-------|------:|:-------------|----------:|
| 0 | Alice | 22 | 123 Downtown | nan |
| 1 | Bob | 25 | nan | 1111 |
Then you can save the data frame to csv:
dta.to_csv("Data.csv", mode='a', index=False)