i am new to python, fetching data from web url into json format and trying to convert nested json into data frame ,after that dump into excel. But i am not getting the expected output.
JSON:
"balanceSheetHistoryQuarterly": {
"balanceSheetStatements": [
{
"intangibleAssets": {
"raw": 1557545000,
"fmt": "1.56B",
"longFmt": "1,557,545,000"
},
"capitalSurplus": {
"raw": 2197934000,
"fmt": "2.2B",
"longFmt": "2,197,934,000"
}
},
{
"intangibleAssets": {
"raw": 1547187000,
"fmt": "1.55B",
"longFmt": "1,547,187,000"
},
"capitalSurplus": {
"raw": 2197934000,
"fmt": "2.2B",
"longFmt": "2,197,934,000"
}
}
],
"maxAge": 86400
}
expected output:
have tried below code:
#f = open('file.json')
data1 = json.load(open('file.json'))
print("data1",data1)
data = data1 ['balanceSheetHistoryQuarterly']
print("dftyp",data)
data1 =pd.json_normalize(data['balanceSheetStatements'])
#data1.columns = pd.MultiIndex.from_tuples(zip(['balanceSheetStatements'], data1.columns))
print("data1",data1)
data1.to_excel("output\\DATAFILE4.xlsx")
output is :
How to make json standard and convert into e.xcel
CodePudding user response:
Try this:
import numpy as np
import pandas as pd
# Turn the nested list into a dataframe
df = pd.DataFrame(data1['balanceSheetHistoryQuarterly']['balanceSheetStatements'])
# Extract the value of 'raw' as columns
df['intangibleAssets'] = [i['raw'] for i in df['intangibleAssets']]
df['capitalSurplus'] = [i['raw'] for i in df['capitalSurplus']]
# Export to CSV
df.to_csv('your_file_name.csv')
Edit: Generalize the list comprehension for more columns.
def get_raw_for_cols(df, col_names):
for name in col_names:
df[name] = [i['raw'] if isinstance(i, dict) else i for i in df[name]]
return df