Home > database >  Transform data into excel from deeply nested json
Transform data into excel from deeply nested json

Time:01-19

i am new to python, fetching data from web url into json format and trying to convert nested json into data frame ,after that dump into excel. But i am not getting the expected output.

JSON:

"balanceSheetHistoryQuarterly": {
  "balanceSheetStatements": [
    {
      "intangibleAssets": {
        "raw": 1557545000,
        "fmt": "1.56B",
        "longFmt": "1,557,545,000"
      },
      "capitalSurplus": {
        "raw": 2197934000,
        "fmt": "2.2B",
        "longFmt": "2,197,934,000"
      }
    },
    {
      "intangibleAssets": {
        "raw": 1547187000,
        "fmt": "1.55B",
        "longFmt": "1,547,187,000"
      },
      "capitalSurplus": {
        "raw": 2197934000,
        "fmt": "2.2B",
        "longFmt": "2,197,934,000"
      }
    }
  ],
  "maxAge": 86400
}

expected output:

enter image description here

have tried below code:

#f = open('file.json')
data1 = json.load(open('file.json'))
print("data1",data1)
data = data1 ['balanceSheetHistoryQuarterly']
print("dftyp",data)
data1 =pd.json_normalize(data['balanceSheetStatements'])
#data1.columns = pd.MultiIndex.from_tuples(zip(['balanceSheetStatements'], data1.columns))
print("data1",data1)
data1.to_excel("output\\DATAFILE4.xlsx")

output is :

enter image description here

How to make json standard and convert into e.xcel

CodePudding user response:

Try this:

import numpy as np
import pandas as pd

# Turn the nested list into a dataframe
df = pd.DataFrame(data1['balanceSheetHistoryQuarterly']['balanceSheetStatements'])

# Extract the value of 'raw' as columns
df['intangibleAssets'] = [i['raw'] for i in df['intangibleAssets']]
df['capitalSurplus'] = [i['raw'] for i in df['capitalSurplus']]

# Export to CSV
df.to_csv('your_file_name.csv')

Edit: Generalize the list comprehension for more columns.

def get_raw_for_cols(df, col_names):
   for name in col_names: 
      df[name] = [i['raw'] if isinstance(i, dict) else i for i in df[name]]
   return df
  • Related