How to convert json to csv python with proper headers

Time:11-24

I am trying to convert JSON data to CSV. I get the values, but each record (one JSON block) ends up on a single line in the result. I am new to Python, so any help is appreciated. This is the code I have tried:

import pandas as pd

with open(r'C:\Users\anath\hard.json', encoding='utf-8') as inputfile:
    df = pd.read_json(inputfile)

    df.to_csv(r'C:\Users\anath\csvfile.csv', encoding='utf-8', index=True)

Sample JSON from the source file (short snippet):

{
  "issues": [
    {
      "issueId": 110052,
      "revision": 84,
      "definitionId": "DNS1012",
      "subject": "urn:h:domain:fitestdea.com",
      "subjectDomain": "fitestdea.com",
      "title": "Nameserver name doesn\u0027t resolve to an IPv6 address",
      "category": "DNS",
      "severity": "low",
      "cause": "urn:h:domain:ns1.gname.net",
      "causeDomain": "ns1.gname.net",
      "open": true,
      "status": "active",
      "auto": true,
      "autoOpen": true,
      "createdOn": "2022-09-01T02:29:09.681451Z",
      "lastUpdated": "2022-11-23T02:26:28.785601Z",
      "lastChecked": "2022-11-23T02:26:28.785601Z",
      "lastConfirmed": "2022-11-23T02:26:28.785601Z",
      "details": "{}"
    },
    {
      "issueId": 77881,
      "revision": 106,
      "definitionId": "DNS2001",
      "subject": "urn:h:domain:origin-mx.stagetest.test.com.test.com",
      "subjectDomain": "origin-mx.stagetest.test.com.test.com",
      "title": "Dangling domain alias (CNAME)",
      "category": "DNS",
      "severity": "high",
      "cause": "urn:h:domain:origin-www.stagetest.test.com.test.com",
      "causeDomain": "origin-www.stagetest.test.com.test.com",
      "open": true,
      "status": "active",
      "auto": true,
      "autoOpen": true,
      "createdOn": "2022-08-10T09:34:36.929071Z",
      "lastUpdated": "2022-11-23T09:33:32.553663Z",
      "lastChecked": "2022-11-23T09:33:32.553663Z",
      "lastConfirmed": "2022-11-23T09:33:32.553663Z",
      "details": "{\"@type\": \"hardenize/com.hardenize.schemas.dns.DanglingProblem\", \"rrType\": \"CNAME\", \"rrDomain\": \"origin-mx.stagetest.test.com.test.com\", \"causeDomain\": \"origin-www.stagetest.test.com.test.com\", \"danglingType\": \"nxdomain\", \"rrEffectiveDomain\": \"origin-mx.stagetest.test.com.test.com\"}"
    }
  ]
}

In the output I am getting, the entire record lands in one cell. I am looking for a way to get the field names in the header and the values in separate columns. Is there also a way to export only specific fields, such as title, severity, or issueId, rather than everything?

CodePudding user response:

Try:

import json
import pandas as pd

with open("your_file.json", "r") as f_in:
    data = json.load(f_in)

df = pd.DataFrame(data["issues"])
print(df[["title", "severity", "issueId"]])

Prints:

                                                title severity  issueId
0  Nameserver name doesn't resolve to an IPv6 address      low   110052
1                       Dangling domain alias (CNAME)     high    77881

To save as CSV you can do:

df[["title", "severity", "issueId"]].to_csv('data.csv', index=False)
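As a rough illustration of why the original attempt put a whole record in one cell, here is a minimal sketch. The inline sample is a trimmed, hypothetical stand-in for the file (only three fields kept), and the CSV is written to an in-memory buffer rather than to data.csv so the result can be inspected directly.

```python
import io
import json

import pandas as pd

# Trimmed, hypothetical stand-in for the contents of the JSON file.
sample = {
    "issues": [
        {"issueId": 110052, "title": "Nameserver name doesn't resolve to an IPv6 address", "severity": "low"},
        {"issueId": 77881, "title": "Dangling domain alias (CNAME)", "severity": "high"},
    ]
}
raw = json.dumps(sample)

# Reading the whole document gives one column, "issues", whose cells are dicts.
whole = pd.read_json(io.StringIO(raw))
print(list(whole.columns))

# Building the frame from the list under "issues" gives one column per field.
data = json.loads(raw)
df = pd.DataFrame(data["issues"])

# Keep only the wanted fields and write the CSV into a text buffer.
buffer = io.StringIO()
df[["title", "severity", "issueId"]].to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

The first line of csv_text is the header row, and each issue follows on its own line with one value per column.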

CodePudding user response:

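Try pd.json_normalize. It expects parsed JSON (a dict or a list of dicts) rather than an open file, so load the file with json.load first (this needs import json):

data = json.load(inputfile)
df = pd.json_normalize(data["issues"])

Use these two lines in place of the pd.read_json line you have.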
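pd.json_normalize works on parsed Python objects, and it can also flatten nested dicts into dotted column names. In the sample data the "details" field is itself a JSON-encoded string, so it has to be decoded before it can be flattened. The following is only a sketch under that assumption; the two records are a shortened, hypothetical version of the sample, and the name parsed is made up for illustration.

```python
import json

import pandas as pd

# Hypothetical, shortened records shaped like the question's data.
# Note that "details" holds a JSON string, not a nested object.
issues = [
    {"issueId": 110052, "severity": "low", "details": "{}"},
    {
        "issueId": 77881,
        "severity": "high",
        "details": '{"rrType": "CNAME", "danglingType": "nxdomain"}',
    },
]

# Decode each "details" string into a dict so json_normalize can flatten it.
parsed = [{**issue, "details": json.loads(issue["details"])} for issue in issues]

# Nested keys become columns such as "details.rrType".
flat = pd.json_normalize(parsed)
print(flat.columns.tolist())
```

Records whose "details" is empty simply get missing values in the flattened columns.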

CodePudding user response:

This finally worked for me. Thanks for the input, @Andrej Kesely. Sharing it in case it helps others.

import pandas as pd
import json

with open(r'C:\Users\anath\hard.json', encoding='utf-8') as inputfile:
    data = json.load(inputfile)


df = pd.DataFrame(data["issues"])
print(df[["title", "severity", "issueId"]])
df[["title", "severity", "issueId"]].to_csv('data.csv', index=False)
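If pandas is not available, the same selection of fields can probably be done with the standard library alone. This is an alternative sketch, not what the thread used: it relies on csv.DictWriter with extrasaction="ignore" to drop the fields that are not listed. The inline sample is again a hypothetical, trimmed stand-in for the file, and the output goes to an in-memory buffer.

```python
import csv
import io
import json

# Hypothetical, trimmed stand-in for the contents of the JSON file.
raw = json.dumps({
    "issues": [
        {"issueId": 110052, "title": "Nameserver name doesn't resolve to an IPv6 address",
         "severity": "low", "category": "DNS"},
        {"issueId": 77881, "title": "Dangling domain alias (CNAME)",
         "severity": "high", "category": "DNS"},
    ]
})
data = json.loads(raw)

# Only these fields are written; extrasaction="ignore" skips all other keys.
fields = ["title", "severity", "issueId"]
out = io.StringIO(newline="")
writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
writer.writeheader()
writer.writerows(data["issues"])
print(out.getvalue())
```

To write to a real file instead, open it with open(path, "w", newline="", encoding="utf-8") and pass that handle to csv.DictWriter.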