Dumping data from json to csv

Time:07-12

I have 100 JSON files; each one contains a dict in the following format.

    {   "predictions": [
        {
          "label": "empty",
          "confidence": 1.0
        },
        {
          "label": "filled",
          "confidence": 9.40968867750501e-25
        },
        {
          "label": "no-checkbox",
          "confidence": 1.7350328516351668e-28
        }  
       ] 
    }

I would like to create a CSV file and dump only

    {
      "label": "empty",
      "confidence": 1.0
    }

into a prediction column, along with the JSON file name. How would I do it?

CodePudding user response:

Assuming you already know how to read the JSON file into a dictionary in Python, you could do this:

import pandas as pd
json_data =  {   "predictions": [
        {
          "label": "empty",
          "confidence": 1.0
        },
        {
          "label": "filled",
          "confidence": 9.40968867750501e-25
        },
        {
          "label": "no-checkbox",
          "confidence": 1.7350328516351668e-28
        }  
       ] 
    }

output_df = pd.DataFrame(json_data['predictions'])
print(output_df)
         label    confidence
0        empty  1.000000e+00
1       filled  9.409689e-25
2  no-checkbox  1.735033e-28
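
The frame above still holds all three predictions, while the question asks for a single entry. Here is a minimal sketch of one way to narrow it down, assuming the wanted entry is the one with the highest confidence (the question does not say how it should be chosen). `top_prediction` is a hypothetical helper name, not part of the answer above.

```python
def top_prediction(json_data):
    """Return the prediction dict with the highest confidence."""
    # Assumption: the wanted entry is the most confident one.
    return max(json_data["predictions"], key=lambda p: p["confidence"])


json_data = {
    "predictions": [
        {"label": "empty", "confidence": 1.0},
        {"label": "filled", "confidence": 9.40968867750501e-25},
        {"label": "no-checkbox", "confidence": 1.7350328516351668e-28},
    ]
}

best = top_prediction(json_data)
print(best)  # {'label': 'empty', 'confidence': 1.0}
```

Passing `[best]` to `pd.DataFrame` instead of the full list should then give a one-row frame.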

CodePudding user response:

If I understand correctly, you want only the first item of the predictions list from each of several files in some directory, with each one written as a row of a CSV file. You can do something like:

import csv
import json
from os import listdir
from os.path import isfile, join

path = 'path/to/dir'
result = []
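for file_name in listdir(path):
    file_path = join(path, file_name)
    # Skip subdirectories and anything that is not a JSON file
    if not isfile(file_path) or not file_name.endswith('.json'):
        continue
    with open(file_path, 'r') as f:
        data = json.load(f)
    # Keep only the first entry of the predictions list
    first = data['predictions'][0]
    result.append([first['label'], first['confidence']])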

with open('path/to/result.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['label', 'confidence'])  # Comment out this line if you don't want a header row
    writer.writerows(result)

Replace 'path/to/dir' with the path of the directory containing the JSON files, and 'path/to/result.csv' with the path of the resulting CSV file.
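
The question also asks for the JSON file name next to a single prediction column, which the snippet above does not write. A rough sketch of one way to do that follows. The function name `dump_first_predictions`, the column names `file_name` and `prediction`, and the choice to store the dict as a JSON string are all assumptions, not something the question specifies.

```python
import csv
import json
from pathlib import Path


def dump_first_predictions(json_dir, csv_path):
    """Write one CSV row per JSON file: file name plus its first prediction."""
    rows = []
    # sorted() keeps the row order stable between runs
    for json_file in sorted(Path(json_dir).glob("*.json")):
        with open(json_file, "r") as f:
            data = json.load(f)
        first = data["predictions"][0]
        # Assumption: the whole dict goes into one column as a JSON string
        rows.append([json_file.name, json.dumps(first)])

    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file_name", "prediction"])
        writer.writerows(rows)
    return rows


# Example call (paths are placeholders):
# dump_first_predictions("path/to/dir", "path/to/result.csv")
```

The `csv` module quotes the prediction field automatically, so the commas and double quotes inside the JSON string should not break the row when the file is read back.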
