Home > Enterprise >  Type error : list indices must be integers or slices, not dict
Type error : list indices must be integers or slices, not dict

Time:04-09

I am trying to read json data and writing to csv file but not working. How to approach for it to get my expected output

Code :

import json
import csv 
import ast


# Using json loads to load data from varaible which holds json data 
jdata = json.loads(data)

# Fetching all headers from json and converting into list   
jkey = list(jdata[0].keys())  

# get length 
glen_count = len(data)


# Looping through the keys to write data to csv file  
for r in range(glen_count):
  for hd in jkey:
    with open("/content/sample_data/abc.csv",'w') as wr:
      csv_wr = csv.DictWriter(wr,fieldnames=jkey)
      csv_wr.writeheader()
      csv_wr.writerows(jdata[jdata[r][hd]])  # error : Type error : list indices must be integers or slices, not dict

My variable data contain json

data = '''
[

 {
  "_id": {
    "$oid": "5fc7897969b8ab14e1cad8f4"
  },
  "m": "24:6F:28:55:7C:55",
  "t": "o",
  "pId": {
    "$oid": "5fc7894ff7fa2b262ecb6c46"
  },
  "pVal": {
    "title": "Switch",
    "val": 1,
    "status": null,
    "portType": "AC",
    "portIcon": "AC",
    "_id": "5fc7894ff7fa2b262ecb6c46",
    "no": "0",
    "cat": "s",
    "device": {
      "ports": [
        "5fc7894ff7fa2b262ecb6c46"
      ],
      "_id": "5fc7894ff7fa2b262ecb6c45",
      "macId": "24:6F:28:55:7C:55",
      "deviceId": "LTAC21220050009",
      "dType": "5ee38d3a4afdf74cf8b07dfb",
      "createdAt": "2020-12-02T12:32:15.235Z",
      "updatedAt": "2020-12-02T12:32:36.773Z",
      "__v": 0,
      "assignedTo": "5fc62223f7fa2b262ecb6c2d"
    },
    "mac": "24:6F:28:55:7C:55",
    "__v": 0,
    "createdAt": "2020-12-02T12:32:15.237Z",
    "updatedAt": "2020-12-02T12:32:57.776Z"
  },
  "dId": {
    "$oid": "5fc7894ff7fa2b262ecb6c45"
  },
  "createdAt": {
    "$date": "2020-12-02T12:32:57.781Z"
  },
  "updatedAt": {
    "$date": "2020-12-02T12:32:57.781Z"
  },
  "__v": 0
},{
  "_id": {
    "$oid": "5fc7897b69b8ab14e1cad8f5"
  },
  "m": "24:6F:28:55:7C:55",
  "t": "o",
  "pId": {
    "$oid": "5fc7894ff7fa2b262ecb6c46"
  },
  "pVal": {
    "title": "Switch",
    "val": 0,
    "status": null,
    "portType": "AC",
    "portIcon": "AC",
    "_id": "5fc7894ff7fa2b262ecb6c46",
    "no": "0",
    "cat": "s",
    "device": {
      "ports": [
        "5fc7894ff7fa2b262ecb6c46"
      ],
      "_id": "5fc7894ff7fa2b262ecb6c45",
      "macId": "24:6F:28:55:7C:55",
      "deviceId": "LTAC21220050009",
      "dType": "5ee38d3a4afdf74cf8b07dfb",
      "createdAt": "2020-12-02T12:32:15.235Z",
      "updatedAt": "2020-12-02T12:32:36.773Z",
      "__v": 0,
      "assignedTo": "5fc62223f7fa2b262ecb6c2d"
    },
    "mac": "24:6F:28:55:7C:55",
    "__v": 0,
    "createdAt": "2020-12-02T12:32:15.237Z",
    "updatedAt": "2020-12-02T12:32:59.951Z"
  },
  "dId": {
    "$oid": "5fc7894ff7fa2b262ecb6c45"
  },
  "createdAt": {
    "$date": "2020-12-02T12:32:59.955Z"
  },
  "updatedAt": {
    "$date": "2020-12-02T12:32:59.955Z"
  },
  "__v": 0
}

]

Expected output is need to write to csv file : column : id , val , createdAt , updatedAt

id , val , createdAt , updatedAt
5fc7897969b8ab14e1cad8f4 , 0 , 2020-12-02T12:32:57.781Z , 2020-12-02T12:32:57.781Z
5fc7897b69b8ab14e1cad8f5 , 0 , 2020-12-02T12:32:59.955Z , 2020-12-02T12:32:59.955Z

CodePudding user response:

Okay, a few things:

  1. jkey = list(jdata[0].keys()) will give you jkeys = ['_id', 'm', 't', 'pId', 'pVal', 'dId', 'createdAt', 'updatedAt', '__v'], which is more than your desired keys. If your goal is to select only id , val , createdAt , updatedAt, jkeys must reflect that as jkeys = ['id' , 'val' , 'createdAt' , 'updatedAt']. It's the value you're passing to the fieldnames, after all. Those are the header names, and it identifies the order in which the values will be written.
  2. glen_count = len(data) is likely a typo and you meant to write jdata, data is the string containing the json data, so you would be counting the length of this string.
  3. The error you're going through is because the return of the expression jdata[jdata[r][hd]] is {'$oid': '5fc7897969b8ab14e1cad8f4'}, a dict. You are using this one as a index for the list jdata.
  4. About DictWriter and writerows: You pass to its constructor specific fieldnames, in your case, it's jkeys = ['id' , 'val' , 'createdAt' , 'updatedAt']. This is your desired header. You must then either use a list of dictionaries with writerows or a single dictionary with writerow. Writerows will have you write all instances at once, and writerow is a single instance at a time. See CSV File Reading and Writing as a reference.
  5. Careful with looping through with open("/content/sample_data/abc.csv",'w') as wr:, you'll be rewriting your own file! See Reading and Writing Files to learn about the writing modes!

I will post the following as a solution, using writerows to separate the logic of writing to a file and the logic of setting up the rows. Hopefully those are the headers desired. If not, it's easy to change according to your own needs.

import json
import csv 
import ast

jdata = json.loads(data)

# Only desired headers 
jkeys = ['id' , 'val' , 'createdAt' , 'updatedAt']

# setting up rows
list_rows = []
for el in jdata:
  dict_jdata = {}
  dict_jdata['id'] = el['_id']['$oid']
  dict_jdata['val'] = str(el['pVal']['val'])
  dict_jdata['createdAt'] = el['createdAt']['$date']
  dict_jdata['updatedAt'] = el['updatedAt']["$date"]
  list_rows.append(dict_jdata)

# Writing to a file  
with open("/content/sample_data/abc2.csv",'w') as wr:
  csv_wr = csv.DictWriter(wr,fieldnames=jkeys)
  csv_wr.writeheader()
  csv_wr.writerows(list_rows)

Hopefully this helped :)

  • Related