I have a list of json
file like below:
[
{"A":{"value":1}, "B":{"value":2}},
{"A":{"value":9}, "B":{"value":3}}
]
Which I want to turn to csv
like so:
A.value,B.value
1,2
9,3
The issue is that I have nested keys which have the same name : value
but should be in a separate column. I could not find an elegant solution to this anywhere yet. I would like to be able to do something like:
data = json.load(open(file, 'r'))
with open("output.csv", "w") as f:
columns = ["A.value","B.value"]
cw = csv.DictWriter(f, columns)
cw.writeheader()
cw.writerows(data)
Which I know would work if I did not have any nested keys. I found other questions similar to this but I don't think this applies to my situation.
As an extra challenge:
I'd rather keep a generic approach. Later I might have a list of jsons like:
[
{"A":{"value":1}, "B":{"value":2}, "key":90},
{"A":{"value":9}, "B":{"value":3}, "key":91}
]
Meaning not all keys I want to add to csv
will have a nested value
key!
**output ^ **
A.value,B.value,key
1,2,90
9,3,91
CodePudding user response:
Flattening the dicts worked. Since there is a list of dicts, the flattening has to be done for each dict:
import collections
def flatten(d, parent_key='', sep='.'):
items = []
for k, v in d.items():
new_key = parent_key sep k if parent_key else k
if isinstance(v, collections.MutableMapping):
items.extend(flatten(v, new_key, sep=sep).items())
else:
items.append((new_key, v))
return dict(items)
def flatten_json_list(list):
flattened_list = []
for d in list:
flattened_list.append(flatten(d))
return flattened_list
Then the code should work as implemented in the question:
with open("out.csv","w") as f:
columns = ["A.value","B.value","key"]
cw = csv.DictWriter(f, columns)
cw.writeheader()
cw.writerows(data)
CodePudding user response:
This should do the job:
cw.writerows([{f'{key}.value':val['value'] for key, val in row.items()} for row in data])
or as regular loop:
for row in data:
cw.writerow({f'{key}.value':val['value'] for key, val in row.items()})
EDIT
import csv
data = [
{"A":{"value":1}, "B":{"value":2}, "key":90},
{"A":{"value":9}, "B":{"value":3}, "key":91}
]
def parse(row):
for key, value in row.items():
try:
yield f'{key}.value', value['value']
except TypeError:
yield key, value
with open("output.csv", "w") as f:
columns = ['A.value', 'B.value', 'key']
cw = csv.DictWriter(f, columns)
cw.writeheader()
cw.writerows(dict(parse(row)) for row in data)
The only thing I don't like is the hardcoded headers