I have a JSON input file that looks like this:
{"nodes": [
{"properties": {
"id": "rootNode",
"name": "Bertina Dunmore"},
"nodes": [
{"properties": {
"id": 1,
"name": "Gwenneth Rylett",
"parent_id": "rootNode"},
"nodes": [
{"properties": {
"id": 11,
"name": "Joell Waye",
"parent_id": 1}},
{"properties": {
"id": 12,
"name": "Stan Willcox",
"parent_id": 1}}]},
{"properties": {
"id": 2,
"name": "Delbert Dukesbury",
"parent_id": "rootNode"},
"nodes": [
{"properties": {
"id": 21,
"name": "Cecil McKeever",
"parent_id": 2}},
{"properties": {
"id": 22,
"name": "Joy Obee",
"parent_id": 2}}]}]}]}
I want to get the nested properties dictionaries into a (flat) list of dictionaries. Creating a recursive function that will read these dictionaries is easy:
    def get_node(nodes):
        for node in nodes:
            print(node['properties'])
            if 'nodes' in node.keys():
                get_node(node['nodes'])
Now, I'm struggling to append these to a single list:
    def get_node(nodes):
        prop_list = []
        for node in nodes:
            print(node['properties'])
            prop_list.append(node['properties'])
            if 'nodes' in node.keys():
                get_node(node['nodes'])
        return prop_list
This returns [{'id': 'rootNode', 'name': 'Bertina Dunmore'}], even though all properties dictionaries are printed. I suspect that this is because I'm not handling the function scope properly.
Can someone please help me get my head around this?
CodePudding user response:
You need to combine the prop_list returned by the recursive call with the prop_list in the current scope. For example,
    def get_node(nodes):
        prop_list = []
        for node in nodes:
            print(node['properties'])
            prop_list.append(node['properties'])
            if 'nodes' in node.keys():
                # Merge the child results into this call's list
                prop_list.extend(get_node(node['nodes']))
        return prop_list
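As a variation on the same idea, the merging step can be avoided by writing the traversal as a generator. This is only a sketch: the name iter_properties and the trimmed-down sample data are made up for illustration, and the code assumes every node has a 'properties' key, as in the question.

```python
def iter_properties(nodes):
    """Yield each node's 'properties' dict in depth-first, pre-order."""
    for node in nodes:
        yield node["properties"]
        # Leaf nodes have no "nodes" key, so fall back to an empty list.
        yield from iter_properties(node.get("nodes", []))


# Trimmed-down sample in the same shape as the question's input.
sample = {
    "nodes": [
        {
            "properties": {"id": "rootNode", "name": "Bertina Dunmore"},
            "nodes": [
                {"properties": {"id": 1, "name": "Gwenneth Rylett", "parent_id": "rootNode"}},
            ],
        }
    ]
}

flat = list(iter_properties(sample["nodes"]))
print(flat)
```

Calling list() on the generator gives the flat list, and each recursive level hands its items straight to the caller instead of building its own temporary list.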
CodePudding user response:
Your problem is that every time you call get_node, the list you append to is initialized again. You can avoid this by passing the list into the recursive call.
Moreover, I think it would be nice to use a dataclass to deal with this problem:
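    from dataclasses import dataclass
    from typing import Optional, Union

    @dataclass
    class Property:
        # ids are ints, except for the root node whose id is the string "rootNode"
        id: Union[int, str]
        name: str
        parent_id: Optional[Union[int, str]] = None

    def explore_json(data, properties: Optional[list] = None):
        # Create the accumulator only on the outermost call
        if properties is None:
            properties = []
        for key, val in data.items():
            if key == "nodes":
                for node in val:
                    explore_json(node, properties)
            elif key == "properties":
                properties.append(Property(**val))
        return properties

    explore_json(data)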
Output:
[Property(id='rootNode', name='Bertina Dunmore', parent_id=None),
Property(id=1, name='Gwenneth Rylett', parent_id='rootNode'),
Property(id=11, name='Joell Waye', parent_id=1),
Property(id=12, name='Stan Willcox', parent_id=1),
Property(id=2, name='Delbert Dukesbury', parent_id='rootNode'),
Property(id=21, name='Cecil McKeever', parent_id=2),
Property(id=22, name='Joy Obee', parent_id=2)]
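The reason for defaulting properties to None and creating the list inside the function, rather than writing properties=[] in the signature, is that a default value is evaluated only once, when the function is defined. The sketch below illustrates this with two hypothetical helpers (collect_shared and collect_fresh are not part of the answer above):

```python
def collect_shared(item, bucket=[]):
    # The default list is created once, at definition time, so every call
    # that omits `bucket` appends to the same list.
    bucket.append(item)
    return bucket


def collect_fresh(item, bucket=None):
    # A new list is created on each call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


# Copy the results so later calls cannot change what we recorded.
shared_first = list(collect_shared("a"))
shared_second = list(collect_shared("b"))
fresh_first = collect_fresh("a")
fresh_second = collect_fresh("b")

print(shared_first, shared_second)  # ['a'] ['a', 'b']
print(fresh_first, fresh_second)    # ['a'] ['b']
```

With the None default, calling explore_json twice on different inputs would not leak results from the first call into the second.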
CodePudding user response:
With this:
    def get_node(prop_list, nodes):
        for node in nodes:
            print(node['properties'])
            prop_list.append(node['properties'])
            if 'nodes' in node.keys():
                get_node(prop_list, node['nodes'])
you can just do:
    prop_list = []
    get_node(prop_list, <yourdictnodes>)
This fills prop_list in place with the following dictionaries, in this order:
{'id': 'rootNode', 'name': 'Bertina Dunmore'}
{'id': 1, 'name': 'Gwenneth Rylett', 'parent_id': 'rootNode'}
{'id': 11, 'name': 'Joell Waye', 'parent_id': 1}
{'id': 12, 'name': 'Stan Willcox', 'parent_id': 1}
{'id': 2, 'name': 'Delbert Dukesbury', 'parent_id': 'rootNode'}
{'id': 21, 'name': 'Cecil McKeever', 'parent_id': 2}
{'id': 22, 'name': 'Joy Obee', 'parent_id': 2}
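For completeness, here is a rough end-to-end sketch of this accumulator approach, starting from a JSON string. The shortened JSON text and the variable names raw, data and ids are assumptions made for the example; with a real file you would read it with json.load instead.

```python
import json


def get_node(prop_list, nodes):
    """Append every node's 'properties' dict to prop_list, depth-first."""
    for node in nodes:
        prop_list.append(node["properties"])
        if "nodes" in node:
            get_node(prop_list, node["nodes"])


# Shortened input in the same shape as the question's JSON.
raw = """
{"nodes": [
  {"properties": {"id": "rootNode", "name": "Bertina Dunmore"},
   "nodes": [
     {"properties": {"id": 1, "name": "Gwenneth Rylett", "parent_id": "rootNode"},
      "nodes": [
        {"properties": {"id": 11, "name": "Joell Waye", "parent_id": 1}}]},
     {"properties": {"id": 2, "name": "Delbert Dukesbury", "parent_id": "rootNode"}}]}]}
"""

data = json.loads(raw)
prop_list = []
get_node(prop_list, data["nodes"])  # start from the top-level "nodes" list

ids = [props["id"] for props in prop_list]
print(ids)  # ['rootNode', 1, 11, 2]
```

Note that the function returns nothing; the caller's list is the only place the results end up.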