I have a list of dictionaries in python which looks like below
list = [{'entityType': 'source', 'databaseName': 'activities', 'type': 'POSTGRES', 'children': [{'id': '3c144414-0c73-41df-9f0e-4dd7cb5af46e',
'path': ['Activities (DEV)', 'public'],
'type': 'CONTAINER',
'containerType': 'FOLDER'}]'checkTableAuthorizer': False},
{'entityType': 'source', 'databaseName': 'pd-prod-dev', 'type': 'POSTGRES', 'children':
[{'id': '75d84ead-a9fe-4949-bc21-d4deb34e1ae1',
'path': ['pg-prd (DEV-RR)', 'pghero'],
'tag': 'PWGqdrkcD08=',
'type': 'CONTAINER',
'containerType': 'FOLDER'},
{'id': 'facc2c20-7561-430f-ac35-547b5bc7a92f',
'path': ['pg-prd (DEV-RR)', 'public'],
'tag': 'gcUL0NTOc 4=',
'type': 'CONTAINER',
'containerType': 'FOLDER'}]'checkTableAuthorizer': False},
{'entityType': 'source', 'databaseName': 'pd-prod-prd', 'type': 'POSTGRES', 'children':
[{'id': '75d84ead-a9fe-4949-bc21-d4deb34e1ae1',
'path': ['pg-prd (PRD-RR)', 'pghero'],
'tag': 'PWGqdrkcD08=',
'type': 'CONTAINER',
'containerType': 'FOLDER'},
{'id': 'facc2c20-7561-430f-ac35-547b5bc7a92f',
'path': ['pg-prd (PRD-RR)', 'public'],
'tag': 'gcUL0NTOc 4=',
'type': 'CONTAINER',
'containerType': 'FOLDER'}]'checkTableAuthorizer': False}]
This is just a sample. The actual list has a list of 30 dictionaries. What I am trying to do is filter out the dictionaries where the nested children dictionary has only ' public' schema in it. So my expected output would be
public_list = [{'entityType': 'source', 'databaseName': 'activities', 'type': 'POSTGRES', 'children': [{'id': '3c144414-0c73-41df-9f0e-4dd7cb5af46e',
'path': ['Activities (DEV)', 'public'],
'type': 'CONTAINER',
'containerType': 'FOLDER'}]'checkTableAuthorizer': False},
{'entityType': 'source', 'databaseName': 'pd-prod-dev', 'type': 'POSTGRES', 'children':
[{'id': 'facc2c20-7561-430f-ac35-547b5bc7a92f',
'path': ['pg-prd (DEV-RR)', 'public'],
'tag': 'gcUL0NTOc 4=',
'type': 'CONTAINER',
'containerType': 'FOLDER'}]'checkTableAuthorizer': False},
{'entityType': 'source', 'databaseName': 'pd-prod-prd', 'type': 'POSTGRES', 'children':
[{'id': 'facc2c20-7561-430f-ac35-547b5bc7a92f',
'path': ['pg-prd (PRD-RR)', 'public'],
'tag': 'gcUL0NTOc 4=',
'type': 'CONTAINER',
'containerType': 'FOLDER'}]'checkTableAuthorizer': False}]
I tried accessing the nested dict children by iterating but unable to filter out what condition to use
for d in list:
for k, v in d.items():
if k == 'children':
print(v)
I would love to apply this as a function since I'll be reusing it on a pandas column of list of dicts
CodePudding user response:
You could create a function that gets the public data for children
of each entry:
def get_public_data(data):
result = []
children = data.get("children")
if children:
for row in children:
path = row.get("path")
if path and "public" in path:
result.append(row)
return result
And then create a new list of entries where you just replace the children
key:
public_list = []
for x in entities:
public_data = get_public_data(x)
if public_data:
public_list.append({**x, "children": public_data})
Combine these two and you'll get the function you need.
CodePudding user response:
IIUC you want to collect the entries were all items have a public schema?
Assuming your 'children' keys are always valid and a tuple of 2 elements, you can use a simple comprehension:
[d for d in lst
if all(e['path'][1] == 'public' for e in d['children'])
]
NB. I called your input lst
as list
is a python builtin