I have this dictionary that I am trying to iterate through recursively. When I hit a matching node, I want to return that node, which is a list. Currently my code keeps returning an empty list. I have stepped through the code and I can see my check condition being hit, but the recursion still returns an empty value. What am I doing wrong here? Thanks.
dictionary data:
{
    "apiVersion": "v1",
    "kind": "Deployment",
    "metadata": {
        "name": "cluster",
        "namespace": "namespace",
    },
    "spec": {
        "template": {
            "metadata": {
                "labels": {
                    "app": "flink",
                    "cluster": "repo_name-cluster",
                    "component": "jobmanager",
                    "track": "prod",
                }
            },
            "spec": {
                "containers": [
                    {
                        "name": "jobmanager",
                        "image": "IMAGE_TAG_",
                        "imagePullPolicy": "Always",
                        "args": ["jobmanager"],
                        "resources": {
                            "requests": {"cpu": "100.0", "memory": "100Gi"},
                            "limits": {"cpu": "100.0", "memory": "100Gi"},
                        },
                        "env": [
                            {
                                "name": "ADDRESS",
                                "value": "jobmanager-prod",
                            },
                            {"name": "HADOOP_USER_NAME", "value": "yarn"},
                            {"name": "JOB_MANAGER_MEMORY", "value": "1000m"},
                            {"name": "HADOOP_CONF_DIR", "value": "/etc/hadoop/conf"},
                            {
                                "name": "TRACK",
                                "valueFrom": {
                                    "fieldRef": {
                                        "fieldPath": "metadata.labels['track']"
                                    }
                                },
                            },
                        ],
                    }
                ]
            },
        },
    },
}
code:
def iterdict(data, match):
    output = []
    if not isinstance(data, str):
        for k, v in data.items():
            print("key ", k)
            if isinstance(v, dict):
                iterdict(v, match)
            elif isinstance(v, list):
                if k.lower() == match.lower():
                    # print(v)
                    output = v
                    return output
                else:
                    for i in v:
                        iterdict(i, match)
    return output

test = iterdict(data, "env")
print(test)
expected return value:
[{'name': 'JOB_MANAGER_RPC_ADDRESS', 'value': 'repo_name-cluster-jobmanager-prod'}, {'name': 'HADOOP_USER_NAME', 'value': 'yarn'}, {'name': 'JOB_MANAGER_MEMORY', 'value': '1000m'}, {'name': 'HADOOP_CONF_DIR', 'value': '/etc/hadoop/conf'}, {'name': 'TRACK', 'valueFrom': {...}}]
CodePudding user response:
When you recurse into iterdict, you simply throw away the return value. Since every value at the top level of your dictionary is either a string or a dict, the top-level call never assigns anything to output, and you end up returning an empty list.
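To make the discarded return value concrete, here is a minimal sketch on a cut-down, hypothetical dictionary. The function names find_env_discarding and find_env_propagating are made up for illustration and are not part of the original code; they only contrast the two patterns.

```python
def find_env_discarding(node):
    """Buggy pattern: the recursive result is computed, then dropped."""
    for key, value in node.items():
        if key == "env":
            return value
        if isinstance(value, dict):
            find_env_discarding(value)  # result is thrown away here
    return []


def find_env_propagating(node):
    """Same traversal, but a non-empty recursive result is passed back up."""
    for key, value in node.items():
        if key == "env":
            return value
        if isinstance(value, dict):
            found = find_env_propagating(value)
            if found:
                return found
    return []


sample = {"spec": {"env": [{"name": "TRACK"}]}}  # hypothetical cut-down data
print(find_env_discarding(sample))   # []
print(find_env_propagating(sample))  # [{'name': 'TRACK'}]
```

The inner call in the first function does find the list, but only the outermost call's own return value reaches the caller.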
You probably want to add the recursive results to output:
    output += iterdict(v, match)
and
    output += iterdict(i, match)
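Put together, the accumulate-the-results approach might look like the sketch below. The name collect_matches and the cut-down sample dictionary are assumptions for illustration; the isinstance(data, dict) guard is an addition so that strings and other scalars inside lists are skipped safely.

```python
def collect_matches(data, match):
    """Return the items of every list stored under a key equal to `match`."""
    output = []
    if not isinstance(data, dict):
        return output  # strings, numbers, etc. have nothing to search
    for key, value in data.items():
        if isinstance(value, dict):
            # Extend the fresh local list with whatever the subtree found
            output += collect_matches(value, match)
        elif isinstance(value, list):
            if key.lower() == match.lower():
                output += value
            else:
                for item in value:
                    output += collect_matches(item, match)
    return output


# Hypothetical cut-down version of the deployment dictionary.
sample = {
    "kind": "Deployment",
    "spec": {
        "containers": [
            {
                "name": "jobmanager",
                "args": ["jobmanager"],
                "env": [{"name": "HADOOP_USER_NAME", "value": "yarn"}],
            },
        ]
    },
}
print(collect_matches(sample, "env"))
# [{'name': 'HADOOP_USER_NAME', 'value': 'yarn'}]
```

Because output always starts as a new list, extending it in place never modifies the lists stored in the input dictionary.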
However, this is potentially inefficient, as you will build a lot of intermediate lists. A better strategy might be to make your function a generator; the name iterdict suggests this anyway. To do so, get rid of your output variable and the return statements, and use yield instead:
yield from iterdict(v, match)
yield from v
yield from iterdict(i, match)
and then, at the top level, you can just iterate over your results:
for value in iterdict(data, "env"):
...
or, if you really need a list, collect the generator output into a list:
test = list(iterdict(data, "env"))
This will likely be faster (no intermediate lists) and more Pythonic.
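One thing to be aware of: yield from v yields the items of the matching list one by one, so several matching lists get merged into a single stream. If you want the node itself (the whole list), a variation along these lines might work. This is only a sketch; iter_matching_lists and the sample data are hypothetical names, not from the question.

```python
def iter_matching_lists(data, match):
    """Lazily yield each list stored under a key equal to `match`."""
    if isinstance(data, dict):
        for key, value in data.items():
            if isinstance(value, list) and key.lower() == match.lower():
                yield value  # the whole list, not its items
            else:
                yield from iter_matching_lists(value, match)
    elif isinstance(data, list):
        for item in data:
            yield from iter_matching_lists(item, match)
    # Strings and other scalars fall through and yield nothing.


sample = {
    "spec": {
        "containers": [
            {"name": "a", "env": [{"name": "TRACK"}]},
            {"name": "b", "env": [{"name": "OTHER"}]},
        ]
    }
}

# next() stops the traversal at the first match; [] is the fallback.
first = next(iter_matching_lists(sample, "env"), [])
print(first)  # [{'name': 'TRACK'}]
print(list(iter_matching_lists(sample, "env")))
# [[{'name': 'TRACK'}], [{'name': 'OTHER'}]]
```

With next(), the generator does no work beyond the first match, which is where the lazy version pays off.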
CodePudding user response:
You are not adding the results of the recursive calls to output. You can either extend output with them or use the yield keyword to turn the function into a generator. Returning a list from every call creates temporary lists at each level of the recursion, which costs memory and time. That is why a generator is a good fit here.
def iterdict(data, match):
    # Only dicts can be searched; strings and other scalars yield nothing
    if not isinstance(data, dict):
        return
    for k, v in data.items():
        if isinstance(v, dict):
            yield from iterdict(v, match)
        elif isinstance(v, list):
            if k.lower() == match.lower():
                # Yield the items of the matching list one by one
                yield from v
            for i in v:
                yield from iterdict(i, match)

test = list(iterdict(data, "env"))
print(test)
CodePudding user response:
The issue with your code is that you are not using the results of the recursive calls. When you call iterdict recursively, it returns the matching list (or an empty list), but you never assign that to output. Assign it, and return as soon as it is non-empty so that a later key without a match cannot overwrite it:
def iterdict(data, match):
    output = []
    if isinstance(data, dict):
        for k, v in data.items():
            print("key ", k)
            if isinstance(v, dict):
                output = iterdict(v, match)
            elif isinstance(v, list):
                if k.lower() == match.lower():
                    # print(v)
                    output = v
                else:
                    for i in v:
                        output = iterdict(i, match)
                        if output:
                            break
            # Stop at the first match instead of continuing the loop
            if output:
                return output
    return output

test = iterdict(data, "env")
print(test)