I've written a function which should be able to search a nested dictionary, using DFS, to find a specific value. The recursive element seems to be working fine, however, when the base case should return True, it simply doesn't.
obj = {'a': [{'c':'d'}, {'e':'f'}],
'b': [{'g':'h'}, {'i':'j'}]}
def obj_dfs(obj, target):
if type(obj) == dict:
for key, val in obj.items():
obj_dfs(key, target)
obj_dfs(val, target)
elif type(obj) in (list, tuple):
for elem in obj:
obj_dfs(elem, target)
else:
if obj == target:
print(f"{obj} == {target}")
return True
else:
print(f"{obj} != {target}")
return False
obj_dfs(obj, 'j')
And the results. As you can see, the standard output "i==i" shows that this element was evaluated correctly but the return True statement isn't functioning as intended. I've verified that if I call obj_dfs(obj, 'j')
, that experiences the same error.
a != j
c != j
d != j
e != j
f != j
b != j
g != j
h != j
i != j
j == j
False
Why is this? And how can I fix this?
CodePudding user response:
As the comments point out, you need to return the results of the recursive calls. Since you are just looking for a True/False match, you can pass the recursive calls into any()
which will exit early with True
if there is a match. The base case can simple be whether obj == target
.
obj = {'a': [{'c':'d'}, {'e':'f'}],
'b': [{'g':'h'}, {'i':'j'}]}
def obj_dfs(obj, target):
if obj == target:
return True
if isinstance(obj, dict):
return any(obj_dfs(v, target) for v in obj.items())
elif isinstance(obj, (list, tuple)):
return any(obj_dfs(l, target) for l in obj)
return False
obj_dfs(obj, 'i'), obj_dfs(obj, 'j'), obj_dfs(obj, 'd'), obj_dfs(obj, 'x')
# (True, True, True, False)
This allows for a three simple blocks. Notice we are checking for a tuple
as well as a list
in the last isinstance
. This allows you to simply pass in the dict item()
s rather than looping over keys and values independently.
Adding a print(obj)
as the first line of the function will show the order in which you are traversing the data. For example obj_dfs(obj, 'j')
will print:
{'a': [{'c': 'd'}, {'e': 'f'}], 'b': [{'g': 'h'}, {'i': 'j'}]}
('a', [{'c': 'd'}, {'e': 'f'}])
a
[{'c': 'd'}, {'e': 'f'}]
{'c': 'd'}
('c', 'd')
c
d
{'e': 'f'}
('e', 'f')
e
f
('b', [{'g': 'h'}, {'i': 'j'}])
b
[{'g': 'h'}, {'i': 'j'}]
{'g': 'h'}
('g', 'h')
g
h
{'i': 'j'}
('i', 'j')
i
j
CodePudding user response:
I made some edits to your code
obj = {'a': [{'c':'d'}, {'e':'f'}],
'b': [{'g':'h'}, {'i':'j'}]}
def obj_dfs(obj, target):
if type(obj) == dict:
for key, val in obj.items():
if(key==target):
return val
else:
result=obj_dfs(val, target)
if result!=None: return result
elif type(obj) in (list, tuple):
for elem in obj:
result=obj_dfs(elem, target)
if result!=None: return result
else:
if obj==target: return True
print(obj_dfs(obj, 'i'))
I don't know why you would just return true instead of the value though, so I put it that if its a dictionary key it would return the value, instead it would return true, to show that it is found
CodePudding user response:
Expanding on my comment, try this, where we pass return values up the chain and always return True
if a child returned True
:
obj = {'a': [{'c':'d'}, {'e':'f'}],
'b': [{'g':'h'}, {'i':'j'}]}
obj2 = {'a': [{'c':'d'}, {'e':'f'}],
'b': [{'g':'h'}, {'g':'j'}]}
def obj_dfs(obj, target):
if type(obj) == dict:
for key, val in obj.items():
keyret = obj_dfs(key, target)
valueret = obj_dfs(val, target)
if keyret is True or valueret is True:
return True
else:
return False
elif type(obj) in (list, tuple):
rets = []
for elem in obj:
rets.append(obj_dfs(elem, target))
if True in rets:
return True
else:
return False
else:
if obj == target:
print(f"{obj} == {target}")
return True
else:
print(f"{obj} != {target}")
return False
print(obj_dfs(obj, 'i'))
print(obj_dfs(obj2, 'i'))
CodePudding user response:
Recursion is a functional heritage and so using it with functional style yields the best results. This means decoupling concerns and pushing side effects to the fringe of your program. obj_dfs
performs a depth-first traversal and mixes in search logic. And for purposes of debugging, includes a print
side effect.
Decomposition results in functions that are easier to write, test, and reuse in various parts of our program. We'll start with a generic dfs
for traversal -
def dfs(t, path = []):
if isinstance(t, dict):
for key, val in t.items():
yield from dfs(val, [*path, key])
elif isinstance(t, (list, tuple)):
for key, val in enumerate(t):
yield from dfs(val, [*path, key])
else:
yield path, t
obj = {'a': [{'c':'d'}, {'e':'f'}],
'b': [{'g':'h'}, {'i':'j'}]}
for path, val in dfs(obj):
print(path, val) # side effect decided by caller
['a', 0, 'c'] d
['a', 1, 'e'] f
['b', 0, 'g'] h
['b', 1, 'i'] j
The suggested solution and other answers here collapse the semantic difference between key and value, providing no differentiation on the particular match of target
. Writing dfs
as we did above, we can know which part of obj
matched.
keys | values |
---|---|
['a', 0, 'c'] | d |
['a', 1, 'e'] | f |
['b', 0, 'g'] | h |
['b', 1, 'i'] | j |
has_value
and has_key
are easily defined in terms of dfs
-
def has_value(t, target):
for path, val in dfs(t):
if val == target:
return True
return False
def has_key(t, target):
for path, val in dfs(t):
if target in path:
return True
return False
print(has_value(obj, "j")) # True
print(has_key(obj, "j")) # False
print(has_value(obj, "i")) # False
print(has_key(obj, "i")) # True