Say I have a dictionary as such:
d = {'a': {'values':val}, 'b':{'fin_b': {'values':val}}, 'c':{'ca':{'fin_ca':{'values':val}}, 'cb':{'fin_cb':{'values':val}}}, 'd':{'da':{'dda':{'fin_dda':{'values':val}}}}}
I want to build a list that contains, for each bottom-level item, the list of keys leading to that item. In other words, a list of key paths for every item in the nested dictionary, without knowing beforehand how many levels the dictionary has. Ideally it would ignore the "values" key in each bottom-level item, so the correct output for d would be:
output = [['a'], ['b', 'fin_b'], ['c', 'ca', 'fin_ca'], ['c', 'cb', 'fin_cb'], ['d', 'da', 'dda', 'fin_dda']]
but:
output = [['a', 'values'], ['b', 'fin_b', 'values'], ['c', 'ca', 'fin_ca', 'values'], ['c', 'cb', 'fin_cb', 'values'], ['d', 'da', 'dda', 'fin_dda', 'values']]
would be fine too; it's the key-retrieval algorithm I'm struggling with.
I've tried the following:
def get_paths(self):
    data = self.data
    groupings = []
    group = []
    def convert(d):
        nonlocal group
        for k in d.keys():
            if isinstance(d[k], dict):
                group.append(k)
                yield from (x for x in convert(d[k]))
            else:
                group_ = group.copy()
                group = []
                yield group_
    for item in convert(data):
        groupings.append(item)
    return groupings
but I get:
output = [['a'], ['b', 'fin_b'], ['c', 'ca', 'fin_ca'], ['cb', 'fin_cb'], ['d', 'da', 'dda', 'fin_dda']]
due to resetting the list "group" at the last level. I've also tried:
def get_paths(self):
    data = self.data
    groupings = []
    group = []
    def convert(d):
        for k in d.keys():
            if isinstance(d[k], dict) and 'values' not in v.keys():
                yield from ([k,x] for x in convert(d[k]))
            elif 'values' in d.keys():
                yield k
    for item in convert(data):
        groupings.append(item)
    return groupings
but then the issue is that it returns a nested list, which I can't get around:
output = ['a', ['b', 'fin_b'], ['c', ['ca', 'fin_ca']], ['c', ['cb', 'fin_cb']], ['d', ['da', ['dda', 'fin_dda']]]]
Any advice would be greatly appreciated.
CodePudding user response:
This gives the desired output:
def get_paths(dictionary, current_path=None, paths_found=None):
    # Use None defaults so the lists are not shared between calls.
    if current_path is None:
        current_path = []
    if paths_found is None:
        paths_found = []
    for key, value in dictionary.items():
        if isinstance(value, dict):
            # Pass a new list down, so sibling branches never share state.
            get_paths(value, current_path + [key], paths_found)
        else:
            # Reached a leaf such as 'values': record the path to its parent.
            if current_path:
                paths_found.append(current_path)
    return paths_found

if __name__ == "__main__":
    d = {'a': {'values': "val"}, 'b': {'fin_b': {'values': "val"}},
         'c': {'ca': {'fin_ca': {'values': "val"}}, 'cb': {'fin_cb': {'values': "val"}}},
         'd': {'da': {'dda': {'fin_dda': {'values': "val"}}}}}
    print(get_paths(d))
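The reason this works where the first attempt in the question did not is that each recursive call receives its own list (`current_path + [key]`), so nothing has to be reset at the leaf. If you would rather keep a single shared list, as the question's generator does, a minimal sketch is to pop each key again after returning from its subtree instead of clearing the whole list. The name `iter_paths_backtracking` is made up for this illustration, and the sketch assumes every leaf sits inside at least one dict level, as in the example data:

```python
def iter_paths_backtracking(node, path=None):
    """Yield the key path leading to each non-dict value (hypothetical helper)."""
    if path is None:
        path = []
    for key, value in node.items():
        if isinstance(value, dict):
            path.append(key)          # descend into the subtree
            yield from iter_paths_backtracking(value, path)
            path.pop()                # backtrack: drop only this key
        else:
            yield list(path)          # copy, because `path` keeps changing


d = {'a': {'values': "val"}, 'b': {'fin_b': {'values': "val"}},
     'c': {'ca': {'fin_ca': {'values': "val"}}, 'cb': {'fin_cb': {'values': "val"}}},
     'd': {'da': {'dda': {'fin_dda': {'values': "val"}}}}}

paths = list(iter_paths_backtracking(d))
print(paths)
```

Because only the key of the subtree just finished is removed, the 'c' prefix is still in place when 'cb' is visited, which is exactly what was lost in the original attempt.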
CodePudding user response:
You can approach this problem with a two-step solution:
- retrieve the full path to each leaf, including the leaf node itself (in this case {"values": val});
- filter out the leaf node.
Both steps can be written as iterators. For the first step, a recursive generator is enough:
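def iter_path(dict_in, prefix=None):
    if prefix is None:
        prefix = list()
    for key, value in dict_in.items():
        if not isinstance(value, dict):
            # Leaf reached: yield the full path, including the leaf key.
            yield prefix + [key]
        else:
            # Recurse with a new list so branches do not share state.
            yield from iter_path(value, prefix + [key])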
For the second point we can just drop the last entry of the yielded lists:
def iter_path_not_last(dict_in):
    for path in iter_path(dict_in, prefix=None):
        # Drop the trailing leaf key ("values").
        yield path[:-1]
Finally
paths = [p for p in iter_path_not_last(d)]
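Since the question accepts either form of output (with or without the trailing "values" key), the two steps can also be folded into one generator with a switch. This is only a sketch of that idea: the function name `key_paths` and the `include_leaf_key` parameter are invented for illustration and are not part of the answer above.

```python
def key_paths(dict_in, include_leaf_key=False, prefix=()):
    """Yield one list of keys per non-dict value found in dict_in.

    Hypothetical helper: with include_leaf_key=True the leaf key
    (e.g. 'values') is kept at the end of each path.
    """
    for key, value in dict_in.items():
        if isinstance(value, dict):
            # Tuples are immutable, so each branch gets its own prefix.
            yield from key_paths(value, include_leaf_key, prefix + (key,))
        elif include_leaf_key:
            yield list(prefix + (key,))
        else:
            yield list(prefix)


d = {'a': {'values': "val"}, 'b': {'fin_b': {'values': "val"}},
     'c': {'ca': {'fin_ca': {'values': "val"}}, 'cb': {'fin_cb': {'values': "val"}}},
     'd': {'da': {'dda': {'fin_dda': {'values': "val"}}}}}

without_leaf = list(key_paths(d))
with_leaf = list(key_paths(d, include_leaf_key=True))
print(without_leaf)
print(with_leaf)
```

Note that a dict holding several non-dict values would produce its path once per value when the leaf key is dropped; the example data has exactly one "values" entry per leaf, so that case does not arise here.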