How to get a list of every bottom level's path (as a list of keys) in a nested dictionary

Time:04-09

Say I have a dictionary as such:

d = {'a': {'values':val}, 'b':{'fin_b': {'values':val}}, 'c':{'ca':{'fin_ca':{'values':val}}, 'cb':{'fin_cb':{'values':val}}}, 'd':{'da':{'dda':{'fin_dda':{'values':val}}}}}

I want to build a list that holds, for each bottom-level item, the list of keys leading to that item. In other words, a list of key paths for every item in the nested dictionary, without knowing beforehand how many levels the dictionary has. One caveat: ideally it would ignore the "values" key in each bottom-level item. So the correct output for d would be:

output = [['a'], ['b', 'fin_b'], ['c', 'ca', 'fin_ca'], ['c', 'cb', 'fin_cb'], ['d', 'da', 'dda', 'fin_dda']]

but:

output = [['a', 'values'], ['b', 'fin_b', 'values'], ['c', 'ca', 'fin_ca', 'values'], ['c', 'cb', 'fin_cb', 'values'], ['d', 'da', 'dda', 'fin_dda', 'values']]

would be fine too; it's the key-retrieval algorithm I'm struggling with.

I've tried the following:

def get_paths(self):
  data = self.data
  groupings = []
  group = []
  def convert(d):
    nonlocal group
    for k in d.keys():
      if isinstance(d[k], dict) :
        group.append(k)
        yield from (x for x in convert(d[k]))
      else:
        group_ = group.copy()
        group = []
        yield group_

  for item in convert(data):
    groupings.append(item)
  return groupings

but I get:

output = [['a'], ['b', 'fin_b'], ['c', 'ca', 'fin_ca'], ['cb', 'fin_cb'], ['d', 'da', 'dda', 'fin_dda']]

due to resetting the list "group" at the last level. I've also tried:

  def get_paths(self):
    data = self.data
    groupings = []
    group = []
    def convert(d):
      for k in d.keys():
        if isinstance(d[k], dict) and 'values' not in v.keys():
          yield from ([k,x] for x in convert(d[k]))
        elif 'values'  in d.keys():
          yield k

    for item in convert(data):
      groupings.append(item)
    return groupings

but then the issue is that it returns a nested list, which I can't get around:

output = ['a', ['b', 'fin_b'], ['c', ['ca', 'fin_ca']], ['c', ['cb', 'fin_cb']], ['d', ['da', ['dda', 'fin_dda']]]]

Any advice would be greatly appreciated.

CodePudding user response:

This gives the desired output:

def get_paths(dictionary, current_path=None, paths_found=None):
    if current_path is None:
        current_path = []
    if paths_found is None:
        paths_found = []
    for key, value in dictionary.items():
        if isinstance(value, dict):
            get_paths(value, current_path + [key], paths_found)
        else:
            if current_path:
                paths_found.append(current_path)
    return paths_found

if __name__ == "__main__":
    d = {'a': {'values': "val"}, 'b': {'fin_b': {'values': "val"}},
         'c': {'ca': {'fin_ca': {'values': "val"}}, 'cb': {'fin_cb': {'values': "val"}}},
         'd': {'da': {'dda': {'fin_dda': {'values': "val"}}}}}
    print(get_paths(d))
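For very deeply nested dictionaries, where recursion depth could become a concern, the same traversal can be sketched with an explicit stack. This is only an illustrative variant, not part of the answer above: the name get_paths_iterative is made up here, and unlike the recursive version it records a path once per level even if that level holds several non-dict values.

```python
def get_paths_iterative(dictionary):
    """Collect the key path to every nested dict that directly holds a non-dict value."""
    paths_found = []
    # Each stack entry pairs a sub-dictionary with the list of keys leading to it.
    stack = [(dictionary, [])]
    while stack:
        node, path = stack.pop()
        # Record the path once if this level holds at least one leaf (non-dict) value.
        if path and any(not isinstance(v, dict) for v in node.values()):
            paths_found.append(path)
        # Push children in reverse so they are popped in insertion order.
        for key, value in reversed(list(node.items())):
            if isinstance(value, dict):
                stack.append((value, path + [key]))
    return paths_found


if __name__ == "__main__":
    d = {'a': {'values': "val"}, 'b': {'fin_b': {'values': "val"}},
         'c': {'ca': {'fin_ca': {'values': "val"}}, 'cb': {'fin_cb': {'values': "val"}}},
         'd': {'da': {'dda': {'fin_dda': {'values': "val"}}}}}
    print(get_paths_iterative(d))
    # [['a'], ['b', 'fin_b'], ['c', 'ca', 'fin_ca'], ['c', 'cb', 'fin_cb'], ['d', 'da', 'dda', 'fin_dda']]
```

Because path + [key] creates a new list for every child, no path is shared between branches, which is the same property that keeps the recursive version free of the "reset" problem described in the question.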

CodePudding user response:

You can approach this problem with a two-step solution:

  1. Retrieve the full path to each leaf, including the key of the leaf node (in this case "values").
  2. Drop the leaf key from each path.

Both steps can be written as generators.

For the first step, a recursive generator is enough:

def iter_path(dict_in, prefix=None):
    if prefix is None:
        prefix = list()
    for key, value in dict_in.items():
        if not isinstance(value, dict):
            yield prefix + [key]
        else:
            yield from iter_path(value, prefix + [key])

For the second point we can just drop the last entry of the yielded lists:

def iter_path_not_last(dict_in):
    for path in iter_path(dict_in, prefix=None):
        yield path[:-1]

Finally, collect the paths into a list:

paths = [p for p in iter_path_not_last(d)]
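Put together, the two generators can be run end to end as in the sketch below. It assumes the dictionary from the question with the placeholder string "val" standing in for the undefined val; everything else follows the two steps described above.

```python
def iter_path(dict_in, prefix=None):
    """Yield the full list of keys leading to every non-dict value."""
    if prefix is None:
        prefix = []
    for key, value in dict_in.items():
        if not isinstance(value, dict):
            yield prefix + [key]
        else:
            yield from iter_path(value, prefix + [key])


def iter_path_not_last(dict_in):
    """Yield each path without its final key (here, the "values" key)."""
    for path in iter_path(dict_in):
        yield path[:-1]


# "val" is a placeholder; the question leaves the actual value unspecified.
d = {'a': {'values': "val"}, 'b': {'fin_b': {'values': "val"}},
     'c': {'ca': {'fin_ca': {'values': "val"}}, 'cb': {'fin_cb': {'values': "val"}}},
     'd': {'da': {'dda': {'fin_dda': {'values': "val"}}}}}

paths = list(iter_path_not_last(d))
print(paths)
# [['a'], ['b', 'fin_b'], ['c', 'ca', 'fin_ca'], ['c', 'cb', 'fin_cb'], ['d', 'da', 'dda', 'fin_dda']]
```

Calling list(iter_path(d)) instead gives the second output from the question, with 'values' kept at the end of each path.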