Convert nested list to nested dictionary without duplicates and NAN

Time:02-12

I'm pretty new to Python and would like to ask for your help or ideas on converting a nested list to a dictionary.

Here is some sample data:

    l3 = [['A', 'A-1', 'A-1-1'],
          ['A', 'A-1', 'A-1-2'],
          ['A', 'B-1', 'B-1-1'],
          ['A', 'B-2', nan]]

The expected output looks like this:

    output = {
        'prod': 'A',
        'item': [
            {
                'prod': 'A-1',
                'item': [
                    {
                        'prod': 'A-1-1',
                        'item': []
                    },
                    {
                        'prod': 'A-1-2',
                        'item': []
                    }
                ]
            },
            {
                'prod': 'B-1',
                'item': [
                    {
                        'prod': 'B-1-1',
                        'item': []
                    }
                ]
            },
            {
                'prod': 'B-2',
                'item': []
            }
        ]
    }

I've tried following the linked question, "Python: Combine several nested lists into a dictionary", but I'm having difficulty implementing it.

Is there a way to do this? Thank you in advance for helping me.

CodePudding user response:

This isn't exactly a 'nested dictionary', in the same sense as the linked question: in their version, looking up a key sequence of length k will take O(k) time on average, since they're using the mapping and lookup functionality of dictionaries. The size of the data structure doesn't asymptotically affect performance.

In your data structure, looking up a key sequence of length k will take O(k*L) time, where L is the maximum number of dictionaries on the same level, because the list at each level has to be searched linearly for a matching 'prod' value. In terms of performance, this is equivalent to using only nested lists. As you get more data, performance will worsen (albeit not dramatically).
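To make the difference concrete, here is a minimal sketch of a lookup in each structure. The function names `lookup_list_form` and `lookup_dict_form` are made up for illustration, and the two small trees are trimmed versions of the sample data:

```python
def lookup_list_form(node, path):
    """Follow `path` through the {'prod': ..., 'item': [...]} structure.

    Each step scans the sibling list linearly, so a path of length k costs
    O(k * L), where L is the largest number of siblings on one level.
    """
    if not path or node['prod'] != path[0]:
        return None
    for key in path[1:]:
        for child in node['item']:   # linear scan over the siblings
            if child['prod'] == key:
                node = child
                break
        else:
            return None              # no sibling matched this key
    return node


def lookup_dict_form(tree, path):
    """Follow `path` through plain nested dicts: one hash lookup per step."""
    node = tree
    for key in path:
        if key not in node:
            return None
        node = node[key]
    return node


list_form = {'prod': 'A', 'item': [
    {'prod': 'A-1', 'item': [{'prod': 'A-1-1', 'item': []}]},
    {'prod': 'B-2', 'item': []},
]}
dict_form = {'A': {'A-1': {'A-1-1': {}}, 'B-2': {}}}

print(lookup_list_form(list_form, ['A', 'A-1', 'A-1-1']))
print(lookup_dict_form(dict_form, ['A', 'A-1', 'A-1-1']))
```

Both calls find the same node; only the amount of work per level differs.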

If you really want to keep your current structure, this code produces output matching yours:

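import math


def is_nan(value):
    # Assumed helper (not defined in the original answer): true only for a
    # float NaN, so string entries such as 'A-1' pass through safely.
    return isinstance(value, float) and math.isnan(value)


def parse_nested_list(nested_list):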
    # First element is always A
    base_dict = {'prod': 'A', 'item': []}

    for sub_list in nested_list:
        current_dict_list = base_dict['item']

        # Skip first element
        for element in sub_list[1:]:
            # Break on NaN
            if is_nan(element):
                break

            # Search for existing dictionary with 'prod' value match
            for sub_dict in current_dict_list:
                if sub_dict.get('prod', None) == element:
                    current_dict_list = sub_dict['item']
                    break
            else:  # None were found: create new dict
                current_dict_list.append({'prod': element, 'item': []})
                current_dict_list = current_dict_list[-1]['item']

    return base_dict
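The function above hard-codes 'A' as the root. If the first column could hold other values, one possible variation is to treat every column the same way and return a list of root dictionaries. This is only a sketch: the names `build_forest` and `_is_nan` are hypothetical, and it assumes the missing values are float NaN (as with `float('nan')` or `numpy.nan`).

```python
import math


def _is_nan(value):
    # Assumption: missing entries are float NaN; strings are never NaN.
    return isinstance(value, float) and math.isnan(value)


def build_forest(rows):
    """Build a list of {'prod': ..., 'item': [...]} trees, one per distinct root."""
    roots = []
    for row in rows:
        siblings = roots
        for element in row:
            if _is_nan(element):
                break  # nothing deeper on this row
            # Reuse an existing node with this 'prod' value if there is one.
            for node in siblings:
                if node['prod'] == element:
                    break
            else:
                node = {'prod': element, 'item': []}
                siblings.append(node)
            siblings = node['item']
    return roots


nan = float('nan')
l3 = [['A', 'A-1', 'A-1-1'],
      ['A', 'A-1', 'A-1-2'],
      ['A', 'B-1', 'B-1-1'],
      ['A', 'B-2', nan]]

forest = build_forest(l3)
print(forest[0]['prod'])                        # A
print([c['prod'] for c in forest[0]['item']])   # ['A-1', 'B-1', 'B-2']
```

With the sample data there is a single root, so `forest[0]` has the same shape as the expected output in the question.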

However, if you have a choice, using a similar approach to the linked thread is probably closer to what you're looking for, and the code is simpler.

For example:

def parse_nested_list_better(nested_list):
    base_dict = {}

    for sub_list in nested_list:
        current_dict = base_dict
        for element in sub_list:
            # Break on NaN
            if is_nan(element):
                break
            if element not in current_dict:
                current_dict[element] = {}

            current_dict = current_dict[element]

    return base_dict

This will give you the following output:

output = {
    'A': {
        'A-1': {
            'A-1-1': {},
            'A-1-2': {}
        },
        'B-1': {
            'B-1-1': {}
        },
        'B-2': {}
    }
}
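If something downstream still needs the 'prod'/'item' layout from the question, the nested-dictionary form can be converted afterwards. A minimal sketch, where `to_prod_item` is a hypothetical helper name and the ordering relies on dictionaries preserving insertion order (Python 3.7+):

```python
def to_prod_item(tree):
    """Convert nested dicts into a list of {'prod': ..., 'item': [...]} nodes."""
    return [{'prod': key, 'item': to_prod_item(children)}
            for key, children in tree.items()]


nested = {'A': {'A-1': {'A-1-1': {}, 'A-1-2': {}},
                'B-1': {'B-1-1': {}},
                'B-2': {}}}

converted = to_prod_item(nested)
print(converted[0]['prod'])                        # A
print([c['prod'] for c in converted[0]['item']])   # ['A-1', 'B-1', 'B-2']
```

This way the tree can be built with fast dictionary lookups and only reshaped once at the end.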