How to merge columns in Python dictionaries where particular fields are the same?

I have a list of dictionaries, as you can see below:

[{
  'name': 'First Filter',
  'nut': '122222',
  'cpv': '111'
},
{
  'name': 'First Filter',
  'nut': '122222',
  'cpv': '123'
},
{
  'name': 'First Filter',
  'nut': '123-41',
  'cpv': '111'
},
{
  'name': 'First Filter',
  'nut': '123-41',
  'cpv': '123'
},
{
  'name': 'second Filter',
  'nut': '123-41',
  'cpv': '123'
}
]

I want results like this:

[{
  'name': 'First Filter',
  'nut': ['122222', '123-41'],
  'cpv': ['111', '123']
},
{
  'name': 'second Filter',
  'nut': ['123-41'],
  'cpv': ['123']
}
]

Please help me. I tried to do this with a pandas DataFrame but couldn't get it to work. I want to compare the objects based on the filter name and then combine the cpv and nut values of the objects that share the same name.

CodePudding user response：

I chose set instead of list, so duplicate values are dropped automatically. You can replace set with list and set.add with list.append, but note that list.append keeps duplicates.

from collections import defaultdict
from pprint import pprint

input_list = [
    {
      'name': 'First Filter',
      'nut': '122222',
      'cpv': '111'
    },
    {
      'name': 'Second Filter',
      'nut': '122222',
      'cpv': '123'
    },
    {
      'name': 'First Filter',
      'nut': '123-41',
      'cpv': '111'
    },
    {
      'name': 'Second Filter',
      'nut': '123-41',
      'cpv': '123'
    }
]

res = defaultdict(lambda: defaultdict(set))
for d in input_list:
    for k in [x for x in d.keys() if x != 'name']:
        res[d['name']][k].add(d[k])
pprint(res)

Output:

defaultdict(<function <lambda> at 0x7fba735681f0>,
            {'First Filter': defaultdict(<class 'set'>,
                                         {'cpv': {'111'},
                                          'nut': {'123-41', '122222'}}),
             'Second Filter': defaultdict(<class 'set'>,
                                          {'cpv': {'123'},
                                           'nut': {'123-41', '122222'}})})
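If you need the exact shape from the question (a list of dictionaries whose values are lists), the same grouping idea can be sketched roughly as below. This is only a minimal sketch: merge_by_name is a hypothetical helper name, and it assumes every dictionary has a 'name' key and that values should keep their first-seen order without duplicates.

```python
from collections import defaultdict

def merge_by_name(rows, key='name'):
    # merge_by_name is a hypothetical helper, not part of any library.
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for field, value in row.items():
            if field == key:
                continue
            # Append only unseen values: order is kept, duplicates are dropped.
            if value not in grouped[row[key]][field]:
                grouped[row[key]][field].append(value)
    # Turn the nested defaultdicts into a list of plain dictionaries.
    return [{key: name, **fields} for name, fields in grouped.items()]

# Input data taken from the question.
rows = [
    {'name': 'First Filter', 'nut': '122222', 'cpv': '111'},
    {'name': 'First Filter', 'nut': '122222', 'cpv': '123'},
    {'name': 'First Filter', 'nut': '123-41', 'cpv': '111'},
    {'name': 'First Filter', 'nut': '123-41', 'cpv': '123'},
    {'name': 'second Filter', 'nut': '123-41', 'cpv': '123'},
]

merged = merge_by_name(rows)
print(merged)
```

The membership check on a list is linear, which should be fine for small inputs like this one; for large inputs a set alongside each list would probably be a better fit.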

I used collections.defaultdict because in the standard Python dict I would have to check every time if a key was already present or not.
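For comparison, here is a rough sketch of the same loop with a standard dict. It uses dict.setdefault to do the "is the key already present?" check; the variable names rows and plain are made up for this illustration.

```python
# Same sample data as in the answer above.
rows = [
    {'name': 'First Filter', 'nut': '122222', 'cpv': '111'},
    {'name': 'Second Filter', 'nut': '122222', 'cpv': '123'},
    {'name': 'First Filter', 'nut': '123-41', 'cpv': '111'},
    {'name': 'Second Filter', 'nut': '123-41', 'cpv': '123'},
]

plain = {}
for d in rows:
    # setdefault returns the existing value, or stores and returns the default.
    group = plain.setdefault(d['name'], {})
    for k, v in d.items():
        if k != 'name':
            group.setdefault(k, set()).add(v)

print(plain)
```

The result holds the same data as the defaultdict version, just without the defaultdict wrappers in the printed output.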

You can learn more about collections.defaultdict in the Python standard library documentation.

CodePudding user response：

image_here

This is how it looks in a DataFrame. Is this what you are looking for?