Delete items of a dictionary using a list as condition

Time:11-09

I have a dictionary like this:

features_id = {
     id1: [a, b, c, d],
     id2: [c, d],
     id3: [a, e, f, d, g, k],
     ...
}

I also have a list of values that I want to use to create a new dictionary. Something like this:

list_of_values = [a, c]

Goal to achieve:

I want a new dictionary like this:

new_dict = {
    id1: [a, c],
    id2: [c],
    id3: [a],
    ...
}

CodePudding user response:

For such a large dataset (1M), it might make sense to use pandas and NumPy. I'm not sure about the speed in this case, but you can try the following:

import pandas as pd
import numpy as np

features_id = {
     'id1': ['a', 'b', 'c', 'd'],
     'id2': ['c', 'd'],
     'id3': ['a', 'e', 'f', 'd', 'g', 'k'],
     'id4': ['e', 'f', 'd', 'g', 'k']}

list_of_values = ['a', 'c']

y = np.array(list_of_values)

def filt(x):
    x = np.array(x)
    return x[np.isin(x,y)].tolist()


# Apply the filter to every list, then convert the Series back to a dict
out = pd.Series(features_id).map(filt).to_dict()
print(out)
# {'id1': ['a', 'c'], 'id2': ['c'], 'id3': ['a'], 'id4': []}

CodePudding user response:

I am writing this answer for future users who run into a similar problem.

As noted in the comments above, the solution to this question is:

set_of_values = set(list_of_values)
new_dict = {k: [x for x in v if x in set_of_values] for k, v in features_id.items()}

Using a set instead of a list speeds up the computation considerably, because a membership test on a set takes constant time on average instead of scanning the whole list. In my case, with 1M dictionary keys to process, it took a few seconds instead of minutes.
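As a minimal, self-contained sketch of the set-based approach, the comprehension can be wrapped in a small function. The name filter_dict_values and the 'id4' entry are assumptions added for illustration; the sample data otherwise mirrors the question.

```python
def filter_dict_values(features_id, list_of_values):
    """For every key, keep only the values that appear in list_of_values."""
    # Build the set once; each `x in allowed` check is then O(1) on average.
    allowed = set(list_of_values)
    return {
        key: [x for x in values if x in allowed]
        for key, values in features_id.items()
    }


# Sample data modeled on the question ('id4' is a hypothetical extra entry
# with no matching values).
features_id = {
    'id1': ['a', 'b', 'c', 'd'],
    'id2': ['c', 'd'],
    'id3': ['a', 'e', 'f', 'd', 'g', 'k'],
    'id4': ['e', 'f', 'd', 'g', 'k'],
}
list_of_values = ['a', 'c']

new_dict = filter_dict_values(features_id, list_of_values)
print(new_dict)
# {'id1': ['a', 'c'], 'id2': ['c'], 'id3': ['a'], 'id4': []}
```

Note that this keeps every key, mapping keys without any match to an empty list, and preserves the order of values from the original lists. It also assumes the values are hashable, as strings are.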

CodePudding user response:

For each entry of the initial dictionary, check each value in list_of_values to see whether it is contained in that entry's list. If it is, add it to the new dictionary.

The first time a key is added to the new dictionary, you have to create its list; on later matches, you append to the existing list.

new_dict = {}
for key, value in features_id.items():
    for val in list_of_values:
        if val in value:
            if key not in new_dict:
                new_dict[key] = [val]
            else:
                new_dict[key].append(val)
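The loop above might be condensed as in the following sketch, which uses dict.setdefault to handle the "create on first add, append afterwards" step. The function name filter_by_loop and the sample data are assumptions for illustration, and converting each list to a set assumes the values are hashable.

```python
def filter_by_loop(features_id, list_of_values):
    """Loop-based variant: keys without any matching value are left out."""
    new_dict = {}
    for key, values in features_id.items():
        present = set(values)  # faster lookups than scanning the list
        for val in list_of_values:
            if val in present:
                # Create the list on the first match, append on later ones.
                new_dict.setdefault(key, []).append(val)
    return new_dict


# Hypothetical sample data modeled on the question.
features_id = {
    'id1': ['a', 'b', 'c', 'd'],
    'id2': ['c', 'd'],
    'id4': ['e', 'f'],
}
result = filter_by_loop(features_id, ['c', 'a'])
print(result)
# {'id1': ['c', 'a'], 'id2': ['c']}
```

Two behaviors differ from the dictionary-comprehension approach: keys with no match (here 'id4') do not appear in the result at all, and the kept values follow the order of list_of_values rather than the order of the original lists.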