Remove duplicates from nested dictionary with python

Time:12-03

I have this nested dictionary. The dictionary's name was generated.

{
        "sphere": {
            1: "False",
            2: "False",
            3: "False",
            4: "True",
            5: "True",
            6: "False",
            7: "False",
            8: "False",
            9: "False",
        },
        "cube": {
            1: "True",
            2: "True",
            3: "False",
            4: "False",
            5: "False",
            6: "True",
            7: "True",
            8: "False",
            9: "False",
        },
        "torus": {
            1: "True",
            2: "True",
            3: "True",
            4: "False",
            5: "False",
            6: "False",
            7: "False",
            8: "True",
            9: "True",
        },
    }

I want to delete the duplicated values in each inner dictionary, but within every run of consecutive equal values keep the first and the last one, and get something like this as the result:

{
    "sphere": {
        1: "False",
        3: "False",
        4: "True",
        5: "True",
        6: "False",
        9: "False",
    },
    "cube": {
        1: "True",
        2: "True",
        3: "False",
        5: "False",
        6: "True",
        7: "True",
        8: "False",
        9: "False",
    },
    "torus": {
        1: "True",
        3: "True",
        4: "False",
        7: "False",
        8: "True",
        9: "True",
    },
}

Any help will be greatly appreciated. Thanks

CodePudding user response:

The problem boils down to reducing the number of duplicates in each inner dictionary. First, iterating over the outer dictionary is quite easy:

new_dict = dict()
for key, value in outermost_dictionary.items():
    new_dict[key] = dedup(value)

Remember that this iterates over the key-value pairs in insertion order (guaranteed since Python 3.7). Now, what does the function dedup have to do?

  • create a new dictionary, let's say "tcid"
  • iterate over the key-value pairs of the passed dictionary
  • if the value has never been "seen", add it to "tcid"
  • if the value has already been "seen" once, add it to "tcid"
  • if the value has already been "seen" twice or more, skip it
  • return "tcid"
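
Taken literally, counting how often a value has been "seen" keeps its first two occurrences, which differs from the expected output in the question. A minimal sketch of dedup that instead keeps the first and last entry of each run of consecutive equal values might look like this (the neighbour comparison is an assumption about the intended behaviour, not something stated in the steps above):

```python
def dedup(inner):
    """Keep only the first and last entry of each run of consecutive equal values."""
    items = list(inner.items())  # key-value pairs in insertion order
    tcid = {}
    for i, (key, value) in enumerate(items):
        # An entry starts a run if it is the first one or differs from its predecessor.
        starts_run = i == 0 or items[i - 1][1] != value
        # An entry ends a run if it is the last one or differs from its successor.
        ends_run = i == len(items) - 1 or items[i + 1][1] != value
        if starts_run or ends_run:
            tcid[key] = value
    return tcid


# Sample taken from the "sphere" entry in the question.
sphere = {1: "False", 2: "False", 3: "False", 4: "True", 5: "True",
          6: "False", 7: "False", 8: "False", 9: "False"}
print(dedup(sphere))
# {1: 'False', 3: 'False', 4: 'True', 5: 'True', 6: 'False', 9: 'False'}
```

This version returns a new dictionary and leaves the input untouched, so it fits the outer loop shown earlier.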

CodePudding user response:

Here's a step-by-step approach:

import json

D = {
        "sphere": {
            1: "False",
            2: "False",
            3: "False",
            4: "True",
            5: "True",
            6: "False",
            7: "False",
            8: "False",
            9: "False"
        },
        "cube": {
            1: "True",
            2: "True",
            3: "False",
            4: "False",
            5: "False",
            6: "True",
            7: "True",
            8: "False",
            9: "False"
        },
        "torus": {
            1: "True",
            2: "True",
            3: "True",
            4: "False",
            5: "False",
            6: "False",
            7: "False",
            8: "True",
            9: "True"
        },
    }

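def clean(v, dk):
    # Delete every collected key except the last one, so that the
    # final entry of a run of equal values is kept.
    for k in dk[:-1]:
        del v[k]

def process(v):
    p = None  # value of the current run
    dk = []   # keys of the current run, excluding its first entry
    # Iterate over a copy of the keys because v is modified in place.
    for k in list(v.keys()):
        if v[k] == p:
            dk.append(k)
        else:
            # A new run starts: trim the previous one, then reset.
            clean(v, dk)
            p = v[k]
            dk = []
    # Trim the final run.
    clean(v, dk)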


for v in D.values():
    process(v)

print(json.dumps(D, indent=2))
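
One thing to keep in mind about the last line: JSON object keys are always strings, so json.dumps prints the integer keys as "1", "2", and so on. If the integer keys should stay visible as integers, pprint is one alternative. A small sketch (the `sample` dictionary is just an illustrative stand-in for the processed D, and `sort_dicts=False` requires Python 3.8 or later):

```python
import json
from pprint import pprint

# Illustrative stand-in for part of the processed dictionary.
sample = {"sphere": {1: "False", 3: "False"}}

as_json = json.dumps(sample, indent=2)
print(as_json)  # the inner keys are written as "1" and "3"

# Loading the JSON text back gives string keys, not integers.
round_trip = json.loads(as_json)

# pprint shows the dictionary as Python sees it, with integer keys.
pprint(sample, sort_dicts=False)
```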