How to remove duplicate entries from a JSON file using Python?

Time:12-01

I have a JSON file that looks like this:

json_data = [
    {
        "authType": "ldap",
        "password": "",
        "permissions": [
            {
                "collections": [
                    "aks9099",
                    "aks9099",
                    "aks9098",
                    "aks9100",
                    "aks9100",
                    "aks9101",
                    "aks9102",
                    "aks9103",
                    "aks9103"
                    
                ],
                "project": "Central Project"
            }
        ],
        "role": "devSecOps",
        "username": "[email protected]"
    },
    {
        "authType": "ldap",
        "password": "",
        "permissions": [
            {
                "collections": [
                    "aks9099",
                    "aks9098",
                    "aks9098",
                    "aks9100",
                    "aks9101",
                    "aks9102",
                    "aks9102",
                    "aks9103"
                ],
                "project": "Central Project"
            }
        ],
        "role": "devSecOps",
        "username": "[email protected]"
    },
    {
        "authType": "ldap",
        "password": "",
        "permissions": [
            {
                "collections": [
                    "aks9099",
                    "aks9098",
                    "aks9100",
                    "aks9100",
                    "aks9101",
                    "aks9102",
                    "aks9102",
                    "aks9103"
                ],
                "project": "Central Project"
            }
        ],
        "role": "devSecOps",
        "username": "[email protected]"
    }
]

I would like to remove the duplicate entries from each "collections" list. I would appreciate any help with a solution. The expected result should look like this:

json_data = [
    {
        "authType": "ldap",
        "password": "",
        "permissions": [
            {
                "collections": [
                    "aks9099",
                    "aks9098",
                    "aks9100",
                    "aks9101",
                    "aks9102",
                    "aks9103"
                    
                ],
                "project": "Central Project"
            }
        ],
        "role": "devSecOps",
        "username": "[email protected]"
    },
    {
        "authType": "ldap",
        "password": "",
        "permissions": [
            {
                "collections": [
                    "aks9099",
                    "aks9098",
                    "aks9100",
                    "aks9101",
                    "aks9102",
                    "aks9103"
                ],
                "project": "Central Project"
            }
        ],
        "role": "devSecOps",
        "username": "[email protected]"
    },
    {
        "authType": "ldap",
        "password": "",
        "permissions": [
            {
                "collections": [
                    "aks9099",
                    "aks9098",
                    "aks9100",
                    "aks9101",
                    "aks9102",
                    "aks9103"
                ],
                "project": "Central Project"
            }
        ],
        "role": "devSecOps",
        "username": "[email protected]"
    }
]

CodePudding user response:

Does the following solve your problem?

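# Keep only the first occurrence of each top-level entry.
# Note: this compares whole entries; it does not touch the nested "collections" lists.
new_list = []
for entry in json_data:
    if entry not in new_list:
        new_list.append(entry)
print(new_list)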
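
The loop above removes repeated top-level entries, whereas the duplicates in the question sit inside each "collections" list. A minimal sketch of de-duplicating those nested lists while keeping the original order is shown below. The function name dedupe_collections and the shortened sample data are illustrative assumptions, not part of the original post.

```python
import copy


def dedupe_collections(entries):
    """Return a copy of entries with duplicates removed from every "collections" list."""
    result = copy.deepcopy(entries)  # leave the caller's data untouched
    for entry in result:
        for permission in entry.get("permissions", []):
            if "collections" in permission:
                # dict.fromkeys keeps the first occurrence of each item, in order
                permission["collections"] = list(dict.fromkeys(permission["collections"]))
    return result


# Shortened sample in the same shape as the question's data (hypothetical values)
json_data = [
    {
        "authType": "ldap",
        "permissions": [
            {
                "collections": ["aks9099", "aks9099", "aks9098", "aks9100", "aks9100"],
                "project": "Central Project",
            }
        ],
        "role": "devSecOps",
    }
]

cleaned = dedupe_collections(json_data)
print(cleaned[0]["permissions"][0]["collections"])
# prints ['aks9099', 'aks9098', 'aks9100']
```

This relies on dictionaries preserving insertion order (Python 3.7 and later) and on the collection items being hashable, which holds for the strings in the question.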

CodePudding user response:

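Even though the OP asked for a Python solution, this is easy to do in jq by applying the unique function with a single update-assignment (|=). Note that unique also sorts each array, so the order of the collections differs from the expected output in the question: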

$ jq '.[].permissions[].collections |= unique' json.txt 
[
  {
    "authType": "ldap",
    "password": "",
    "permissions": [
      {
        "collections": [
          "aks9098",
          "aks9099",
          "aks9100",
          "aks9101",
          "aks9102",
          "aks9103"
        ],
        "project": "Central Project"
      }
    ],
    "role": "devSecOps",
    "username": "[email protected]"
  },
  {
    "authType": "ldap",
    "password": "",
    "permissions": [
      {
        "collections": [
          "aks9098",
          "aks9099",
          "aks9100",
          "aks9101",
          "aks9102",
          "aks9103"
        ],
        "project": "Central Project"
      }
    ],
    "role": "devSecOps",
    "username": "[email protected]"
  },
  {
    "authType": "ldap",
    "password": "",
    "permissions": [
      {
        "collections": [
          "aks9098",
          "aks9099",
          "aks9100",
          "aks9101",
          "aks9102",
          "aks9103"
        ],
        "project": "Central Project"
      }
    ],
    "role": "devSecOps",
    "username": "[email protected]"
  }
]

To invoke this in Python, one could do this:

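import json
import subprocess
# Requires jq on PATH; check=True raises CalledProcessError if jq fails.
cmd = ["jq", ".[].permissions[].collections |= unique", "json.txt"]
return_obj = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
json_data = json.loads(return_obj.stdout)  # parse jq's output into Python objects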
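
If jq is not installed, roughly the same result can be produced with the standard json module alone. The sketch below is one possible equivalent, not the answerer's method: the function name dedupe_json_file and its two path parameters are assumptions made for illustration, and it assumes every collection item is a string so that Python's sorting matches the sorted output of jq's unique.

```python
import json


def dedupe_json_file(src_path, dst_path):
    """Load src_path, sort and de-duplicate each "collections" list, write to dst_path."""
    with open(src_path, encoding="utf-8") as f:
        data = json.load(f)
    for entry in data:
        for permission in entry.get("permissions", []):
            if "collections" in permission:
                # sorted(set(...)) drops duplicates and sorts, like the jq filter above
                permission["collections"] = sorted(set(permission["collections"]))
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
    return data
```

It could be called as dedupe_json_file("json.txt", "json_clean.txt"), where the output file name is again only an example.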