How to filter and print particular json dictionaries in python

Time:11-15

I'm in the process of learning python. I encountered a problem with json that I can't overcome.

I have this dataset from json in python:

{
    "Sophos": {
        "detected": true,
        "result": "phishing site"
    },
    "Phishtank": {
        "detected": false,
        "result": "clean site"
    },
    "CyberCrime": {
        "detected": false,
        "result": "clean site"
    },
    "Spam404": {
        "detected": false,
        "result": "clean site"
    },
    "SecureBrain": {
        "detected": false,
        "result": "clean site"
    },
    "Hoplite Industries": {
        "detected": false,
        "result": "clean site"
    },
    "CRDF": {
        "detected": false,
        "result": "clean site"
    },
    "Rising": {
        "detected": false,
        "result": "clean site"
    },
    "Fortinet": {
        "detected": true,
        "result": "phishing site"
    },
    "alphaMountain.ai": {
        "detected": true,
        "result": "phishing site"
    },
    "Lionic": {
        "detected": false,
        "result": "clean site"
    },
    "Cyble": {
        "detected": false,
        "result": "clean site"
    }
}

I would like to filter these dictionaries so that only the keys and values where "detected" is true are printed.

For example, I would like to print only:

{
    "Sophos": {
        "detected": true,
        "result": "phishing site"
    },
    "Fortinet": {
        "detected": true,
        "result": "phishing site"
    }
}

I use the VirusTotal API v2 (https://developers.virustotal.com/v2.0/reference/domain-report). My code in Python:

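import json
import requests

# api_key, domain and url are set elsewhere (not shown in the question)
parameters = {'apikey': api_key, 'resource': domain}

response = requests.get(url, params=parameters)

python_response = json.loads(response.text)

scans = python_response["scans"]  # dictionary: engine name -> result dictionary

example = json.dumps(scans, indent=4)  # JSON-formatted string, for printing only

print(example)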

I'm looking for a simple and readable way to do it so that I understand it as best I can. I would like to print the result in Python. I searched and read various solutions for this (list comprehension or filter() with lambda), but they did not help me.

I'm still learning, thanks in advance for your understanding if it's a simple case.

Thank you in advance for your help and answers.

CodePudding user response:

You can use a dict comprehension to filter the response dictionary. Note that the example you provided looks like JSON text rather than a Python object: true is not a valid boolean in Python; it should be True.

filtered = {k: v for k, v in original_dict.items() if v.get("detected") == True}

For your example (true and false are aliased so that the JSON text can be pasted as a Python literal):

import json

true = True
false = False

data = {
    "Sophos": {
        "detected": true,
        "result": "phishing site"
    },
    "Phishtank": {
        "detected": false,
        "result": "clean site"
    },
    "CyberCrime": {
        "detected": false,
        "result": "clean site"
    },
    "Spam404": {
        "detected": false,
        "result": "clean site"
    },
    "SecureBrain": {
        "detected": false,
        "result": "clean site"
    },
    "Hoplite Industries": {
        "detected": false,
        "result": "clean site"
    },
    "CRDF": {
        "detected": false,
        "result": "clean site"
    },
    "Rising": {
        "detected": false,
        "result": "clean site"
    },
    "Fortinet": {
        "detected": true,
        "result": "phishing site"
    },
    "alphaMountain.ai": {
        "detected": true,
        "result": "phishing site"
    },
    "Lionic": {
        "detected": false,
        "result": "clean site"
    },
    "Cyble": {
        "detected": false,
        "result": "clean site"
    }
}


filtered = {k: v for k, v in data.items() if v.get("detected") == true}
print(json.dumps(filtered, indent=4))

Output:

{
    "Sophos": {
        "detected": true,
        "result": "phishing site"
    },
    "Fortinet": {
        "detected": true,
        "result": "phishing site"
    },
    "alphaMountain.ai": {
        "detected": true,
        "result": "phishing site"
    }
}
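If the data is still JSON text, as the note above suggests, the true/false aliases are not needed, because json.loads converts them to True and False. A minimal sketch, assuming a hypothetical raw string that stands in for the "scans" part of the response:

```python
import json

# Hypothetical JSON text standing in for the "scans" part of the response.
raw = """
{
    "Sophos": {"detected": true, "result": "phishing site"},
    "Phishtank": {"detected": false, "result": "clean site"}
}
"""

scans = json.loads(raw)  # JSON true/false become Python True/False

# Keep only the entries whose "detected" flag is True.
filtered = {k: v for k, v in scans.items() if v.get("detected") is True}

print(json.dumps(filtered, indent=4))
```

In the question's code this step has already happened: python_response is the result of json.loads, so scans holds real Python booleans.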

CodePudding user response:

You can loop over the dictionary and check each entry. Use scans (the parsed dictionary) rather than example, which is a JSON string and has no items() method:

for key, value in scans.items():
    if value["detected"]:
        print(key, value)

Alternatively, you can use the filter() function:

list(filter(lambda x: x["detected"], scans.values()))

Output:

[{'detected': True, 'result': 'phishing site'}, {'detected': True, 'result': 'phishing site'}, {'detected': True, 'result': 'phishing site'}]
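The filter() call above returns only the inner dictionaries, so the engine names are lost. One way to keep them, sketched here with a small sample that is assumed to be already parsed, is to filter the (name, info) pairs from items() and rebuild a dictionary:

```python
# Small sample in the same shape as the question's data (assumed already parsed).
scans = {
    "Sophos": {"detected": True, "result": "phishing site"},
    "Phishtank": {"detected": False, "result": "clean site"},
    "Fortinet": {"detected": True, "result": "phishing site"},
}

# Each item is a (name, info) tuple; item[1] is the inner dictionary.
detected = dict(filter(lambda item: item[1]["detected"], scans.items()))

for name, info in detected.items():
    print(name, info)
```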

CodePudding user response:

One approach, where jsondump is the parsed dictionary (scans in the question):

for i in jsondump:
    if jsondump[i]['detected'] == True:
        print(jsondump[i])

Looping over jsondump with a for loop yields its keys, that is, the names of the engines that hold the data:

for i in jsondump:
    print(i)

The code above prints:

Sophos  
Phishtank  
CyberCrime  
..  
..  
Fortinet  
alphaMountain.ai
Lionic  
Cyble  

Now that we have the keys, we can look up each entry with jsondump[i]. The flag is stored under 'detected', so we check jsondump[i]['detected'] to see whether it is True.
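The same key-based loop can also collect the engine names into a list. This is only a sketch: the sample data is shortened, and the entry without a "detected" key is hypothetical, added to show how .get avoids a KeyError.

```python
# Shortened sample; "Unknown engine" is a hypothetical entry with no "detected" key.
jsondump = {
    "Sophos": {"detected": True, "result": "phishing site"},
    "Phishtank": {"detected": False, "result": "clean site"},
    "Unknown engine": {"result": "unrated site"},
}

flagged = []
for i in jsondump:
    # .get returns False when "detected" is missing instead of raising KeyError
    if jsondump[i].get("detected", False):
        flagged.append(i)

print(flagged)
```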

CodePudding user response:

Here we can use a dictionary comprehension (scans is the parsed dictionary from the question):

new_dict = {key: val for key, val in scans.items() if val["detected"] == True}

A new dictionary is created that contains only the entries where detected is True.

We iterate over the entire dictionary; the value of each element (val in this case) is itself a dictionary. If its detected value is True, the whole entry is added to new_dict.
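For readers new to comprehensions, the one-liner above can be read as shorthand for an explicit loop. The sketch below uses a shortened sample (an assumption, not the full response) and builds the result both ways to show they match:

```python
# Shortened sample in the same shape as the question's data.
scans = {
    "Sophos": {"detected": True, "result": "phishing site"},
    "Phishtank": {"detected": False, "result": "clean site"},
}

# Dictionary comprehension
new_dict = {key: val for key, val in scans.items() if val["detected"]}

# Equivalent explicit loop
loop_dict = {}
for key, val in scans.items():
    if val["detected"]:
        loop_dict[key] = val

print(new_dict == loop_dict)
```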

CodePudding user response:

You could filter a Python dictionary by value using a generic function with a lambda. Check the following example:

dataset = {
    "Sophos": {
        "detected": True,
        "result": "phishing site"
    },
    "Phishtank": {
        "detected": False,
        "result": "clean site"
    },
    "CyberCrime": {
        "detected": False,
        "result": "clean site"
    },
    "Spam404": {
        "detected": False,
        "result": "clean site"
    },
    "SecureBrain": {
        "detected": False,
        "result": "clean site"
    },
    "Hoplite Industries": {
        "detected": False,
        "result": "clean site"
    },
    "CRDF": {
        "detected": False,
        "result": "clean site"
    },
    "Rising": {
        "detected": False,
        "result": "clean site"
    },
    "Fortinet": {
        "detected": True,
        "result": "phishing site"
    },
    "alphaMountain.ai": {
        "detected": True,
        "result": "phishing site"
    },
    "Lionic": {
        "detected": False,
        "result": "clean site"
    },
    "Cyble": {
        "detected": False,
        "result": "clean site"
    }
}

def filter_dict(d, f):
    ''' Filters dictionary d by function f. '''
    newDict = dict()
    # Iterate over all (k,v) pairs in dict
    for key, value in d.items():
        # Is condition satisfied?
        if f(key, value):
            newDict[key] = value
    return newDict

print(filter_dict(dataset, lambda k, v: v['detected'] == True))

Output:

{'Sophos': {'detected': True, 'result': 'phishing site'}, 'Fortinet': {'detected': True, 'result': 'phishing site'}, 'alphaMountain.ai': {'detected': True, 'result': 'phishing site'}}