How would you convert the following dictionary to a specific JSON structure?

Time:03-26

I'm far from a Python 3 expert, but I'm focused on learning new skills with it, so any help would be appreciated. While working on a personal project that I want to put on GitHub later, I ran into a command that outputs the following Python dictionary:

{'masscan': {'command_line': 'masscan -oX - 192.168.0.131/24 -p 22,80 --max-rate=1000', 'scanstats': {'timestr': '2022-03-26 10:00:07', 'elapsed': '12', 'uphosts': '2', 'downhosts': '0', 'totalhosts': '2'}}, 'scan': {'192.168.0.254': {'tcp': {80: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}, 22: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}}}}}

I then want to parse that to the following JSON format:

{
"data": [
    {
        "{#PORT}": 80,
        "{#STATE}": "OPEN",
        "{#ENDTIME}": "1648285195"
    },
    {
        "{#PORT}": 22,
        "{#STATE}": "Interface #2",
        "{#ENDTIME}": "1648285195"
    }
]
}  

What would be the most efficient way to parse it? I don't want it to end up in a file; I'd prefer to keep it within my code. Keep in mind that there might be more ports than just 22 and 80. The dictionary might be a lot longer, but it will follow the same format.

Thanks!

CodePudding user response:

This function should return what you want (I suppose):

def parse_data(scan_result):
    # Walk host -> protocol -> port and collect one entry per port.
    # The parameter is named scan_result to avoid shadowing the built-in input().
    data = []
    for protocols in scan_result['scan'].values():
        for ports in protocols.values():
            for port, info in ports.items():
                port_data = {"{#PORT}": port, "{#STATE}": info['state'].upper(), "{#ENDTIME}": info['endtime']}
                data.append(port_data)
    return {'data': data}

The function returns (output):

{
   "data":[
      {
         "{#PORT}":80,
         "{#STATE}":"OPEN",
         "{#ENDTIME}":"1648285195"
      },
      {
         "{#PORT}":22,
         "{#STATE}":"OPEN",
         "{#ENDTIME}":"1648285195"
      }
   ]
}

I don't know where 'Interface #2' in the 'state' of port 22 came from (in your desired result).
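
Note that the function returns a Python dict, not JSON text. If an actual JSON string is needed without writing a file, a minimal sketch, assuming the standard-library json module and a trimmed-down sample of the question's data, could look like this (the names sample, scan_result and json_text are only illustrative):

```python
import json


def parse_data(scan_result):
    """Flatten masscan-style output into the {"data": [...]} layout."""
    data = []
    for protocols in scan_result["scan"].values():
        for ports in protocols.values():
            for port, info in ports.items():
                data.append({
                    "{#PORT}": port,
                    "{#STATE}": info["state"].upper(),
                    "{#ENDTIME}": info["endtime"],
                })
    return {"data": data}


# Trimmed sample in the same shape as the dictionary in the question.
sample = {"scan": {"192.168.0.254": {"tcp": {
    80: {"state": "open", "endtime": "1648285195"},
    22: {"state": "open", "endtime": "1648285195"},
}}}}

# json.dumps serializes to a string in memory; nothing is written to disk.
json_text = json.dumps(parse_data(sample), indent=4)
print(json_text)
```

The integer ports stay integers here because they are dictionary values in the result; only dictionary keys would be converted to strings by json.dumps.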

CodePudding user response:

A possible solution is the following:

log_data = {'masscan': {'command_line': 'masscan -oX - 192.168.0.131/24 -p 22,80 --max-rate=1000', 'scanstats': {'timestr': '2022-03-26 10:00:07', 'elapsed': '12', 'uphosts': '2', 'downhosts': '0', 'totalhosts': '2'}}, 'scan': {'192.168.0.254': {'tcp': {80: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}, 22: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}}}}}

result = {"data": []}

for k, v in log_data['scan'].items():
    for tcp, tcp_data in v.items():
        for port, port_data in tcp_data.items():
            data = {"{#PORT}": port, "{#STATE}": port_data['state'], "{#ENDTIME}": port_data['endtime']}
            result["data"].append(data)
            
print(result)

Prints

{'data': [
    {'{#PORT}': 80, '{#STATE}': 'open', '{#ENDTIME}': '1648285195'},
    {'{#PORT}': 22, '{#STATE}': 'open', '{#ENDTIME}': '1648285195'}]}

CodePudding user response:

You could do a recursive search for the 'tcp' key and go from there. Something like this:

mydict = {'masscan': {'command_line': 'masscan -oX - 192.168.0.131/24 -p 22,80 --max-rate=1000', 'scanstats': {'timestr': '2022-03-26 10:00:07', 'elapsed': '12', 'uphosts': '2', 'downhosts': '0', 'totalhosts': '2'}},
          'scan': {'192.168.0.254': {'tcp': {80: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}, 22: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}}}}}


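def findkey(d, k):
    # Return the value of the first key k found at any depth, or None.
    if k in d:
        return d[k]
    for v in d.values():
        if isinstance(v, dict):
            # Compare against None so an empty (falsy) match is still returned.
            if (r := findkey(v, k)) is not None:
                return r
    return None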


rdict = {'data': []}
for k, v in findkey(mydict, 'tcp').items():
    rdict['data'].append(
        {'{#PORT}': k, '{#STATE}': v['state'].upper(), '{#ENDTIME}': v['endtime']})


print(rdict)

Output:

{'data': [{'{#PORT}': 80, '{#STATE}': 'OPEN', '{#ENDTIME}': '1648285195'}, {'{#PORT}': 22, '{#STATE}': 'OPEN', '{#ENDTIME}': '1648285195'}]}
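
One caveat: findkey stops at the first 'tcp' key it finds, so with several hosts under 'scan' only the first host's ports would be reported. A possible variant, sketched here with a hypothetical find_all generator and a made-up second host (192.168.0.1) that is not in the original data, collects every match instead:

```python
def find_all(d, key):
    """Yield every value stored under `key` at any depth of nested dicts."""
    for k, v in d.items():
        if k == key:
            yield v
        elif isinstance(v, dict):
            yield from find_all(v, key)


# Hypothetical two-host sample in the same shape as the question's data.
sample = {"scan": {
    "192.168.0.254": {"tcp": {80: {"state": "open", "endtime": "1648285195"}}},
    "192.168.0.1": {"tcp": {22: {"state": "open", "endtime": "1648285195"}}},
}}

rdict = {"data": [
    {"{#PORT}": port, "{#STATE}": info["state"].upper(), "{#ENDTIME}": info["endtime"]}
    for tcp in find_all(sample, "tcp")
    for port, info in tcp.items()
]}
print(rdict)
```

With this layout the entries from different hosts end up in one flat list, so the host address is lost; whether that is acceptable depends on how the result is consumed.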