I'm by far not a Python3 hero, but focussed on learning some new skills with it, thus any help would be appreciated. Working on a personal project that I want to throw on GitHub later on, I run into having a command outputting the following Python dictionary:
{'masscan': {'command_line': 'masscan -oX - 192.168.0.131/24 -p 22,80 --max-rate=1000', 'scanstats': {'timestr': '2022-03-26 10:00:07', 'elapsed': '12', 'uphosts': '2', 'downhosts': '0', 'totalhosts': '2'}}, 'scan': {'192.168.0.254': {'tcp': {80: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}, 22: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}}}}}
I then want to parse that to the following JSON format:
{
"data": [
{
"{#PORT}": 80,
"{#STATE}": "OPEN",
"{#ENDTIME}": "1648285195"
},
{
"{#PORT}": 22,
"{#STATE}": "Interface #2",
"{#ENDTIME}": "1648285195"
}
]
}
What would be the most efficient way to parse through it? I don't want it to end up in a file but keep it within my code preferrably. Keeping in mind that there might be more ports than just port 22 and 80. The dictionary might be a lot longer, but following the same format.
Thanks!
CodePudding user response:
this function will return exactly what you want (i suppose):
def parse_data(input):
data = []
for ip in input['scan'].keys():
for protocol in input['scan'][ip].keys():
for port in input['scan'][ip][protocol].keys():
port_data = {"{#PORT}": port, "{#STATE}": input['scan'][ip][protocol][port]['state'].upper(), "{#ENDTIME}": input['scan'][ip][protocol][port]['endtime']}
data.append(port_data)
return {'data': data}
function returns (ouput):
{
"data":[
{
"{#PORT}":80,
"{#STATE}":"OPEN",
"{#ENDTIME}":"1648285195"
},
{
"{#PORT}":22,
"{#STATE}":"OPEN",
"{#ENDTIME}":"1648285195"
}
]
}
don't know where 'Interface #2' in port '22' 'state' came from (in your desired result).
CodePudding user response:
Possible solution is the following:
log_data = {'masscan': {'command_line': 'masscan -oX - 192.168.0.131/24 -p 22,80 --max-rate=1000', 'scanstats': {'timestr': '2022-03-26 10:00:07', 'elapsed': '12', 'uphosts': '2', 'downhosts': '0', 'totalhosts': '2'}}, 'scan': {'192.168.0.254': {'tcp': {80: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}, 22: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}}}}}
result = {"data": []}
for k, v in dct['scan'].items():
for tcp, tcp_data in v.items():
for port, port_data in tcp_data.items():
data = {"{#PORT}": port, "{#STATE}": port_data['state'], "{#ENDTIME}": port_data['endtime']}
result["data"].append(data)
print(result)
Prints
{'data': [
{'{#PORT}': 80, '{#STATE}': 'open', '{#ENDTIME}': '1648285195'},
{'{#PORT}': 22, '{#STATE}': 'open', '{#ENDTIME}': '1648285195'}]}
CodePudding user response:
You could do a recursive search for the 'tcp' key and go from there. Something like this:
mydict = {'masscan': {'command_line': 'masscan -oX - 192.168.0.131/24 -p 22,80 --max-rate=1000', 'scanstats': {'timestr': '2022-03-26 10:00:07', 'elapsed': '12', 'uphosts': '2', 'downhosts': '0', 'totalhosts': '2'}},
'scan': {'192.168.0.254': {'tcp': {80: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}, 22: {'state': 'open', 'reason': 'syn-ack', 'reason_ttl': '64', 'endtime': '1648285195', 'services': []}}}}}
def findkey(d, k):
if k in d:
return d[k]
for v in d.values():
if isinstance(v, dict):
if r := findkey(v, k):
return r
rdict = {'data': []}
for k, v in findkey(mydict, 'tcp').items():
rdict['data'].append(
{'{#PORT}': k, '{#STATE}': v['state'].upper(), '{#ENDTIME}': v['endtime']})
print(rdict)
Output:
{'data': [{'{#PORT}': 80, '{#STATE}': 'OPEN', '{#ENDTIME}': '1648285195'}, {'{#PORT}': 22, '{#STATE}': 'OPEN', '{#ENDTIME}': '1648285195'}]}