**This is my Python code. I'm trying to convert NGINX logs to JSON.
I'm reading logs from the access.log file and using regular expressions to convert them into JSON format, and I need to upload these logs to Elasticsearch. Please also advise on that part; I'm new to both.**
import json
import re

i = 0
result = {}

with open('access.log') as f:
    lines = f.readlines()

regex = '([(\d\.)]+) - - \[(.*?)\] "(.*?)" (\d+) - "(.*?)" "(.*?)"'

for line in lines:
    r = re.match(regex,line)
    if len(r) >= 6:
        result[i] = {'IP address': r[0], 'Time Stamp': r[1], 'HTTP status': r[2], 'Return status':
                     r[3], 'Browser Info': r[4]}
        i += 1

print(result)
with open('data.json', 'w') as fp:
    json.dump(result, fp)
I'm facing the following error:
Traceback (most recent call last):
  File "/home/zain/Downloads/stack.py", line 17, in <module>
    if len(r) >= 6:
TypeError: object of type 'NoneType' has no len()
This is the log format:
127.0.0.1 - - [23/May/2022:22:44:14 -0400] "GET / HTTP/1.1" 200 3437 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
127.0.0.1 - - [23/May/2022:22:44:14 -0400] "GET /icons/openlogo-75.png HTTP/1.1" 404 125 "http://localhost/" "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
127.0.0.1 - - [23/May/2022:22:44:14 -0400] "GET /favicon.ico HTTP/1.1" 404 125 "http://localhost/" "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
The expected output is:
IP Address: 127.0.0.1 Time Stamp: 23/May/2022:22:44:14 HTTP Status: "GET / HTTP/1.1" Return Status: 200 3437 Browser Info: "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
CodePudding user response:
I took my cue from this code. I believe the following should do it:
import json
import re

i = 0
result = {}

with open('access.log') as f:
    lines = f.readlines()

# Named groups let each field be retrieved by name via r.group(...)
regex = r'(?P<ipaddress>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(?P<dateandtime>.*)\] \"(?P<httpstatus>(GET|POST) .+ HTTP\/1\.1)\" (?P<returnstatus>\d{3} \d+) (\".*\")(?P<browserinfo>.*)\"'

for line in lines:
    r = re.match(regex, line)
    # re.match returns None when a line does not fit the pattern, so skip those lines
    if r is not None:
        result[i] = {'IP address': r.group('ipaddress'), 'Time Stamp': r.group('dateandtime'),
                     'HTTP status': r.group('httpstatus'), 'Return status':
                     r.group('returnstatus'), 'Browser Info': r.group('browserinfo')}
        i += 1

print(result)
with open('data.json', 'w') as fp:
    json.dump(result, fp)
Result (print(json.dumps(result, sort_keys=False, indent=4))):
{
    "0": {
        "IP address": "127.0.0.1",
        "Time Stamp": "23/May/2022:22:44:14 -0400",
        "HTTP status": "GET / HTTP/1.1",
        "Return status": "200 3437",
        "Browser Info": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
    },
    "1": {
        "IP address": "127.0.0.1",
        "Time Stamp": "23/May/2022:22:44:14 -0400",
        "HTTP status": "GET /icons/openlogo-75.png HTTP/1.1",
        "Return status": "404 125",
        "Browser Info": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
    },
    "2": {
        "IP address": "127.0.0.1",
        "Time Stamp": "23/May/2022:22:44:14 -0400",
        "HTTP status": "GET /favicon.ico HTTP/1.1",
        "Return status": "404 125",
        "Browser Info": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
    }
}
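On the original error: re.match returns None when a line does not fit the pattern, and even a successful Match object has no len(), so the check has to be against None. As for the Elasticsearch part of the question, here is a minimal sketch of one possible next step, not a definitive implementation. It assumes you want to send the parsed lines through the bulk API, which takes newline-delimited JSON: one action line followed by one document line per entry. The helper names parse_line and to_bulk_ndjson and the index name nginx-access are my own placeholders, not anything from the question.

```python
import json
import re

# Same pattern as the regex in this answer, split across lines for readability.
LOG_PATTERN = re.compile(
    r'(?P<ipaddress>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - '
    r'\[(?P<dateandtime>.*)\] '
    r'"(?P<httpstatus>(GET|POST) .+ HTTP/1\.1)" '
    r'(?P<returnstatus>\d{3} \d+) '
    r'(".*")(?P<browserinfo>.*)"'
)


def parse_line(line):
    """Return a dict of the named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    # Test for None explicitly; calling len() on the result raises TypeError.
    if match is None:
        return None
    return match.groupdict()


def to_bulk_ndjson(docs, index_name):
    """Build a newline-delimited JSON body in the bulk format.

    index_name is a hypothetical index chosen by the caller.
    """
    lines = []
    for doc in docs:
        # Action line first, then the document itself.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


sample = [
    '127.0.0.1 - - [23/May/2022:22:44:14 -0400] "GET / HTTP/1.1" 200 3437 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"',
    'this line does not match the pattern',
]

# Lines that do not match are dropped instead of raising an error.
docs = [d for d in map(parse_line, sample) if d is not None]
payload = to_bulk_ndjson(docs, "nginx-access")
print(payload)
```

The resulting string could then be sent with an HTTP POST to your cluster's _bulk endpoint using the Content-Type application/x-ndjson; the official Python client also ships bulk helpers that wrap this step. The host, authentication and index mapping depend on your setup, so treat those as things to fill in yourself.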