146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web services HTTP/2.0" 203 26554
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701
100.32.205.59 - ortiz8891 [21/Jun/2019:15:45:28 -0700] "PATCH /architectures HTTP/1.0" 204 6048
All I want is to convert the above data into a list of dictionaries, where each dictionary looks like the following:
example_dict = {"host":"146.204.224.152",
"user_name":"feest6811",
"time":"21/Jun/2019:15:45:24 -0700",
"request":"POST /incentivize HTTP/1.1"}
Kindly help me, I am new to this!
CodePudding user response:
You could use
^
(?P<host>\d+\S+)[-\s]+
(?P<user_name>\S+)\s+
\[(?P<time>[^][]+)\]\s+
"(?P<request>[^"]+)"
In Python this could be
import re
pattern = re.compile(r"""
    ^
    (?P<host>\d+\S+)[-\s]+
    (?P<user_name>\S+)\s+
    \[(?P<time>[^][]+)\]\s+
    "(?P<request>[^"]+)"
""", re.MULTILINE | re.VERBOSE)
data = """
146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web services HTTP/2.0" 203 26554
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701
100.32.205.59 - ortiz8891 [21/Jun/2019:15:45:28 -0700] "PATCH /architectures HTTP/1.0" 204 6048
"""
for match in pattern.finditer(data):
    dct = match.groupdict()
    print(dct)
And would yield
{'host': '146.204.224.152', 'user_name': 'feest6811', 'time': '21/Jun/2019:15:45:24 -0700', 'request': 'POST /incentivize HTTP/1.1'}
{'host': '197.109.77.178', 'user_name': 'kertzmann3129', 'time': '21/Jun/2019:15:45:25 -0700', 'request': 'DELETE /virtual/solutions/target/web services HTTP/2.0'}
{'host': '156.127.178.177', 'user_name': 'okuneva5222', 'time': '21/Jun/2019:15:45:27 -0700', 'request': 'DELETE /interactive/transparent/niches/revolutionize HTTP/1.1'}
{'host': '100.32.205.59', 'user_name': 'ortiz8891', 'time': '21/Jun/2019:15:45:28 -0700', 'request': 'PATCH /architectures HTTP/1.0'}
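Since the question asks for a list of dictionaries rather than printed output, the same pattern can feed a list comprehension. This is only a minimal sketch: the names LOG_PATTERN, parse_log and sample are made up for illustration, and the sample text reuses two lines from the question.

```python
import re

# Same named-group pattern as in the answer above (verbose, multiline).
LOG_PATTERN = re.compile(r"""
    ^
    (?P<host>\d+\S+)[-\s]+
    (?P<user_name>\S+)\s+
    \[(?P<time>[^][]+)\]\s+
    "(?P<request>[^"]+)"
""", re.MULTILINE | re.VERBOSE)


def parse_log(text):
    """Return one dict per log line that matches LOG_PATTERN."""
    return [m.groupdict() for m in LOG_PATTERN.finditer(text)]


# Two lines copied from the question, used as sample input.
sample = '''146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
100.32.205.59 - ortiz8891 [21/Jun/2019:15:45:28 -0700] "PATCH /architectures HTTP/1.0" 204 6048
'''
records = parse_log(sample)
print(records)
```

Lines that do not match the pattern are simply left out of the list, so it may be worth comparing len(records) with the number of input lines.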
CodePudding user response:
In this code I use re.search to match each line, then gather the captured groups in the dictionary unit_d. The list fulllist collects all the dictionaries.
import re
filename = 'c:/test/log.txt'
fulllist = []
with open(filename) as file:
    for line in file:
        unit_d = dict()
        text = line.rstrip()
        # Groups: 1 = host, 2 = user name, 3 = timestamp, 4 = request line
        finder = re.search(r'([\d.]+)[\s-]+(\w+) \[([\w/: -]+)\] "([^"]+)', text)
        if finder is None:
            continue  # skip blank or malformed lines
        unit_d['host'] = finder.group(1)
        unit_d['user_name'] = finder.group(2)
        unit_d['time'] = finder.group(3)
        unit_d['request'] = finder.group(4)
        print(unit_d)
        fulllist.append(unit_d)
Results (the key order may differ depending on the Python version):
{'request': 'POST /incentivize HTTP/1.1', 'host': '146.204.224.152', 'user_name': 'feest6811', 'time': '21/Jun/2019:15:45:24 -0700'}
{'request': 'DELETE /virtual/solutions/target/web services HTTP/2.0', 'host': '197.109.77.178', 'user_name': 'kertzmann3129', 'time': '21/Jun/2019:15:45:25 -0700'}
{'request': 'DELETE /interactive/transparent/niches/revolutionize HTTP/1.1', 'host': '156.127.178.177', 'user_name': 'okuneva5222', 'time': '21/Jun/2019:15:45:27 -0700'}
{'request': 'PATCH /architectures HTTP/1.0', 'host': '100.32.205.59', 'user_name': 'ortiz8891', 'time': '21/Jun/2019:15:45:28 -0700'}
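To try the line-by-line approach without a file on disk, the loop can be wrapped in a function that accepts any iterable of lines. The sketch below is an illustration, not part of the answer: LINE_RE, KEYS and parse_lines are hypothetical names, and io.StringIO stands in for the opened log file.

```python
import io
import re

# Positional-group pattern in the style of the answer above, as a raw string.
LINE_RE = re.compile(r'([\d.]+)[\s-]+(\w+) \[([\w/: -]+)\] "([^"]+)')
KEYS = ("host", "user_name", "time", "request")


def parse_lines(lines):
    """Build one dict per matching line; lines that do not match are skipped."""
    result = []
    for line in lines:
        found = LINE_RE.search(line)
        if found:
            # Pair each key with the capture group in the same position.
            result.append(dict(zip(KEYS, found.groups())))
    return result


# io.StringIO stands in for an open log file here (an assumption for the demo).
fake_file = io.StringIO(
    '100.32.205.59 - ortiz8891 [21/Jun/2019:15:45:28 -0700] "PATCH /architectures HTTP/1.0" 204 6048\n'
    '\n'
    'not a log line\n'
)
fulllist = parse_lines(fake_file)
print(fulllist)
```

Note that (\w+) only accepts word characters for the user name, so a log line whose user field is a bare "-" would be skipped by this pattern.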