I am fairly new to Python programming and I am writing an Apache log parser. I came across the Counter class in the collections module. I am trying to reduce the number of lines, because currently I am counting my IP occurrences like this:
if sort == 'ip':
    ip_count = []
    for match in ip_list:
        count = 0
        for ip_match in ip:
            if match == ip_match:
                count += 1
        ip_count.append(count)
and my bytes like this:
if desired_output == 'bytes':
    cnt_bytes = []
    for v in range(len(ip_list)):
        tmp = 0
        for k in range(len(ip)):
            if ip_list[v] == ip[k]:
                if bytes[k] == '-':
                    bytes[k] = 0
                tmp += int(bytes[k])
        cnt_bytes.append(tmp)
It seems unpythonic.
ip_list[] is a list of unique IP addresses. ip_count[] stores the count for each IP address.
Is there a way to reduce these lines of code with the Counter() function?
CodePudding user response:
You can use Counter:
from collections import Counter

with open('access.log') as fp:
    ips = []
    for row in fp:
        # The first whitespace-separated field of each line is the client IP.
        ips.append(row.split(maxsplit=1)[0])
counter = Counter(ips)
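To map this back onto the lists in the question, here is a minimal sketch. It assumes `ip` holds one address per log line and `ip_list` holds the unique addresses, as described in the question; the sample values are made up for illustration.

```python
from collections import Counter

# Hypothetical sample data standing in for the parsed log.
ip = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1"]
ip_list = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

counter = Counter(ip)

# Same shape as the ip_count list built by the nested loops in the question.
ip_count = [counter[address] for address in ip_list]
print(ip_count)  # [3, 1, 1]

# The most frequent address and its count.
print(counter.most_common(1))  # [('10.0.0.1', 3)]
```

Looking up an address that never occurred returns 0 rather than raising KeyError, so the list comprehension is safe even if `ip_list` contains addresses missing from `ip`.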
Or defaultdict:
from collections import defaultdict

with open('access.log') as fp:
    counter = defaultdict(int)
    for row in fp:
        # Missing keys start at 0, so each IP can be incremented directly.
        counter[row.split(maxsplit=1)[0]] += 1
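The same defaultdict pattern may also cover the byte totals from the question. The following is only a sketch under assumptions: `ip` and the byte sizes are parallel lists with one entry per log line, and the list is named `sizes` here (a hypothetical name) so that it does not shadow the built-in `bytes`. The sample values are invented.

```python
from collections import defaultdict

# Hypothetical sample data: parallel lists, one entry per log line.
ip = ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3"]
sizes = ["512", "-", "1024", "200"]  # "-" means no bytes were sent
ip_list = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

bytes_per_ip = defaultdict(int)
for address, size in zip(ip, sizes):
    # Treat "-" as zero, as the nested loops in the question do.
    bytes_per_ip[address] += 0 if size == "-" else int(size)

# Same shape as the cnt_bytes list from the question.
cnt_bytes = [bytes_per_ip[address] for address in ip_list]
print(cnt_bytes)  # [1536, 0, 200]
```

This makes a single pass over the log entries instead of one pass per unique address.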