count the amount of transactions that begin and end

Time:10-30

I have this logfile:

19-3-2020 01:37:31.995  INFO    18  188 mailbox allocated for rsvp
19-3-2020 01:37:32.039  INFO    14  194 creating mailslot for dump
19-3-2020 01:37:32.082  INFO    18  194 out of INFO allcations
19-3-2020 01:37:32.119  INFO    18  188 creating mailslot for RSVP client API
19-3-2020 01:37:32.157  INFO    10  187 creating socket for traffic CONTROL module
19-3-2020 01:37:32.157  INFO    19  186 transaction 17327 begin
19-3-2020 01:37:32.276  INFO    11  188 loopback to avoid ERROR
19-3-2020 01:37:32.276  INFO    15  187 end transaction 17327
19-3-2020 01:37:32.314  INFO    13  189 creating mailslot for terminate

I need to count the number of transactions that have both a beginning and an end.

I tried using defaultdict from the collections module, but I only have the first few lines because I'm not sure what to do next:

from collections import defaultdict

transactions = defaultdict(dict)
with open('logfile.log', 'r') as f:

CodePudding user response:

You can do it like this:

with open('logfile.log', 'r') as f:
    lines = f.readlines()


beginnings = set()  # IDs seen in "transaction <id> begin" lines
endings = set()     # IDs seen in "end transaction <id>" lines
for line in lines:
    if "transaction" in line:
        words = line.split()
        if "begin" in line:
            # "... transaction 17327 begin": the ID is the second-to-last word
            beginnings.add(int(words[-2]))
        elif "end" in line:
            # "... end transaction 17327": the ID is the last word
            endings.add(int(words[-1]))


# IDs that have both a begin line and an end line
intersect = beginnings.intersection(endings)
print(intersect)
print(len(intersect))  # number of transactions that begin and end

Here I created two sets, which hold only unique elements. After the loop, I used the set intersection method to find the IDs common to both.
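
One thing to watch with plain substring checks is that "end" in line would also match a word such as "pending". A minimal sketch of a stricter variant is below. It assumes the only two formats are "transaction <id> begin" and "end transaction <id>" at the end of a line, as in the question. The function name count_completed and the extra 99999 sample line are made up for illustration.

```python
def count_completed(lines):
    """Count transaction IDs that appear in both a begin line and an end line."""
    beginnings = set()
    endings = set()
    for line in lines:
        words = line.split()
        # Assumed format: "... transaction <id> begin"
        if words[-3:-2] == ["transaction"] and words[-1] == "begin":
            beginnings.add(words[-2])
        # Assumed format: "... end transaction <id>"
        elif words[-3:-1] == ["end", "transaction"]:
            endings.add(words[-1])
    # The set intersection keeps only IDs seen in both roles
    return len(beginnings & endings)


sample = [
    "19-3-2020 01:37:32.157  INFO    19  186 transaction 17327 begin",
    "19-3-2020 01:37:32.276  INFO    11  188 loopback to avoid ERROR",
    "19-3-2020 01:37:32.276  INFO    15  187 end transaction 17327",
    "19-3-2020 01:37:32.400  INFO    19  186 transaction 99999 begin",  # made-up, never ends
]
print(count_completed(sample))  # 1
```

To run it on the real file, pass the open file object itself, since iterating over a file yields its lines.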

CodePudding user response:

Here's something that works (it uses the log as a string, which you can adapt to read the lines of the log file directly; I've added a fake transaction for demonstration purposes):

Note that I assumed no two transactions can share the same ID; if they can, using sets will not work, because a set stores each ID only once.

import re
log = '''
19-3-2020 01:37:31.995  INFO    18  188 mailbox allocated for rsvp
19-3-2020 01:37:32.039  INFO    14  194 creating mailslot for dump
19-3-2020 01:37:32.082  INFO    18  194 out of INFO allcations
19-3-2020 01:37:32.119  INFO    18  188 creating mailslot for RSVP client API
19-3-2020 01:37:32.157  INFO    10  187 creating socket for traffic CONTROL module
19-3-2020 01:37:32.157  INFO    19  186 transaction 17327 begin
19-3-2020 01:37:32.276  INFO    11  188 loopback to avoid ERROR
19-3-2020 01:37:32.276  INFO    15  187 end transaction 17327
19-3-2020 01:37:32.314  INFO    13  189 creating mailslot for terminate
19-3-2020 01:37:32.157  INFO    19  186 transaction 123456789 begin
'''
tr_begun = set()
tr_ended = set()
for line in log.splitlines():
    tr_id = re.search(r'(?<=transaction )[0-9]+', line)
    if tr_id:
        tr_id = tr_id[0]
        if 'begin' in line:
            tr_begun.add(tr_id)
        if 'end' in line:
            tr_ended.add(tr_id)
inter = tr_begun.intersection(tr_ended)
print(tr_begun, tr_ended, inter, len(inter))

# {'123456789', '17327'} {'17327'} {'17327'} 1  # The last number is what you want.
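
If IDs can be reused, one possible workaround is to track how many transactions are currently open per ID instead of using sets. This is only a sketch under my own assumptions: each end line closes one earlier unmatched begin with the same ID, and an end with no earlier begin is ignored. The names BEGIN_RE, END_RE and count_pairs, and the short sample lines, are invented for the example.

```python
import re
from collections import Counter

# Assumed line formats: "transaction <id> begin" and "end transaction <id>"
BEGIN_RE = re.compile(r'transaction (\d+) begin')
END_RE = re.compile(r'end transaction (\d+)')


def count_pairs(lines):
    """Count begin/end pairs, allowing the same ID to be reused."""
    open_count = Counter()  # ID -> number of begun-but-not-yet-ended transactions
    completed = 0
    for line in lines:
        begin = BEGIN_RE.search(line)
        end = END_RE.search(line)
        if begin:
            open_count[begin[1]] += 1
        elif end and open_count[end[1]] > 0:
            # An end only counts if a matching begin came before it
            open_count[end[1]] -= 1
            completed += 1
    return completed


lines = [
    "transaction 1 begin",
    "end transaction 1",
    "transaction 1 begin",   # ID 1 reused
    "end transaction 1",
    "transaction 2 begin",   # never ends
    "end transaction 3",     # never began
]
print(count_pairs(lines))  # 2
```

Unlike the set version, this depends on line order, so it only makes sense if the log is written chronologically.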