Parsing Logs with Regular Expressions Python

Time:03-19

I'm a coding and Python lightweight :)

I've got to iterate through some log files and pick out the lines that say ERROR. That part is done. What I still need to figure out is how to grab the 10 lines following each match, which contain the details of the error. It's got to be some combination of an if statement and a for/while loop, I presume. Any help would be appreciated.

import os
import re

# Regex used to match
line_regex = re.compile(r"ERROR")

# Output file, where the matched loglines will be copied to
output_filename = os.path.normpath("NodeOut.log")
# Overwrites the file, ensure we're starting out with a blank file
#TODO Append this later
with open(output_filename, "w") as out_file:
    out_file.write("")

# Open output file in 'append' mode
with open(output_filename, "a") as out_file:
    # Open input file in 'read' mode
    with open("MXNode1.stdout", "r") as in_file:
        # Loop over each log line
        for line in in_file:
            # If the log line matches our regex, print it (remove later) and write it to the output file
            if (line_regex.search(line)):
                # for i in range():
                print(line)
                out_file.write(line)

CodePudding user response:

There is no need for regex to do this, you can just use the in operator ("ERROR" in line).

Also, to clear the contents of the file without opening it in w mode, you can simply seek to the beginning of the file and truncate it.

import os

output_filename = os.path.normpath("NodeOut.log")

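# 'a' mode creates the file if it does not exist yet
with open(output_filename, 'a') as out_file:
    # Empty the file before writing anything
    out_file.seek(0, 0)
    out_file.truncate(0)

    with open("MXNode1.stdout", 'r') as in_file:
        line = in_file.readline()
        # readline() returns an empty string at the end of the file
        while line:
            if "ERROR" in line:
                out_file.write(line)
                # Copy the 10 lines that follow the ERROR line
                for i in range(10):
                    out_file.write(in_file.readline())
            line = in_file.readline()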

We use a while loop to read lines one at a time with in_file.readline(). The advantage is that, once an ERROR line is found, a nested for loop can easily read the following lines from the same file.

See the doc:

f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline. This makes the return value unambiguous; if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline.
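
As a rough illustration of the readline() behaviour quoted above, here is a minimal sketch of the same loop wrapped in a function. The function name copy_error_blocks, its parameters, and the in-memory sample log are made up for this example; io.StringIO stands in for the real files. Unlike the snippet above, it stops copying if the file ends in the middle of a block.

```python
import io


def copy_error_blocks(in_file, out_file, marker="ERROR", following=10):
    """Copy each line containing `marker` plus up to `following` lines after it.

    Hypothetical helper for illustration; not part of the original answer.
    """
    line = in_file.readline()
    while line:  # readline() returns '' only at end of file
        if marker in line:
            out_file.write(line)
            for _ in range(following):
                next_line = in_file.readline()
                if not next_line:  # end of file reached inside the block
                    break
                out_file.write(next_line)
        line = in_file.readline()


# Made-up sample log; note the blank line, which readline() returns as '\n'
log = io.StringIO("ok\nERROR boom\ndetail 1\n\ndetail 2\nok again\n")
out = io.StringIO()
copy_error_blocks(log, out, following=2)
print(repr(out.getvalue()))  # prints 'ERROR boom\ndetail 1\n\n'
```

The blank line in the sample does not end the while loop, because it comes back as '\n' rather than as an empty string.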

CodePudding user response:

Assuming you always want to grab the next 10 lines, you could do something like this:

    with open("MXNode1.stdout", "r") as in_file:
        # Start at 11 so nothing is written until the first ERROR is found
        lineCount = 11
        # Loop over each log line
        for line in in_file:
            # If the log line matches our regex, print it (remove later) and restart the count
            if line_regex.search(line):
                print(line)
                lineCount = 0
            # Write the ERROR line plus the 10 lines that follow it
            if lineCount < 11:
                lineCount += 1
                out_file.write(line)

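The second if statement writes every line while the counter is below 11, which covers the ERROR line itself plus the 10 lines after it; that is where the magic number 11 comes from. Because the counter is reset on every match, an ERROR that shows up inside those 10 lines starts a fresh window of 10 lines.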
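
For anyone who wants to try the counter idea in isolation, here is a self-contained sketch. The function name copy_with_counter, the following parameter, and the sample log are assumptions made for this example, and io.StringIO stands in for the real files.

```python
import io
import re

line_regex = re.compile(r"ERROR")


def copy_with_counter(in_file, out_file, following=10):
    """Write each matching line and the `following` lines after it.

    Hypothetical wrapper around the counter approach, for illustration only.
    """
    limit = following + 1  # the ERROR line itself plus the lines after it
    line_count = limit     # start at the limit so nothing is written yet
    for line in in_file:
        if line_regex.search(line):
            line_count = 0  # a match (re)starts the window
        if line_count < limit:
            line_count += 1
            out_file.write(line)


# Made-up sample: the second ERROR falls inside the first ERROR's window
log = io.StringIO("a\nERROR 1\nb\nERROR 2\nc\nd\ne\n")
out = io.StringIO()
copy_with_counter(log, out, following=2)
print(repr(out.getvalue()))  # prints 'ERROR 1\nb\nERROR 2\nc\nd\n'
```

Note the difference from the readline() answer: there, the 10 follow-up lines are copied without being checked, so an ERROR among them does not extend the output. Here every line is checked, so back-to-back errors each get their full set of follow-up lines.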
