How do I read a log file where there is no separator between two objects?

Time:03-27

I have a log file that I want to read into a dataframe, but there is no separator between two objects.

Country|ID|Item_IDCountry|ID|Item_IDCountry|ID|Item_IDCountry|ID|Item_ID

It is in this format, where Country is strictly a 2-character string.

I'm trying to figure out how to do this in Python, as I'm still a beginner. Any help would be much appreciated.

I tried read_csv, but that failed. I looked for answers online but didn't find much.

CodePudding user response:

The separator in that format is |. Assuming the log file in question is named logs.csv:

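import pandas

logs = []
with open("logs.csv") as f:
    lines = f.readlines()
    # The first line holds the column names.
    column_names = lines[0].rstrip('\n').split("|")
    # Each remaining line is treated as one record (one record per line).
    for l in lines[1:]:
        logs.append(l.rstrip('\n').split("|"))

# Build the dataframe from the list of rows.
df = pandas.DataFrame(logs, columns=column_names)
print(df)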

lines[0].rstrip('\n').split("|") removes the newline character from the first line and splits the column names (Country|ID|Item_IDCountry|ID|Item_IDCountry|ID|Item_IDCountry|ID|Item_ID) into a list.

for l in lines[1:]: iterates over the remaining lines of the log file, starting from the second line.
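If the records really are glued together on one line, as in the question's sample, splitting on | alone leaves each Item_ID fused with the next record's Country. Here is a minimal sketch of one way to pull them apart, relying only on the stated fact that Country is exactly 2 characters and assuming no field contains a | itself. The function name parse_records and the sample values are made up for illustration.

```python
def parse_records(text):
    """Split 'Country|ID|Item_IDCountry|ID|Item_ID...' into (Country, ID, Item_ID) rows.

    Assumptions (not confirmed by the original post): no field contains '|',
    and every Country is exactly 2 characters long.
    """
    tokens = text.strip().split("|")
    # n records produce 2n + 1 tokens: C1, ID1, Item1+C2, ID2, ..., IDn, Itemn
    if len(tokens) < 3 or len(tokens) % 2 == 0:
        raise ValueError("unexpected number of '|'-separated fields")

    rows = []
    country = tokens[0]
    for i in range(1, len(tokens), 2):
        record_id = tokens[i]
        fused = tokens[i + 1]
        if i + 1 == len(tokens) - 1:
            # Last token: only an Item_ID, no following Country.
            item_id, next_country = fused, None
        else:
            # The last 2 characters belong to the next record's Country.
            item_id, next_country = fused[:-2], fused[-2:]
        rows.append((country, record_id, item_id))
        country = next_country
    return rows


# Hypothetical sample line in the format described in the question.
sample = "US|101|5001DE|102|5002FR|103|5003"
rows = parse_records(sample)
print(rows)
```

The resulting list of tuples can then be passed to pandas.DataFrame(rows, columns=["Country", "ID", "Item_ID"]) to get the dataframe. If the file has several such lines, the same function could be applied to each line and the row lists concatenated.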
