Read document from keyword to keyword and split data into array with Python

Time:03-27

I am trying to read a file in Python from one keyword to a second keyword. After that, I would like to split the data between the keywords into arrays by column.

My data looks something like this:

One Two Three
KEYWORD1
124   129  134
245   345  345
356   357  356
356   354  355
KEYWORD2
Four Five Six

What my array should look like:

array1 = [124, 245, 356, 356]
array2 = [129, 345, 357, 354]
array3 = [134, 345, 356, 355]

How would I implement this in Python? Thanks in advance!

CodePudding user response:

Here is one way to do so:

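array1 = []
array2 = []
array3 = []
with open("filename", "r") as f:
    # Skip every line up to and including KEYWORD1 (an empty string means end of file).
    while (line := f.readline()) and line.strip() != "KEYWORD1":
        pass
    # Read data rows until KEYWORD2 or the end of the file.
    while (line := f.readline()) and line.strip() != "KEYWORD2":
        nums = line.split()
        if len(nums) < 3:
            continue  # ignore blank or incomplete rows
        array1.append(int(nums[0]))
        array2.append(int(nums[1]))
        array3.append(int(nums[2]))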

  • We use f.readline() to read the file one line at a time. Every line up to and including KEYWORD1 is skipped.
  • Until KEYWORD2 is reached, each line is split on whitespace, and each value is converted to int and appended to the corresponding list.

The advantage is that the file is read one line at a time, so it is never loaded into memory in full.


Note:

  • := is the walrus operator (Python 3.8 and later). It assigns the value read from the file to line inside the while condition itself, so the same expression both stores the line and tests it.
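
On interpreters older than Python 3.8, where := is not available, the same two-phase scan can be written with a plain for loop. This is only a minimal sketch: the function name read_columns and its parameters are hypothetical, and it assumes every data row holds whitespace-separated integers.

```python
def read_columns(lines, start="KEYWORD1", end="KEYWORD2"):
    """Collect integer columns from the rows found between two marker lines.

    `lines` can be an open file object or any other iterable of strings.
    """
    columns = []
    inside = False
    for raw in lines:
        line = raw.strip()
        if not inside:
            # Still before the start marker: switch on once it is seen.
            inside = (line == start)
            continue
        if line == end:
            break
        values = line.split()
        if not values:
            continue  # ignore blank lines inside the block
        # Create one list per column the first time a row needs it.
        while len(columns) < len(values):
            columns.append([])
        for column, value in zip(columns, values):
            column.append(int(value))
    return columns
```

Inside a `with open("filename") as f:` block, it could be called as `array1, array2, array3 = read_columns(f)`. Because the file object is iterated lazily, the file is still read one line at a time.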

CodePudding user response:

First grab the relevant data with a regex: https://regex101.com/r/yMwJ0l/1

Then convert the captured data to whatever format you like.
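
A rough sketch of what the regex route might look like in Python. The pattern saved at the regex101 link is not reproduced here, so the pattern below is an assumption, as are the function name columns_between and its parameters. Unlike the line-by-line answer, this needs the whole file contents as one string.

```python
import re


def columns_between(text, start="KEYWORD1", end="KEYWORD2"):
    """Return the integer columns found between two marker strings in `text`."""
    # Non-greedy capture of everything between the two markers;
    # re.DOTALL lets "." match newlines as well.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        return []
    # One list of ints per non-blank row of the captured block.
    rows = [
        [int(value) for value in line.split()]
        for line in match.group(1).splitlines()
        if line.strip()
    ]
    # zip(*rows) transposes the rows into columns.
    return [list(column) for column in zip(*rows)]
```

The text would typically come from `open("filename").read()`, and the result unpacks as `array1, array2, array3 = columns_between(text)` when the block has three columns.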

CodePudding user response:

Here's yet another approach:

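list1, list2, list3 = [], [], []
lol = [list1, list2, list3]  # list of lists, one per column
with open('foo.txt') as foo:
    # Lazily strip each line as it is read from the file.
    m = map(str.strip, foo)
    # Skip everything up to and including KEYWORD1.
    while next(m) != 'KEYWORD1':
        pass
    # Append each value to its column list until KEYWORD2 is reached.
    while (line := next(m)) != 'KEYWORD2':
        for v, a in zip(line.split(), lol):
            a.append(int(v))
for lst in lol:
    print(lst)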

Output:

[124, 245, 356, 356]
[129, 345, 357, 354]
[134, 345, 356, 355]
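
Note that next(m) raises StopIteration if either keyword is missing from the file. One possible way around that, sketched here with itertools.dropwhile and itertools.takewhile from the standard library, is shown below. The function name columns_from and its parameters are hypothetical, and the rows are again assumed to contain only integers.

```python
from itertools import dropwhile, takewhile


def columns_from(lines, start="KEYWORD1", end="KEYWORD2"):
    """Return integer columns from the rows between two marker lines."""
    stripped = map(str.strip, lines)
    # Discard lines until the start marker; the marker is the first item left.
    after_start = dropwhile(lambda s: s != start, stripped)
    # Drop the marker itself; the default avoids StopIteration if it is absent.
    next(after_start, None)
    # Keep lines until the end marker (or the end of the input).
    block = takewhile(lambda s: s != end, after_start)
    rows = [[int(v) for v in line.split()] for line in block if line]
    # Transpose rows into columns.
    return [list(column) for column in zip(*rows)]
```

Passing the open file object, as in `columns_from(foo)`, keeps the reading lazy. If the start keyword never appears, the function returns an empty list instead of raising.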