I have an input file in the format shown below.
SOME TEXT SOME TEXT SOME TEXT
SOME TEXT
RETCODE = 0

A = 1
B = 5
C = 3

D = 0
E = 0

D = 4
E = 1
G = 1

blah blah
--- ENDBLOCK
SOME TEXT
RETCODE = 0

A = 3
B = 2
C = 8
D = 6
E = 9
F = 3

blah blah
blah blah
--- ENDBLOCK
SOME TEXT
RETCODE = 0

A = 7
B = 2
C = 2
D = 9
E = 0

D = 1
E = 4

D = 7
E = 0
F = 1

D = 1
G = 8

blah blah
blah blah
--- ENDBLOCK
The file holds data in blocks. Each block begins with a line of the form A = <some value> and ends with a line saying --- ENDBLOCK. Each block contains sub-blocks, and each sub-block begins after a blank line.
For example, in the first block the first sub-block is A=1, B=5, C=3, the second sub-block is D=0, E=0, and the third sub-block is D=4, E=1, G=1.
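To make the structure described above concrete, here is a minimal sketch that splits a small inline sample into blocks and sub-blocks. The sample text, the helper name `split_sub_blocks`, and the decision to treat whitespace-only lines as blank are assumptions for illustration, not part of the real file.

```python
import re

def split_sub_blocks(text, end_marker="--- ENDBLOCK"):
    """Return a list of blocks; each block is a list of sub-blocks (lists of lines)."""
    blocks = []
    for block in text.split(end_marker):
        sub_blocks = []
        # A sub-block boundary is one or more blank (or whitespace-only) lines.
        for chunk in re.split(r"\n\s*\n", block):
            lines = [ln.strip() for ln in chunk.splitlines() if ln.strip()]
            if lines:
                sub_blocks.append(lines)
        # Ignore whatever follows the last end marker if it holds no lines.
        if sub_blocks:
            blocks.append(sub_blocks)
    return blocks

# Hypothetical, shortened sample in the same layout as the input file.
sample = (
    "RETCODE = 0\n\nA = 1\nB = 5\n\nD = 0\n--- ENDBLOCK\n"
    "RETCODE = 0\n\nA = 3\n--- ENDBLOCK\n"
)
print(split_sub_blocks(sample))
```

In this sketch the RETCODE line still shows up as its own sub-block, so it would have to be filtered out in a later step.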
My goal is to get a nested list holding the values of A to G for each sub-block, with an empty entry wherever a letter has no value. The desired output is below.
[
[
[1,5,3,,,,],[,,,0,0,,],[,,,4,1,,1]
],
[
[3,2,8,6,9,3,]
],
[
[7,2,2,9,0,,],[,,,1,4,,],[,,,7,0,1,],[,,,1,,,8]
]
]
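A Python list cannot literally hold nothing between two commas, so each missing value has to be stored as some placeholder. The sketch below assumes the placeholder is the empty string `""` and shows one possible way to print a nested list in the notation used above. The helper names `render_row` and `render` are hypothetical.

```python
def render_row(row):
    # Empty slots (stored as "") become nothing between the commas.
    return "[" + ",".join(str(value) for value in row) + "]"

def render(blocks):
    # One bracketed group per block, each holding its rendered sub-blocks.
    parts = ["[" + ",".join(render_row(row) for row in block) + "]" for block in blocks]
    return "[" + ",".join(parts) + "]"

# Hypothetical data: the first two sub-blocks of the first block.
blocks = [[[1, 5, 3, "", "", "", ""], ["", "", "", 0, 0, "", ""]]]
print(render(blocks))
```

Using `None` instead of `""` would also work, but `render_row` would then need to map `None` to an empty string explicitly.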
My current code reads the file and stores some of the data, but I'm still far from the desired output. Thanks for any help.
f = open("file.txt", "r").read().split("ENDBLOCK")
rows = []
for line in f:
    for subline in line.split("\n\n"):
        if " = " in subline and "RETCODE" not in subline:
            rows.append(subline.replace(" ", "").split())
>>> rows
[
['A=1', 'B=5', 'C=3'], ['D=0', 'E=0'], ['D=4', 'E=1', 'G=1'],
['D=1', 'E=4'], ['D=7', 'E=0', 'F=1'], ['D=1', 'G=8']
]
CodePudding user response:
You can do something like this:
import re

BLOCK_END = "--- ENDBLOCK"
# Optional leading whitespace, a letter A-G, "=", then one or more digits.
LETTER_TO_VALUE_REGEX = re.compile(r"\s*(?P<letter>[A-G])\s*=\s*(?P<number>\d+)")
LETTERS = ["A", "B", "C", "D", "E", "F", "G"]
SUB_BLOCK_END = "\n\n"

with open("file.txt") as f:
    text = f.read()

top_list = []
for block in text.split(BLOCK_END):
    sub_list = []
    for sub_block in block.split(SUB_BLOCK_END):
        # Map each letter found in this sub-block to its number.
        letters_to_numbers_dict = {}
        for line in sub_block.split("\n"):
            match = LETTER_TO_VALUE_REGEX.match(line)
            if match:
                letters_to_numbers_dict[match["letter"]] = int(match["number"])
        # Skip sub-blocks with no A-G values (for example the RETCODE part).
        if letters_to_numbers_dict:
            sub_list.append([letters_to_numbers_dict.get(letter, "") for letter in LETTERS])
    # Skip blocks with no values, such as text after the last end marker.
    if sub_list:
        top_list.append(sub_list)
After this runs, top_list holds the desired list.
CodePudding user response:
Check out this solution:
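LETTERS = "ABCDEFG"

with open("file.txt", "r") as f:
    lines = f.readlines()

rows = []
second_row = []  # sub-blocks of the current block
third_row = [""] * len(LETTERS)  # values of the current sub-block, "" means no value

for line in lines:
    key, _, value = line.partition("=")
    key, value = key.strip(), value.strip()
    if len(key) == 1 and key in LETTERS and value.isdigit():
        # Store the number at the position of its letter (A -> 0, ..., G -> 6).
        third_row[LETTERS.index(key)] = int(value)
    elif line.strip() == "" or "ENDBLOCK" in line:
        # A blank line or the end marker closes the current sub-block.
        if any(v != "" for v in third_row):
            second_row.append(third_row)
        third_row = [""] * len(LETTERS)
        if "ENDBLOCK" in line:
            # The end marker also closes the current block.
            if second_row:
                rows.append(second_row)
            second_row = []

print(rows)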