Extract all lines after word

Time:03-17

I want to extract all lines that appear after the line containing this marker: PossibleErrs

s1 = (
 ' DIMM Error Summary\r\n'
 ' Skt  Chan Dimm Slot    DimmSN   KnownErrs PossibleErrs\r\n'
 '    0    5    0 DIMM_F1 XD12F         0            2\r\n'
 '    0    5    1 DIMM_F2 XD12C         0            2\r\n')

My attempt is insufficient because it just returns a single line, when there could be more than one line to process. It also returns the linebreaks \r\n.

re.findall(r'PossibleErrs(\s*.*) [^\r\n]', s1)

How can I parse this text to just return the values found after the marker PossibleErrs?

Expected:

0, 5, 0, DIMM_F1, XD12F, 0, 2,
0, 5, 1, DIMM_F2, XD12C, 0, 2

CodePudding user response:

You don't need a regular expression for this. Split the string on the CRLF sequence (\r\n) to get a list of lines, slice off the two header lines and the trailing empty string that the final \r\n leaves behind, then split each remaining line on whitespace:

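# [2:-1] drops the two header lines and the empty string after the final "\r\n"
lines = s1.split("\r\n")[2:-1]
# split() with no arguments splits on any run of whitespace
result = [line.split() for line in lines]
print(result)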

This outputs:

[['0', '5', '0', 'DIMM_F1', 'XD12F', '0', '2'], ['0', '5', '1', 'DIMM_F2', 'XD12C', '0', '2']]

from which you can easily massage it into any desired output format.
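
If the header is not always on the second line, the slice above will pick the wrong rows. A minimal sketch of a variant that searches for the marker line instead, and then joins the fields into the comma-separated form shown in the question, might look like this. The helper name rows_after_marker is hypothetical and introduced only for illustration, and the sketch assumes the marker appears on a single header line:

```python
s1 = (
    ' DIMM Error Summary\r\n'
    ' Skt  Chan Dimm Slot    DimmSN   KnownErrs PossibleErrs\r\n'
    '    0    5    0 DIMM_F1 XD12F         0            2\r\n'
    '    0    5    1 DIMM_F2 XD12C         0            2\r\n')


def rows_after_marker(text, marker="PossibleErrs"):
    """Return the whitespace-split fields of every non-blank line
    that follows the first line containing `marker`.

    Hypothetical helper, not part of any library.
    """
    # splitlines() handles "\r\n" and leaves no trailing empty string
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if marker in line:
            return [ln.split() for ln in lines[i + 1:] if ln.strip()]
    # Marker not found: nothing to extract
    return []


rows = rows_after_marker(s1)

# Join fields with ", " and rows with ",\n" to match the expected layout
formatted = ",\n".join(", ".join(row) for row in rows)
print(formatted)
```

With the sample s1, this prints the two data rows as "0, 5, 0, DIMM_F1, XD12F, 0, 2," followed by "0, 5, 1, DIMM_F2, XD12C, 0, 2". Using splitlines() rather than split("\r\n") also means the code keeps working if the input uses plain \n line endings.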
