How to get each group

I'm trying to get each group starting with "BO_" using a Python regex. (The data is from https://github.com/commaai/opendbc.)

Original Text:

...
BS_:

BU_: XXX CAMERA FRONT_RADAR ADRV APRK


BO_ 53 ACCELERATOR: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 192|3@1+ (1,0) [0|7] "" XXX
 SG_ ACCELERATOR_PEDAL : 40|8@1+ (1,0) [0|255] "" XXX

BO_ 64 GEAR_ALT: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 32|3@1+ (1,0) [0|7] "" XXX

BO_ 69 GEAR: 24 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 44|3@1+ (1,0) [0|7] "" XXX

...
CM_ SG_ 96 BRAKE_PRESSURE "User applied brake pedal pressure. Ramps from computer applied pressure on falling edge of cruise. Cruise cancels if !=0";
CM_ SG_ 101 BRAKE_POSITION "User applied brake pedal position, max is ~700. Signed on some vehicles";
CM_ SG_ 373 PROBABLY_EQUIP "aeb equip?";

I want to capture only the BO_ blocks whose ID is in the list [53, 69], like this:

BO_ 53 ACCELERATOR: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 192|3@1+ (1,0) [0|7] "" XXX
 SG_ ACCELERATOR_PEDAL : 40|8@1+ (1,0) [0|255] "" XXX

BO_ 69 GEAR: 24 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 44|3@1+ (1,0) [0|7] "" XXX

What I've tried so far:

1) Capturing each BO_ line and its SG_ lines with the regex below, but it only captured each BO_ and the first SG_ groups.

BO_ (\w+) (\w+) *: (\w+) (\w+)\n (SG_ (\w+) : (\d+)\|(\d+)@(\d+)([\+|\-]) \(([0-9.+\-eE]+),([0-9.+\-eE]+)\) \[([0-9.+\-eE]+)\|([0-9.+\-eE]+)\] \"(.*)\" (.*)\n)*

2) Using a greedy match, but it captured all the BO_ blocks at once, except the last occurrence.

BO_ (\w+) (\w+) *: (\w+) (\w+)((.|\n)*)BO_

Also, to select only the BO_ blocks whose ID is in the list [53, 69], I interpolated the IDs into the pattern with a raw f-string, something like rf"{digit}".

CodePudding user response：

You can capture each paragraph with re.findall using the re.DOTALL (inline s) and re.MULTILINE (inline m) flags.

Regex (with inline flags): (?sm)BO_ (?:53|69) .+?^$

Usage (pick one):

re.findall(r"(?sm)BO_ (?:53|69) .+?^$", text)

re.findall(r"BO_ (?:53|69) .+?^$", text, flags=re.DOTALL | re.MULTILINE)

This lazily captures all lines from BO_ 53 or BO_ 69 up to the next blank line, which ^$ matches in multiline mode.
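As a minimal sketch of how this could be combined with the ID list from the question, the alternation can be built from the list instead of being hard-coded. The helper name find_bo_blocks and the shortened SAMPLE text are made up for illustration, and the leading ^ is an addition to the regex above (an assumption that every BO_ line starts at the beginning of a line):

```python
import re

# Abbreviated sample in the same layout as the excerpt in the question.
SAMPLE = """\
BU_: XXX CAMERA

BO_ 53 ACCELERATOR: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX

BO_ 64 GEAR_ALT: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX

BO_ 69 GEAR: 24 XXX
 SG_ GEAR : 44|3@1+ (1,0) [0|7] "" XXX

CM_ SG_ 96 BRAKE_PRESSURE "comment";
"""


def find_bo_blocks(text, ids):
    """Return the BO_ blocks whose ID is in `ids`; each block ends at a blank line."""
    # Build "53|69" from the list; re.escape keeps the values literal.
    alternation = "|".join(re.escape(str(i)) for i in ids)
    # s: "." also matches newlines; m: "^" and "$" match at line boundaries.
    pattern = rf"(?sm)^BO_ (?:{alternation}) .+?^$"
    # Each match ends with the newline before the blank line, so strip it.
    return [block.rstrip("\n") for block in re.findall(pattern, text)]


blocks = find_bo_blocks(SAMPLE, [53, 69])
for block in blocks:
    print(block)
    print("---")
```

The space after the ID group keeps BO_ 53 from also matching an ID such as 530. Note that a block is only found if a blank line follows it.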

CodePudding user response：

Instead of jumping through hoops to parse a DBC file with regular expressions, I suggest you use a proper parser like cantools:

CAN BUS tools in Python 3.

DBC, KCD, SYM, ARXML 3&4 and CDD file parsing.

CAN message encoding and decoding.

Simple and extended signal multiplexing.

Diagnostic DID encoding and decoding.

candump output decoder.

Node tester.

C source code generator.

CAN bus monitor.

Graphical plots of signals.
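A rough sketch of what the cantools route might look like for the IDs in the question. The function name describe_messages and the path "example.dbc" are placeholders, and it assumes the DBC file loads cleanly with cantools:

```python
def describe_messages(dbc_path, wanted_ids):
    """Return {frame_id: (message name, [signal names])} for the wanted IDs."""
    # Third-party dependency (pip install cantools), imported here so this
    # snippet can be loaded even where cantools is not installed.
    import cantools

    db = cantools.database.load_file(dbc_path)
    result = {}
    for message in db.messages:
        if message.frame_id in wanted_ids:
            result[message.frame_id] = (
                message.name,
                [signal.name for signal in message.signals],
            )
    return result


# Example call ("example.dbc" is a placeholder path):
# describe_messages("example.dbc", {53, 69})
```

Each signal object also exposes fields such as start and length, so there is no need to pick them out of the SG_ lines by hand. If a file fails the library's consistency checks, load_file also accepts strict=False.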