python regex - how to get each group

Time:01-04

I'm trying to get each group starting with "BO_" using a Python regex. (The data is from: https://github.com/commaai/opendbc)

Original Text:

...
BS_:

BU_: XXX CAMERA FRONT_RADAR ADRV APRK


BO_ 53 ACCELERATOR: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 192|3@1+ (1,0) [0|7] "" XXX
 SG_ ACCELERATOR_PEDAL : 40|8@1+ (1,0) [0|255] "" XXX

BO_ 64 GEAR_ALT: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 32|3@1+ (1,0) [0|7] "" XXX

BO_ 69 GEAR: 24 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 44|3@1+ (1,0) [0|7] "" XXX

...
CM_ SG_ 96 BRAKE_PRESSURE "User applied brake pedal pressure. Ramps from computer applied pressure on falling edge of cruise. Cruise cancels if !=0";
CM_ SG_ 101 BRAKE_POSITION "User applied brake pedal position, max is ~700. Signed on some vehicles";
CM_ SG_ 373 PROBABLY_EQUIP "aeb equip?";

I want to capture only the BO_ blocks whose ID is in the list [53, 69], like this:

BO_ 53 ACCELERATOR: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 192|3@1+ (1,0) [0|7] "" XXX
 SG_ ACCELERATOR_PEDAL : 40|8@1+ (1,0) [0|255] "" XXX

BO_ 69 GEAR: 24 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX
 SG_ GEAR : 44|3@1+ (1,0) [0|7] "" XXX

Any suggestions? Thank you very much.

What I've tried so far:

  1. Capturing each BO_ and its SG_ lines with the regex below, but it only captured each BO_ and the first SG_ group.

BO_ (\w+) (\w+) *: (\w+) (\w+)\n (SG_ (\w+) : (\d+)\|(\d+)@(\d+)([\+|\-]) \(([0-9.+\-eE]+),([0-9.+\-eE]+)\) \[([0-9.+\-eE]+)\|([0-9.+\-eE]+)\] \"(.*)\" (.*)\n)*

  2. Using a greedy match, but it captured all BO_ blocks at once except the last occurrence.

BO_ (\w+) (\w+) *: (\w+) (\w+)((.|\n)*)BO_ 

Also, to select only the BO_ blocks whose ID is in the list [53, 69], I used a raw f-string, something like rf"{digit}", in the regex.

CodePudding user response:

Instead of jumping through hoops to parse a DBC file with regular expressions, I suggest you use a proper parser like cantools:

CAN BUS tools in Python 3.

  • DBC, KCD, SYM, ARXML 3&4 and CDD file parsing.
  • CAN message encoding and decoding.
  • Simple and extended signal multiplexing.
  • Diagnostic DID encoding and decoding.
  • candump output decoder.
  • Node tester.
  • C source code generator.
  • CAN bus monitor.
  • Graphical plots of signals.

CodePudding user response:

You can easily capture paragraphs with re.findall using the re.DOTALL (inline s) and re.MULTILINE (inline m) flags.

Regex (with inline flags): (?sm)BO_ (?:53|69) .+?^$

Usage (pick one):

re.findall(r"(?sm)BO_ (?:53|69) .+?^$", text)
re.findall(r"BO_ (?:53|69) .+?^$", text, flags=re.DOTALL | re.MULTILINE)

This lazily captures all lines from BO_ 53 or BO_ 69 up to the next blank line (^$).
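
To tie this back to the list [53, 69] from the question, the alternation can be built from the list with a raw f-string. The following is only a minimal sketch: the helper name find_blocks and the shortened sample text are made up for illustration, and the leading ^ is an extra assumption added so that a "BO_ 53" appearing in the middle of a line (for example in a comment) does not start a match.

```python
import re

# Hypothetical sample, shortened from the DBC excerpt in the question.
text = '''BU_: XXX CAMERA

BO_ 53 ACCELERATOR: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX
 SG_ COUNTER : 16|8@1+ (1,0) [0|255] "" XXX

BO_ 64 GEAR_ALT: 32 XXX
 SG_ CHECKSUM : 0|16@1+ (1,0) [0|65535] "" XXX

BO_ 69 GEAR: 24 XXX
 SG_ GEAR : 44|3@1+ (1,0) [0|7] "" XXX

CM_ SG_ 96 BRAKE_PRESSURE "comment";
'''


def find_blocks(text, wanted_ids):
    """Return the BO_ blocks whose ID is in wanted_ids (illustrative helper)."""
    # Build "53|69" from the list; the IDs are integers, so no escaping is needed.
    ids = "|".join(str(i) for i in wanted_ids)
    # (?s): '.' also matches newlines; (?m): ^ and $ match at line boundaries.
    # The space after the ID group keeps 53 from matching a longer ID such as 530.
    pattern = rf"(?sm)^BO_ (?:{ids}) .+?^$"
    return re.findall(pattern, text)


blocks = find_blocks(text, [53, 69])
for block in blocks:
    print(block)
```

Each returned string runs from the BO_ line through its last SG_ line, including the final newline, so the blocks can be joined or written out unchanged.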
