Extract all caps word contained in ANSI colored text

Time:03-17

How might I use a regex to extract this all-caps word when it is wrapped in ANSI codes for colored text in the terminal?

Example:

s1 = '      Elapsed: 0:00:59.694 - Elapsed/GB: 0:00:00.125 - Result: \x1b[92mPASS\x1b[0m\r\n'

My Failures:

re.findall(r'- Result: [^\x1b[92m\x1b[0m\r\n]', s1)
re.findall(r'- Result: ([A-Z]+)', s1)

Expected:

PASS

CodePudding user response:

Try this:

re.findall("\x1b\\[.*?m([A-Z0-9]+?)\x1b\\[", a)

First, if this will be the only colored thing in the line, start from the ANSI code itself. Note that I did not prefix the pattern with r, so Python processes the backslash escapes before passing the string to the regex engine: this ensures \x1b is passed as the code point for the <ESC> character. Also, the backslash before "[" is doubled so that the regex engine receives \[ and treats the bracket as a literal.

Second, the pattern fixes the literal "m", which is the final character of the ANSI command that changes color attributes, while .*? accepts whatever parameters come before it, so no specific color is required.

And last, but not least, the character class [A-Z0-9] says I want a word made of capital letters (and digits) only, followed by another ANSI attribute command.

In [265]: a = s1 = '      Elapsed: 0:00:59.694 - Elapsed/GB: 

In [266]: re.findall("\x1b\\[.*?m([A-Z0-9]+?)\x1b\\[", a)
Out[266]: ['PASS']
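As a small sketch of the escaping point above, the same pattern can be written both as a normal string and as a raw string. This relies on the fact that the re module itself accepts the \x1b escape, so either spelling should reach the engine as the same pattern. The variable names plain and raw are only illustrative.

```python
import re

s1 = '      Elapsed: 0:00:59.694 - Elapsed/GB: 0:00:00.125 - Result: \x1b[92mPASS\x1b[0m\r\n'

# Normal string: Python converts \x1b to the ESC character and \\[ to \[
# before the regex engine ever sees the pattern.
plain = re.findall("\x1b\\[.*?m([A-Z0-9]+?)\x1b\\[", s1)

# Raw string: the backslashes reach the regex engine untouched, and the
# engine interprets \x1b (ESC) and \[ (literal bracket) itself.
raw = re.findall(r"\x1b\[.*?m([A-Z0-9]+?)\x1b\[", s1)

print(plain, raw)
```

Both calls should return the same list for this input, so the choice between the two spellings is mostly a matter of readability.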

CodePudding user response:

You can use the following if you know in advance exactly which escape sequences surround the word:

import re

s1 = '      Elapsed: 0:00:59.694 - Elapsed/GB: 0:00:00.125 - Result: \x1b[92mPASS\x1b[0m\r\n'
# Raw string so that \[ reaches the regex engine as a literal bracket
result = re.findall(r'- Result: \x1b\[92m(.+)\x1b\[0m', s1)
print(result)

This prints:

['PASS']
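If the color codes are not known in advance, one alternative (not taken from either answer above) is to strip the color sequences first and then match the plain text. The sketch below assumes that only SGR sequences of the form ESC [ ... m appear in the line. The names ANSI_SGR and extract_result are hypothetical.

```python
import re

# Assumption: only SGR (color/attribute) sequences such as ESC[92m or ESC[0m
# occur in the input. Other ANSI sequences are not handled by this pattern.
ANSI_SGR = re.compile(r'\x1b\[[0-9;]*m')

def extract_result(line):
    """Return the all-caps word after '- Result: ', or None if there is none."""
    clean = ANSI_SGR.sub('', line)  # drop the color codes
    match = re.search(r'- Result: ([A-Z]+)', clean)
    return match.group(1) if match else None

s1 = '      Elapsed: 0:00:59.694 - Elapsed/GB: 0:00:00.125 - Result: \x1b[92mPASS\x1b[0m\r\n'
print(extract_result(s1))
```

Once the codes are removed, the simple `([A-Z]+)` pattern from the question is enough, and the same function works whatever color the result is printed in.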