remove hexadecimals of different lengths in a string with regex in python

Time:03-26

I have a string containing a bunch of hexadecimal numbers, all of different lengths.

my_string = '0x5842 YDB2: 0x4035, 0x249, 0x4, 0x9550, 0x613, 0x61764585, 0x784, 0x472, 0x7550, 0x48844926, 0x69760606, 0x9179, 0x753 rev: 0x259, 0x63, 0x4808 0x43, 0x92039244, 0x612, 0x991, 0x26, 0x64 0x7, 0x92, 0x40906120, 0x11812410,'

I'm trying to find all of these hexadecimal numbers and remove them from my string, so that I end up with something like this:

my_string = 'YDB2: rev:'

I've tried the following regex:

'0x[0-9]+,?'

The problem is that it sometimes seems to find only the "minimum" match that satisfies the regex. For example, with something like 0x9550, it will find 0x9, see that it satisfies the regex, and return that, ignoring the 550.

Any help would be greatly appreciated!

CodePudding user response:

Without a non-greedy quantifier (+? instead of +), your regexp will not "sometimes" match only part of a number: a greedy + always consumes every character it can. I suspect your actual data has characters you're not accounting for, such as the hex digits a-f.
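To make the difference concrete, here is a small sketch comparing the greedy and non-greedy quantifiers. The sample string is made up for illustration and is not from the question; the second token contains a letter to show how a digits-only class cuts a match short.

```python
import re

# Made-up sample: the second token contains the hex letters f and a.
sample = "0x9550, 0x1f4a"

# Greedy `+` takes as many decimal digits as it can.
greedy = re.findall(r"0x[0-9]+", sample)

# Non-greedy `+?` stops after the first digit.
lazy = re.findall(r"0x[0-9]+?", sample)

# Adding a-f (case-insensitive) covers the whole hex token.
full_hex = re.findall(r"0x[0-9a-f]+", sample, flags=re.I)

print(greedy)    # ['0x9550', '0x1']
print(lazy)      # ['0x9', '0x1']
print(full_hex)  # ['0x9550', '0x1f4a']
```

Note that the greedy pattern only stops early on the token that contains a letter, which is the kind of input that could produce the behavior described in the question.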

A slight variation of your regexp seems to work fine for your example data:

>>> import re
>>> my_string = '0x5842 YDB2: 0x4035, 0x249, 0x4, 0x9550, 0x613, 0x61764585, 0x784, 0x472, 0x7550, 0x48844926, 0x69760606, 0x9179, 0x753 rev: 0x259, 0x63, 0x4808 0x43, 0x92039244, 0x612, 0x991, 0x26, 0x64 0x7, 0x92, 0x40906120, 0x11812410,'
>>> re.sub('0x[0-9a-f]+[, ]*', '', my_string, flags=re.I)
'YDB2: rev: '
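If you want exactly 'YDB2: rev:' with no trailing space, one possible variation is sketched below. The names HEX_TOKEN and strip_hex are hypothetical, and the word boundaries (\b) are an assumption on my part: they keep the pattern from matching 0x sequences embedded in a longer word.

```python
import re

# Hypothetical pattern name. \b requires the token to start and end
# at a word boundary; [, ]* also removes trailing commas and spaces.
HEX_TOKEN = re.compile(r"\b0x[0-9a-f]+\b[, ]*", flags=re.IGNORECASE)


def strip_hex(text: str) -> str:
    """Remove standalone hex literals, then trim surrounding whitespace."""
    return HEX_TOKEN.sub("", text).strip()


# Shortened version of the string from the question.
my_string = '0x5842 YDB2: 0x4035, 0x249 rev: 0x259, 0x63,'
print(strip_hex(my_string))  # YDB2: rev:
```

Compiling the pattern once is only a convenience if you call the function on many strings; calling re.sub directly works just as well.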