Python split string retaining the bracket

Time:12-16

I would like to split a string into tokens and discard the whitespace. For example, given

double a[3] = {0.0, 0.0, 0.0};

The expected output is

['double', 'a', '[', '3', ']', '=', '{', '0.0', ',', '0.0', ',', '0.0', '}', ';']

How can I do that with the re module in Python?

CodePudding user response:

You can make use of the fact that re.split() keeps the matched delimiters in its output when the pattern is wrapped in a capture group:

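import re

input_string = "double a[3] = {0.0, 0.0, 0.0};"
# The capture group makes re.split() keep each matched token in the result.
pieces = re.split(r'((?:\d+\.\d+)|[,}=;]|\w+)', input_string)
# Strip whitespace from every piece and drop the ones that end up empty.
bits = [piece.strip() for piece in pieces if piece.strip()]
expected = ['double', 'a', '[', '3', ']', '=', '{', '0.0', ',', '0.0', ',', '0.0', '}', ';']
assert bits == expected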
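
To see the capture-group behavior in isolation, here is a minimal sketch on a made-up input (the string "a=b" and the variable names are only for illustration):

```python
import re

text = "a=b"

# Without a capture group, the delimiter is dropped from the result.
without_group = re.split(r'=', text)

# With a capture group, the matched delimiter is kept as its own item.
with_group = re.split(r'(=)', text)

print(without_group)  # ['a', 'b']
print(with_group)     # ['a', '=', 'b']
```

Note that in the answer above the brackets are not part of the pattern. They survive only because they are the text left over between matches, so this approach depends on the shape of the input.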

CodePudding user response:

One approach here might be to use re.findall:

import re

inp = "double a[3] = {0.0, 0.0, 0.0};"
parts = re.findall(r'\d+(?:\.\d+)?|\w+|[^\s\w]', inp)
print(parts)

# ['double', 'a', '[', '3', ']', '=', '{', '0.0', ',', '0.0', ',', '0.0', '}', ';']

The regex pattern used here says to match:

  • \d+(?:\.\d+)? an integer or float
  • | OR
  • \w+ a word (such as "double")
  • | OR
  • [^\s\w] a single character that is neither a word character nor whitespace (such as {)
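
As a rough sketch, the same findall approach can be wrapped in a small helper. The names TOKEN_RE and tokenize are hypothetical, chosen here for illustration. The second call shows why the number alternative has to come before \w+: alternatives are tried left to right, and \w+ would otherwise consume the digits before the decimal point.

```python
import re

# Hypothetical names; the pattern is number | word | single punctuation character.
TOKEN_RE = re.compile(r'\d+(?:\.\d+)?|\w+|[^\s\w]')

def tokenize(source):
    """Return numbers, words, and single punctuation characters, skipping whitespace."""
    return TOKEN_RE.findall(source)

tokens = tokenize("double a[3] = {0.0, 0.0, 0.0};")
print(tokens)

# With \w+ listed first, "0.0" is split into three tokens instead of one.
reordered = re.findall(r'\w+|\d+(?:\.\d+)?|[^\s\w]', "0.0")
print(reordered)  # ['0', '.', '0']
```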