Split string and still keep the delimiter

Time:12-09

I have a string of code like this:

replace [IntType]
import TYPE [libc_to_basic_type_entry*]

Now I want to split each line into a list using a Python regex, like this:

["replace", "[", "IntType", "]"]
["import", "TYPE", "[", "libc_to_basic_type_entry", "*", "]"]

What is the best way to do this? Thanks.


At first, I tried a simple string.split("[") and then kept looping over the result to split on the other characters. But I found this approach ineffective, so I would like help doing it with a regex.

CodePudding user response:

You may use this regex:

\s*(\w+|[^\w\s])\s*

RegEx Details:

  • \s*: Match 0 or more whitespaces
  • (: Start capture group
    • \w+: Match 1 or more word characters
    • |: OR
    • [^\w\s]: Match a character that is not a word or whitespace character
  • ): End capture group
  • \s*: Match 0 or more whitespaces

Code:

import re

s = 'import TYPE [libc_to_basic_type_entry*]'
# findall returns only the capture group: each word or single punctuation character
print(re.findall(r'\s*(\w+|[^\w\s])\s*', s))

Output:

['import', 'TYPE', '[', 'libc_to_basic_type_entry', '*', ']']
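
As a rough sketch of how this could be applied to both lines from the question, the pattern can be compiled once and reused. The helper names `tokenize` and `split_keep_delimiters` are hypothetical, introduced only for illustration. The first helper drops the surrounding `\s*` (re.findall skips unmatched whitespace anyway); the second shows the "split but keep the delimiter" variant, relying on the fact that re.split returns the text of capture groups along with the pieces.

```python
import re

# Hypothetical helpers for illustration; not part of the original answer.
TOKEN = re.compile(r'\w+|[^\w\s]')

def tokenize(line):
    """Return runs of word characters and single punctuation characters."""
    return TOKEN.findall(line)

def split_keep_delimiters(line):
    """Split on whitespace or punctuation, keeping punctuation as tokens."""
    # The capture group makes re.split return matched punctuation too;
    # whitespace is matched outside the group, so it is not kept.
    parts = re.split(r'\s+|([^\w\s])', line)
    # Drop None (whitespace matches) and empty strings between adjacent delimiters.
    return [p for p in parts if p]

source = """replace [IntType]
import TYPE [libc_to_basic_type_entry*]"""

for line in source.splitlines():
    print(tokenize(line))
```

Both helpers should give the same tokens for these inputs; the findall version is shorter, while the re.split version is closer to the original idea of splitting and keeping the delimiters.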