Split strings without removing the splitter

I have the following text:

text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"

And I want the output to be:

12345678 abcdefg
37394822 gdzdnhqihdzuiew 
09089799 
78998728 gdjewdwq

I tried re.split(r"\d{8}", text), but the result is incorrect because the 8-digit numbers used as the separator are removed from the output. How can I get the correct output?

CodePudding user response：

You can use a lookahead. It splits on the whitespace that is followed by a digit, without consuming the digit itself.

Regex Tutorial - Lookahead and Lookbehind Zero-Length Assertions

import re
text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
arr = re.split(r"\s+(?=\d)", text)
print(arr)
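
To make the difference concrete, here is a minimal sketch that compares the pattern from the question with a lookahead split. The stricter `\d{8}\b` inside the lookahead is my own assumption (that every record starts with exactly eight digits); the answer above only checks for a single digit.

```python
import re

text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"

# Splitting on the digits consumes them, so only the text between them survives.
consumed = re.split(r"\d{8}", text)

# The lookahead only checks that eight digits follow the whitespace;
# the digits are not part of the match, so they stay in the result.
kept = re.split(r"\s+(?=\d{8}\b)", text)

for line in kept:
    print(line)
```

Here `consumed` contains only the leftover fragments (including an empty first element), while `kept` holds one entry per number with its trailing word, if any.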

CodePudding user response：

IIUC, you are looking to pair each numeric part with the alphanumeric part that follows it, and the numeric part will always come first on each line.

Not an elegant solution, but it addresses the question:

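text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
splitted_txt = text.split(' ')
i = 0
while i < len(splitted_txt):
    # Pair a numeric token with the next token when that one is not numeric.
    # The bounds check avoids an IndexError if the text ends with a number.
    if (splitted_txt[i].isdigit() and i + 1 < len(splitted_txt)
            and not splitted_txt[i + 1].isdigit()):
        print(splitted_txt[i], splitted_txt[i + 1])
        i += 1  # skip the token that was just printed with the number
    else:
        print(splitted_txt[i])
    i += 1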

12345678 abcdefg
37394822 gdzdnhqihdzuiew
09089799
78998728 gdjewdwq
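
If a list is more useful than printed lines, the same idea could be sketched as a small function. The name `group_by_number` is hypothetical, and unlike the loop above it attaches every non-numeric token to the preceding number rather than just one, which is an assumption about the data.

```python
def group_by_number(text):
    """Start a new group at every all-digit token; append other tokens to the current group."""
    groups = []
    for token in text.split():
        if token.isdigit() or not groups:
            # A number (or the very first token) opens a new group.
            groups.append([token])
        else:
            # Anything else belongs to the most recent group.
            groups[-1].append(token)
    return [" ".join(parts) for parts in groups]


text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
for line in group_by_number(text):
    print(line)
```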