I have the following text,
text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
And I want the output to be:
12345678 abcdefg
37394822 gdzdnhqihdzuiew
09089799
78998728 gdjewdwq
I tried re.split(r"\d{8}", text), but the result is incorrect. How can I get the correct output?
CodePudding user response:
You can use a lookahead, which matches the whitespace only when a digit follows it, without consuming that digit. See "Regex Tutorial - Lookahead and Lookbehind Zero-Length Assertions".
import re
text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
arr = re.split(r"\s+(?=\d)", text)
print(arr)
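To see why the lookahead matters, here is a minimal sketch comparing the two splits. It assumes, as in the sample text, that only the numeric tokens start with a digit. The names `consumed` and `lines` are illustrative only.

```python
import re

text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"

# Splitting on the digits themselves uses them as the delimiter,
# so they are removed from the result.
consumed = re.split(r"\d{8}", text)

# Splitting on whitespace that is followed by a digit removes only
# the whitespace; the lookahead does not consume the digit.
lines = re.split(r"\s+(?=\d)", text)

print(consumed)
print("\n".join(lines))
```

The second split yields one list item per desired output line, so joining with a newline gives the requested layout.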
CodePudding user response:
IIUC, you are looking to pair each numeric token with the non-numeric token that follows it, and the numeric token always comes first on each line.
This is not an elegant solution, but it addresses the question:
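text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
splitted_txt = text.split(' ')
i = 0
while i < len(splitted_txt):
    # Pair a numeric token with the next token when that one exists and is not numeric.
    if (splitted_txt[i].isdigit() and i + 1 < len(splitted_txt)
            and not splitted_txt[i + 1].isdigit()):
        print(splitted_txt[i], splitted_txt[i + 1])
        i += 1  # skip the token that was just paired
    else:
        print(splitted_txt[i])
    i += 1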
12345678 abcdefg
37394822 gdzdnhqihdzuiew
09089799
78998728 gdjewdwq
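If the lines are needed as a list rather than printed, the same idea can be sketched as a small function. `pair_tokens` is a hypothetical name, not something from the answers above, and the sketch makes one different choice: every non-numeric token is appended to the current line, so a number followed by several words stays on one line.

```python
def pair_tokens(text):
    """Group whitespace-separated tokens into lines that each start at a numeric token."""
    lines = []
    for token in text.split():
        # An all-digit token starts a new line; so does the very first token.
        if token.isdigit() or not lines:
            lines.append(token)
        else:
            # Any other token is attached to the line currently being built.
            lines[-1] += " " + token
    return lines


text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
result = pair_tokens(text)
print("\n".join(result))
```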