I am trying to extract the number from each of the texts below:
text = 'was the 3001st most popular'
text1 = 'was the 2733rd most popular'
text3 = 'was the 3072nd most popular'
text4 = 'was the 4747th most popular'
I want it to return the numbers 3001, 2733, 3072 and 4747, respectively, one from each line.
I am using the code below:
r = re.search('was the (.*)nd|was the (.*)st|was the (.*)rd|was the (.*)th', text).groups()
reg = next((item for item in r if item is not None), 'Not-Found')
But it produces results such as 4747th mo and 3001st mo instead.
CodePudding user response:
You can simplify your pattern to look for one or more digits (\d+) followed by one of st, rd, nd or th:
import re

texts = [text, text1, text3, text4]
for t in texts:
    # Group 1 captures the digits; group 2 matches the ordinal suffix.
    r = re.search(r"(\d+)(st|rd|nd|th)", t).group(1)
    print(r)
Output:
3001
2733
3072
4747
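As for why the original pattern misbehaves: (.*) is greedy, so in 'was the 3001st most popular' it runs on to the last st it can find, which is the one inside most, and the captured group becomes 3001st mo. Restricting the capture to digits avoids this. If some inputs might contain no ordinal at all, re.search returns None and calling .group(1) on it raises an AttributeError, so a guarded version may be safer. The following is only a sketch under that assumption; the helper name extract_ordinal, the compiled ORDINAL pattern, the trailing \b and the 'Not-Found' default (borrowed from the question) are illustrative choices, not part of the answer above.

```python
import re

# Hypothetical helper, not from the original answer: capture the digits that
# directly precede an ordinal suffix. The trailing \b requires the suffix to
# end the word, so something like "12thing" is not treated as an ordinal.
ORDINAL = re.compile(r"(\d+)(?:st|nd|rd|th)\b")

def extract_ordinal(text, default="Not-Found"):
    """Return the digits of the first ordinal in text, or default if none."""
    match = ORDINAL.search(text)
    return match.group(1) if match else default

samples = [
    'was the 3001st most popular',
    'was the 2733rd most popular',
    'was the 3072nd most popular',
    'was the 4747th most popular',
    'no ordinal here',
]
results = [extract_ordinal(s) for s in samples]
print(results)
# ['3001', '2733', '3072', '4747', 'Not-Found']
```

The values come back as strings; wrap the result in int() if you need numbers.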