Python: find substring between two markers

Please note I have read other answers on here, but they haven't worked for me (or I have applied them incorrectly, sorry if I have).

I have a list which I converted to a DataFrame. I then converted the column to string dtype using:

df['URL'] = pd.Series(df['URL'], dtype="string")

However, when I try to use .find or .partition, I get this error:

df['URL'].find('entry/')

AttributeError: 'Series' object has no attribute 'find'

The string is as follows, and I need to get the unique number between 'entry/' and '/event'. How can I do this?

https://fantasy.premierleague.com/entry/349289/event/14

CodePudding user response：

You have to go through the Series.str accessor to treat the values of the series as strings; only then can you apply string methods (like .find or .partition), for example df['URL'].str.find('entry/').
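As a rough sketch of what that looks like in practice (the one-row DataFrame below is made up from the URL in the question, and the variable names are only illustrative):

```python
import pandas as pd

# Hypothetical sample data, modelled on the URL in the question.
df = pd.DataFrame({"URL": [
    "https://fantasy.premierleague.com/entry/349289/event/14",
]})
df["URL"] = df["URL"].astype("string")

# .str.find gives, per row, the index where 'entry/' starts (-1 if absent).
positions = df["URL"].str.find("entry/")

# .str.partition splits each value into three columns (0, 1, 2):
# text before the separator, the separator itself, and text after it.
after_entry = df["URL"].str.partition("entry/")[2]
entry_id = after_entry.str.partition("/event")[0]

print(entry_id)
```

Chaining two partitions works, but it silently returns an empty or misleading value when a marker is missing, so check the result if your URLs vary.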

But a better approach in this case would be to use extract, which returns the capture groups of a regex as columns. Here the regex is entry/(\d+)/event, so the single group captures the digits between the two markers:

df['URL'].str.extract(r"entry/(\d+)/event", expand=False)
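A minimal, self-contained version of that call, assuming a small made-up DataFrame (the second row and the entry_id column name are hypothetical, added only to show what happens when the pattern does not match):

```python
import pandas as pd

df = pd.DataFrame({"URL": [
    "https://fantasy.premierleague.com/entry/349289/event/14",
    "https://fantasy.premierleague.com/leagues",  # hypothetical row with no match
]})
df["URL"] = df["URL"].astype("string")

# One capture group plus expand=False gives back a Series, not a DataFrame.
# Rows where the pattern does not match come back as missing values.
df["entry_id"] = df["URL"].str.extract(r"entry/(\d+)/event", expand=False)

print(df["entry_id"])
```

The extracted values are still strings; convert them afterwards if you need numbers.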

CodePudding user response：

If you just had a plain string (the URL) then you could isolate the value using a regular expression like this:

import re

url = 'https://fantasy.premierleague.com/entry/349289/event/14'

if (g := re.search(r'(?<=entry/)(\d+?)(?=/event)', url)):
    print(g.group(1))
else:
    print('Not found')

Output: