Please note that I have read other answers on here, but they haven't worked for me (or I have applied them incorrectly; sorry if so).
I have a list which I converted to a DataFrame. I then converted the column to string dtype using:
df['URL'] = pd.Series(df['URL'], dtype="string")
However, when I try to use .find or .partition, I get this error:
df['URL'].find('entry/')
AttributeError: 'Series' object has no attribute 'find'
The string looks like the one below, and I need to get the unique number between 'entry/' and '/event'. How can I do this?
https://fantasy.premierleague.com/entry/349289/event/14
CodePudding user response:
You have to use the Series.str accessor to treat the values of the Series as strings; the string methods (such as .find and .partition) are then available through it, e.g. df['URL'].str.find('entry/').
A better approach in this case is str.extract, which returns the capture groups of a regex as columns. Here the pattern is entry/(\d+)/event, and with expand=False a single capture group comes back as a Series:
df['URL'].str.extract(r"entry/(\d+)/event", expand=False)
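For illustration, here is a minimal end-to-end sketch of the .str accessor and str.extract on a small DataFrame. Only the first URL comes from the question; the other two rows and the entry_id column name are made up for the example.

```python
import pandas as pd

# Hypothetical sample data: only the first URL is from the question.
df = pd.DataFrame({
    "URL": [
        "https://fantasy.premierleague.com/entry/349289/event/14",
        "https://fantasy.premierleague.com/entry/12/event/3",
        "https://fantasy.premierleague.com/help",
    ]
})
df["URL"] = df["URL"].astype("string")

# String methods live under the .str accessor, not on the Series itself.
# str.find returns -1 for rows where the substring is absent.
positions = df["URL"].str.find("entry/")

# One capture group plus expand=False gives back a Series.
# Rows that do not match the pattern become missing values.
df["entry_id"] = df["URL"].str.extract(r"entry/(\d+)/event", expand=False)

print(df["entry_id"].tolist())
```

The extracted values are still strings; if numbers are needed, something like pd.to_numeric on the new column should convert them.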
CodePudding user response:
If you just had a plain string (the URL) then you could isolate the value using a regular expression like this:
import re
url = 'https://fantasy.premierleague.com/entry/349289/event/14'
if (g := re.search(r'(?<=entry/)(\d+?)(?=/event)', url)):
    print(g.group(1))
else:
    print('Not found')
Output:
349289
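Since the question starts from a list, the same idea can be applied before building the DataFrame. The sketch below is one possible way to do it: the entry_id helper and the second URL are hypothetical, and it uses a plain capture group instead of the lookarounds, which should match the same digits here.

```python
import re

# Compile once and reuse; the capture group holds the digits we want.
ENTRY_RE = re.compile(r"entry/(\d+)/event")

def entry_id(url):
    """Return the digits between 'entry/' and '/event', or None if absent."""
    m = ENTRY_RE.search(url)
    return m.group(1) if m else None

# Hypothetical list: only the first URL is from the question.
urls = [
    "https://fantasy.premierleague.com/entry/349289/event/14",
    "https://fantasy.premierleague.com/help",
]
ids = [entry_id(u) for u in urls]
print(ids)
```

Returning None for non-matching URLs keeps the result the same length as the input list, so it can be added as a column later.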