I have the DataFrame:
import pandas as pd

link = [{'name': 'http://website.com/product76tre53932'}, {'name': 'http://website.it/productiee8340'}, {'name': 'http://website.de/productooi7309'}]
df = pd.DataFrame(link)
How can I trim the values to get the result shown in the df['name_2'] column?
CodePudding user response:
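You can use the urllib.parse module to parse those URLs. urlsplit breaks each URL into its components, so you can rebuild a string from just the scheme and the network location.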
>>> from urllib.parse import urlsplit
>>>
>>> def create_url(url):
...     r = urlsplit(url)
...     return f"{r.scheme}://{r.netloc}"
...
>>> link = [{'name': 'http://website.com/product76tre53932'}, {'name': 'http://website.it/productiee8340'}, {'name': 'http://website.de/productooi7309'}]
>>>
>>> import pandas as pd
>>> df = pd.DataFrame(link)
>>> df['new_url'] = df['name'].apply(create_url)
>>> df
                                   name             new_url
0  http://website.com/product76tre53932  http://website.com
1      http://website.it/productiee8340   http://website.it
2      http://website.de/productooi7309   http://website.de
>>>
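
If you would rather avoid calling a Python function on every row, a regex with pandas' own string methods may also work. This is a different technique from the urlsplit approach above, not a replacement for it. The pattern below is an assumption: it captures everything up to the first "/" after the host, and any row that does not match becomes NaN. The column name name_2 is taken from the question.

```python
import pandas as pd

link = [
    {'name': 'http://website.com/product76tre53932'},
    {'name': 'http://website.it/productiee8340'},
    {'name': 'http://website.de/productooi7309'},
]
df = pd.DataFrame(link)

# Assumed pattern: "<scheme>://<host>", stopping at the first "/" after the host.
# expand=False returns a Series because there is a single capture group.
df['name_2'] = df['name'].str.extract(r'^(\w+://[^/]+)', expand=False)

print(df['name_2'].tolist())
# ['http://website.com', 'http://website.it', 'http://website.de']
```

For URLs with unusual shapes (no scheme, query string directly after the host, and so on), urlsplit is likely the safer choice, since it does real URL parsing rather than pattern matching.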