Extract the domain name from the URLs in the list below. You also need to extract the ending string that each URL ends with. For example, for https://www.example.com/market.php the domain name is www.example.com and the ending string is php.
Extract the domains and the ending string
# List of urls
url_list = ['https://blog.hubspot.com/marketing/parts-url',
'https://www.almabetter.com/enrollments',
'https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rename.html',
'https://www.programiz.com/python-programming/list']
CodePudding user response:
Use urlparse(url) from urllib (from urllib.parse import urlparse):
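for url in url_list:
    parsed_url = urlparse(url)
    domain = parsed_url.netloc  # host part, e.g. 'www.example.com'
    # Text after the last '.' in the path, e.g. 'php'; a path with no '.' comes back whole
    ending = parsed_url.path.split('.')[-1]
    print(domain, ending)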
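Several URLs in the question's list have no file extension, so splitting the whole path on '.' returns the full path (for example '/enrollments'). One possible way to handle that is sketched below. It assumes that, when the last path segment has no '.', the segment itself (for example 'enrollments') is the wanted ending; the question does not say this explicitly. The helper name extract_domain_and_ending is made up for illustration.

```python
from urllib.parse import urlparse

# List of urls (copied from the question)
url_list = ['https://blog.hubspot.com/marketing/parts-url',
            'https://www.almabetter.com/enrollments',
            'https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rename.html',
            'https://www.programiz.com/python-programming/list']


def extract_domain_and_ending(url):
    """Return (domain, ending) for one URL. Hypothetical helper, not a library function."""
    parsed = urlparse(url)
    domain = parsed.netloc
    # Look only at the last path segment so dots in earlier segments are ignored.
    last_segment = parsed.path.rstrip('/').split('/')[-1]
    # Assumption: with no '.' in the segment, the whole segment is the ending.
    ending = last_segment.rsplit('.', 1)[-1]
    return domain, ending


results = [extract_domain_and_ending(url) for url in url_list]
for domain, ending in results:
    print(domain, ending)
```

If only a true file extension should count as the ending, the fallback to the whole segment can be swapped for an empty string instead.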