I am scraping a date from a website using Python and BeautifulSoup, and I am stuck on splitting the text. Calling .split('.', "") does not isolate the date from this p tag:
<p >Oct 24, 2017 • 4 min read</p>
I only want the date, without the dot and without "4 min read".
This is what I have so far:
Published_Date = soup.select_one('p[]').get('datetime')
CodePudding user response:
The large bold dot (•) in the text is a bullet character, not the period (.) you are passing to split(), so the split never matches. Replace the bullet with a separator of your choice, split on that separator, and take the first element of the resulting list by index.
Example:
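from bs4 import BeautifulSoup
# The class attribute is added here so that the 'p.text-xs' selector below matches
html = '''
<p class="text-xs">Oct 24, 2017 • 4 min read</p>
'''
soup = BeautifulSoup(html, 'html.parser')
date = soup.select_one('p.text-xs').get_text(strip=True)
# Swap the bullet for '|', split on it, keep the first part, and trim the trailing space
print(date.replace('•', '|').split('|')[0].strip())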
Output:
Oct 24, 2017
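As a variation on the same idea, the replace step can probably be skipped by splitting on the bullet directly. The sketch below assumes the text has already been pulled out of the tag with get_text(); the helper name extract_date is hypothetical, and the '%b %d, %Y' format assumes an English locale for the month abbreviation.

```python
from datetime import datetime

BULLET = '\u2022'  # the '•' character (U+2022), which is not the same as '.'

def extract_date(text, separator=BULLET):
    """Return the part of text before the separator, with whitespace trimmed.

    extract_date is a hypothetical helper name used for illustration only.
    """
    head, _, _ = text.partition(separator)  # head is the whole text if no separator is found
    return head.strip()

raw = 'Oct 24, 2017 \u2022 4 min read'  # text as returned by get_text()
date_text = extract_date(raw)

# Optional: parse into a date object (assumes English month abbreviations)
published = datetime.strptime(date_text, '%b %d, %Y').date()

print(date_text)
print(published.isoformat())
```

partition() is used instead of split() so that text without a bullet is returned unchanged rather than needing a separate check.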