I have a piece of HTML that includes a datetime like this:
<time datetime="2023-01-06 05:00:00" data-format="article-display" data-show-date="always" data-show-time="today-only" data-timestamp="1672981200" itemprop="datePublished" full-date="05.01.2023">6th January</time>
I've used the copy JS option from the Chrome inspector and got this back:
#article > div.mar-article > div > div.mar-article__timestamp > time
from bs4 import BeautifulSoup

def extract_time(data):
    """Extract the time from the HTML of the article page."""
    soup = BeautifulSoup(data, 'html.parser')
    # Look for a <time> element whose class is "datetime"
    time_element = soup.find("time", class_="datetime")
    print(time_element)
    return time_element
Why does it return None?
I'm confused as I don't know how to return just the datetime.
CodePudding user response:
The element does not have a class called datetime, which is why find() returns None.
You could instead select it by its datetime attribute
(provided that the corresponding element is also present in the soup):
soup.select_one('time[datetime]').get('datetime')
Example
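from bs4 import BeautifulSoup
# Pass the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup('<time datetime="2023-01-06 05:00:00" data-format="article-display" data-show-date="always" data-show-time="today-only" data-timestamp="1672981200" itemprop="datePublished" full-date="05.01.2023">6th January</time>', 'html.parser')
# Select the <time> tag that has a datetime attribute and read its value
print(soup.select_one('time[datetime]').get('datetime'))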
Output
2023-01-06 05:00:00
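Applied to the function from the question, this could look roughly like the sketch below. It keeps the name extract_time from the question. The None check and the strptime format string are my own additions, and the format assumes the attribute always looks like the sample value.

```python
from datetime import datetime

from bs4 import BeautifulSoup


def extract_time(data):
    """Return the datetime attribute of the first <time> element, or None."""
    soup = BeautifulSoup(data, "html.parser")
    # Match any <time> tag that carries a datetime attribute
    time_element = soup.select_one("time[datetime]")
    if time_element is None:
        # No matching element in this document
        return None
    return time_element.get("datetime")


html = '<time datetime="2023-01-06 05:00:00" itemprop="datePublished">6th January</time>'
raw = extract_time(html)
print(raw)

# Optional: turn the string into a datetime object
# (format string assumed from the sample value above)
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
print(parsed)
```

Returning the attribute string rather than the tag keeps the caller from having to deal with BeautifulSoup objects, and the None check avoids an AttributeError on pages that have no time element.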