While scraping a site for data, I got the error shown below. Some of the dates are in mm d, yyyy format while others are in mm dd,yyyy. I've read the documentation and tried different solutions from Stack Overflow, but nothing seems to work.
import requests
from bs4 import BeautifulSoup
from datetime import datetime

def jobScan(link):
    # `headers` and `soup` are assumed to be defined elsewhere in the script.
    the_job = {}
    jobUrl = link['href']
    the_job['urlLink'] = jobUrl
    jobs = requests.get(jobUrl, headers=headers)
    jobC = jobs.content
    jobSoup = BeautifulSoup(jobC, "lxml")
    table = soup.find_all("a", attrs={"class": "job-details-link"})
    postDate = jobSoup.find_all("span", {"class": "job-date__posted"})[0]
    postDate = postDate.text
    date_posted = datetime.strptime(postDate, '%B %d, %Y')
    the_job['date_posted'] = date_posted
    closeDate = jobSoup.find_all("span", {"class": "job-date__closing"})[0]
    closeDate = closeDate.text
    closing_date = datetime.strptime(closeDate, '%B %d, %Y')
    the_job['closing_date'] = closing_date
    return the_job
However, I get this error:
ValueError: time data '\nJuly 4, 2022\n' does not match format '%B %d, %Y'
When I try the other format, '%B %-d, %Y', I get this:
ValueError: '-' is a bad directive in format '%B %-d, %Y'
What am I doing wrong?
CodePudding user response:
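The text extracted from the span includes a leading and a trailing newline ('\nJuly 4, 2022\n'), and that is why it does not match the format. The format itself is fine: in strptime, %d accepts both '4' and '04'. Remove the newlines before parsing, and do the same for closeDate:
date_posted = datetime.strptime(postDate.replace('\n',''), '%B %d, %Y')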
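If the same cleanup is needed for both the posted and the closing date, it may be easier to put it in one small helper. The sketch below is only an illustration: parse_job_date is a hypothetical name, not part of the original code. It uses str.strip() rather than replace() so that stray spaces are removed as well, and it normalizes the spacing around the comma in case some dates really arrive as 'July 14,2022'. The '%-d' directive is not needed here; it is a platform-specific strftime extension, and strptime rejects it.

```python
import re
from datetime import datetime

def parse_job_date(raw_text, fmt="%B %d, %Y"):
    # Hypothetical helper: clean a scraped date string, then parse it.
    # strip() removes the leading/trailing newlines and spaces.
    cleaned = raw_text.strip()
    # Normalize the comma spacing so 'July 14,2022' becomes 'July 14, 2022'.
    cleaned = re.sub(r"\s*,\s*", ", ", cleaned)
    # %d matches one- or two-digit days when parsing.
    return datetime.strptime(cleaned, fmt)

print(parse_job_date("\nJuly 4, 2022\n"))
print(parse_job_date("July 14,2022"))
```

Inside jobScan this would be called as parse_job_date(postDate) and parse_job_date(closeDate). Note that %B depends on the current locale, so this assumes English month names.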