How do I fix "ValueError: time data '\nJuly 4, 2022\n' does not match format '

Time:07-08

While scraping a site for data I got this error. Some of the dates have a single-digit day (for example, July 4, 2022) while others have a two-digit day. I've read the documentation and tried different solutions from Stack Overflow, but nothing seems to work.

import requests
from bs4 import BeautifulSoup
from datetime import datetime

# headers and soup are assumed to be defined elsewhere in the script

def jobScan(link):
     
    the_job = {}

    jobUrl = link['href']
    the_job['urlLink'] = jobUrl
   
    jobs = requests.get(jobUrl, headers = headers )
    jobC = jobs.content
    jobSoup = BeautifulSoup(jobC, "lxml")

    table = soup.find_all("a", attrs = {"class": "job-details-link"})

    postDate = jobSoup.find_all("span", {"class": "job-date__posted"})[0]
    postDate = postDate.text
    date_posted = datetime.strptime(postDate, '%B %d, %Y')
    the_job['date_posted'] = date_posted

    closeDate = jobSoup.find_all("span", {"class": "job-date__closing"})[0]
    closeDate = closeDate.text
    closing_date = datetime.strptime(closeDate, '%B %d, %Y')
    the_job['closing_date'] = closing_date
    
    return the_job

However, I get this error:

ValueError: time data '\nJuly 4, 2022\n' does not match format '%B %d, %Y'

And when I try the other format, I get this:

ValueError: '-' is a bad directive in format '%B %-d, %Y'

What am I doing wrong?

CodePudding user response:

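The scraped text has a newline on each side of the date, so it does not match the format. Remove the newlines before parsing: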

date_posted = datetime.strptime(postDate.replace('\n',''), '%B %d, %Y')
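A slightly more general variant, as a minimal sketch: str.strip() removes all leading and trailing whitespace (newlines, spaces, tabs), not only '\n'. The single-digit day is not the problem here, since strptime accepts a day without a leading zero for %d. The '%-d' form is a platform-specific strftime extension for output and is not a valid strptime directive, which is why it raised the "bad directive" error. The helper name parse_job_date is hypothetical and not part of the original code.

```python
from datetime import datetime

def parse_job_date(raw_text):
    # Hypothetical helper: strip surrounding whitespace (including
    # newlines) from the scraped text, then parse it.
    cleaned = raw_text.strip()
    # %d matches both "4" and "04" when parsing with strptime.
    return datetime.strptime(cleaned, '%B %d, %Y')

print(parse_job_date('\nJuly 4, 2022\n'))
print(parse_job_date('\nJuly 14, 2022\n'))
```

In the question's function this would be called as parse_job_date(postDate) and parse_job_date(closeDate), assuming the page uses English month names and the default locale.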