Web scraping: scraped value fails to appear in a CSV column

Time:08-18

After running the following code, the third column of the CSV file appears to be empty, even though I can print the value of the third variable, Jackpot.

When I print Jackpot, it looks like this:

 $
 450,000,000

from bs4 import BeautifulSoup
import requests
import csv

source = requests.get('https://www.lotto.net/mega-millions/numbers/2018').text
soup = BeautifulSoup(source, 'lxml')
jackpots = soup.find_all('div', class_="results-vsmall archive-list")
print(jackpots)


#next sibling
#https://stackoverflow.com/questions/36454693/python-selenium-how-to-get-text-from-a-div-after-a-span
csv_file = open('cms_scrape.csv', 'w')
csv_writer = csv.writer(csv_file)
csv_writer.writerow(['Date', 'Weekday', 'Jackpot'])

for jackpot in jackpots:
      Weekday = jackpot.find("div", class_="date").span.text
      print(Weekday)
      Date = jackpot.find('div', class_ = "date").span.next_sibling
      print(Date)
      Jackpot = jackpot.find('div', class_='jackpot').span.text
      print(Jackpot)
      csv_writer.writerow([Date, Weekday, Jackpot])

csv_file.close()

CodePudding user response:

To get the correct values, strip the unnecessary whitespace characters from your data:

import csv
import requests
from bs4 import BeautifulSoup

source = requests.get("https://www.lotto.net/mega-millions/numbers/2018").text
soup = BeautifulSoup(source, "lxml")
jackpots = soup.find_all("div", class_="results-vsmall archive-list")

# next sibling
# https://stackoverflow.com/questions/36454693/python-selenium-how-to-get-text-from-a-div-after-a-span
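# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open("cms_scrape.csv", "w", newline="") as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(["Date", "Weekday", "Jackpot"])

    for jackpot in jackpots:
        # The weekday is inside the <span>; the date is the text node right after it
        Weekday = jackpot.find("div", class_="date").span.text.strip()
        Date = jackpot.find("div", class_="date").span.next_sibling.text.strip()
        # Collapse newlines and repeated spaces in the jackpot into single spaces
        Jackpot = " ".join(jackpot.find("div", class_="jackpot").span.text.split())
        csv_writer.writerow([Date, Weekday, Jackpot])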

This creates cms_scrape.csv. (The original answer included a screenshot of the file opened in LibreOffice.)
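
The effect of the whitespace can be checked without touching the website. The sketch below is only an illustration: raw_jackpot is an assumed value modelled on the printed output in the question, the row values are placeholders, and normalize and to_csv are hypothetical helper names. It suggests a likely reason the column looked empty: the raw text starts with a newline, so the quoted CSV field spans several lines and a spreadsheet may show only its blank first line.

```python
import csv
import io

# Assumed raw value, modelled on the printed Jackpot output in the question.
raw_jackpot = "\n $\n 450,000,000\n"


def normalize(text):
    """Collapse every run of whitespace (including newlines) into one space."""
    return " ".join(text.split())


def to_csv(row):
    """Write a single row to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


# "date" and "weekday" are placeholder values, not scraped data.
raw_csv = to_csv(["date", "weekday", raw_jackpot])
clean_csv = to_csv(["date", "weekday", normalize(raw_jackpot)])

print(repr(raw_csv))    # the quoted field keeps its embedded newlines
print(repr(clean_csv))  # the whole row fits on one line
```

Using " ".join(text.split()) rather than strip() alone also removes the line break between the dollar sign and the amount, which strip() would leave in place.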
