After running the following code, the third column in the CSV file appears to be missing its value, even though I can print the value of the third variable, Jackpot.
When I print Jackpot, it looks like this:
$
450,000,000
from bs4 import BeautifulSoup
import requests
import csv
source = requests.get('https://www.lotto.net/mega-millions/numbers/2018').text
soup = BeautifulSoup(source, 'lxml')
jackpots = soup.find_all('div', class_="results-vsmall archive-list")
print(jackpots)
#next sibling
#https://stackoverflow.com/questions/36454693/python-selenium-how-to-get-text-from-a-div-after-a-span
csv_file = open('cms_scrape.csv', 'w')
csv_writer = csv.writer(csv_file)
csv_writer.writerow(['Date', 'Weekday', 'Jackpot'])
for jackpot in jackpots:
    Weekday = jackpot.find("div", class_="date").span.text
    print(Weekday)
    Date = jackpot.find('div', class_ = "date").span.next_sibling
    print(Date)
    Jackpot = jackpot.find('div', class_='jackpot').span.text
    print(Jackpot)
    csv_writer.writerow([Date, Weekday, Jackpot])
csv_file.close()
CodePudding user response:
To get the correct values, strip the unnecessary whitespace characters from your data:
import csv
import requests
from bs4 import BeautifulSoup
source = requests.get("https://www.lotto.net/mega-millions/numbers/2018").text
soup = BeautifulSoup(source, "lxml")
jackpots = soup.find_all("div", class_="results-vsmall archive-list")
# next sibling
# https://stackoverflow.com/questions/36454693/python-selenium-how-to-get-text-from-a-div-after-a-span
csv_file = open("cms_scrape.csv", "w")
csv_writer = csv.writer(csv_file)
csv_writer.writerow(["Date", "Weekday", "Jackpot"])
for jackpot in jackpots:
    Weekday = jackpot.find("div", class_="date").span.text.strip()
    Date = jackpot.find("div", class_="date").span.next_sibling.text.strip()
    Jackpot = " ".join(jackpot.find("div", class_="jackpot").span.text.split())
    csv_writer.writerow([Date, Weekday, Jackpot])
csv_file.close()
This creates cms_scrape.csv (screenshot from LibreOffice not shown).
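A plausible reason the column looked empty in the first place: the jackpot text seems to contain newlines around the "$" and the amount, so the CSV field spans several lines and a viewer that shows only the first line of a cell can make it look blank. The sketch below illustrates this without any network access. The raw string and the date and weekday values are made-up placeholders modelled on the printed output in the question, not data taken from the site, and `normalize_whitespace` and `to_csv` are hypothetical helper names.

```python
import csv
import io

# Hypothetical raw value modelled on the printed output in the question:
# "$" and the amount separated by newlines and padding whitespace.
raw_jackpot = "\n$\n            450,000,000\n"


def normalize_whitespace(text):
    """Collapse every run of whitespace (including newlines) into one space."""
    return " ".join(text.split())


def to_csv(rows):
    """Write rows to an in-memory CSV file and return the resulting text."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Placeholder date and weekday values, for illustration only.
dirty = to_csv([["some date", "some weekday", raw_jackpot]])
clean = to_csv([["some date", "some weekday", normalize_whitespace(raw_jackpot)]])

# repr() makes the embedded newlines visible.
print(repr(dirty))
print(repr(clean))
```

In the first result the quoted third field contains line breaks, while in the second it is the single-line value `$ 450,000,000`. When writing to a real file, the `csv` module documentation recommends opening it with `newline=""`, for example `open("cms_scrape.csv", "w", newline="")`.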