Why is my CSV file coming out empty after code runs w/no errors?

Time:10-11

I am very new to coding and was trying to get a basic web scraping script to work. The code runs just fine; the problem is that I cannot get the CSV file to have any information in it. Any help would be appreciated.

from bs4 import BeautifulSoup
import requests
import csv

page_to_scrape = requests.get("https://www.scrapethissite.com/pages/")
soup = BeautifulSoup(page_to_scrape.text, "html.parser")
descriptions = soup.find_all("p", attrs={"class": "lead session-desc"})
titles = soup.find_all("h3", attrs={"class": "page-title"})

with open("scrapeinformation.csv", "w", newline="") as f:
    thewriter = csv.writer(f)

    for title, desc in zip(titles, descriptions):
        print(title.text + " - " + desc.text)
        thewriter.writerow([title.text, desc.text])
f.close()

CodePudding user response:

Are you absolutely sure the CSV is empty? When I ran your code, the file looked empty when I viewed it in Excel, but not when I opened it with Notepad or Google Sheets. The output of print(title.text + " - " + desc.text) also shows that the cell entries are surrounded by a lot of whitespace.

So the Excel cells are really just showing the whitespace at the beginning of each entry, because the default format doesn't display more than what fits in the cell. I can see the contents after I:

  1. Select all by pressing Ctrl+A, and then
  2. Toggle the Wrap Text setting by pressing Alt, H, W (try toggling once more if there seems to be no difference the first time)
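
If you would rather check the file without Excel, one option is to read it back in Python and print the repr() of each cell, which makes leading newlines and spaces visible. The following is only a minimal sketch: the padded cell values are made up to imitate what .text returns, and an in-memory buffer stands in for the file (with the real file you would use open("scrapeinformation.csv", newline="") instead).

```python
import csv
import io

# Hypothetical cell values imitating the whitespace-padded strings from .text
padded_rows = [["\n    Example Title\n  ", "\n    Example description.\n  "]]

# In-memory stand-in for scrapeinformation.csv (an assumption for this sketch)
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(padded_rows)

# Read the data back, as you would from the real file
buffer.seek(0)
read_back = list(csv.reader(buffer))

for row in read_back:
    # repr() shows the "\n" and spaces that a spreadsheet cell hides
    print([repr(cell) for cell in row])
```

If the printed cells start with "\n", the data is in the file and only the display is misleading.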

However, the approach I would personally recommend here is to remove the whitespace in the first place. You can do so by using the strip() method (like .text.strip()) or by using .get_text(strip=True) instead of .text.
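
As a rough sketch of that suggestion: the HTML below is a made-up stand-in for the scraped page (only the tag names and class names are taken from the question), and the rows are written to an in-memory buffer rather than to scrapeinformation.csv so the result is easy to inspect.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page content will differ
SAMPLE_HTML = """
<div class="page">
  <h3 class="page-title">
    <a href="/example/">Example Title</a>
  </h3>
  <p class="lead session-desc">
    Example description.
  </p>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
titles = soup.find_all("h3", attrs={"class": "page-title"})
descriptions = soup.find_all("p", attrs={"class": "lead session-desc"})

# Plain .text keeps the surrounding newlines and indentation
raw_title = titles[0].text

# Both forms drop the leading and trailing whitespace
rows = [
    [title.get_text(strip=True), desc.text.strip()]
    for title, desc in zip(titles, descriptions)
]

# In-memory stand-in for the CSV file (an assumption for this sketch)
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
```

In the original script, the equivalent change would be to pass the stripped values to thewriter.writerow() inside the existing loop.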
