For Loop to CSV Leading to Uneven Rows in Python

Time:05-22

Still learning Python, so apologies if this is an extremely obvious mistake. I've been trying to figure it out for hours now though and figured I'd see if anyone can help out.

I've scraped a hockey website for their ice skate name and price and have written it to a CSV. The only problem is that when I write it to CSV the rows for the name column (listed as Gear) and the Price column are not aligned. It goes:

  • Gear Name 1
  • Row Space
  • Price
  • Row Space
  • Gear Name 2

It would be great to align the gear and price rows next to each other. I've attached a link to a picture of the CSV as well if that helps.

import requests
from bs4 import BeautifulSoup as Soup

webpage_response = requests.get('https://www.purehockey.com/c/ice-hockey-skates-senior?')

webpage = (webpage_response.content)
parser = Soup(webpage, 'html.parser')


filename = "gear.csv"
f = open(filename, "w")

headers = "Gear, Price"
f.write(headers)

for gear in parser.find_all("div", {"class": "details"}):
    
    gearname = gear.find_all("div", {"class": "name"}, "a")
    gearnametext = gearname[0].text
    
    gearprice = gear.find_all("div", {"class": "price"}, "a")
    gearpricetext = gearprice[0].text

    print (gearnametext)
    print (gearpricetext)

    f.write(gearnametext + "," + gearpricetext)

What the uneven rows look like: https://i.stack.imgur.com/EG2f2.png

CodePudding user response:

I noticed that gearnametext contains two \n characters. Those are what push the output onto the next line, so use str.replace() to remove them. Try:

import requests
from bs4 import BeautifulSoup as Soup

webpage_response = requests.get('https://www.purehockey.com/c/ice-hockey-skates-senior?')

webpage = (webpage_response.content)
parser = Soup(webpage, 'html.parser')


filename = "gear.csv"
f = open(filename, "w")

headers = "Gear, Price"
f.write(headers)

for gear in parser.find_all("div", {"class": "details"}):
    
    gearname = gear.find_all("div", {"class": "name"}, "a")
    gearnametext = gearname[0].text.replace('\n','')

    gearprice = gear.find_all("div", {"class": "price"}, "a")
    gearpricetext = gearprice[0].text

    print (gearnametext)
    print (gearpricetext)

    f.write(gearnametext + "," + gearpricetext)

The only change is the second line inside the loop, where the gear name is now read as: gearnametext = gearname[0].text.replace('\n','').
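To see why stray line breaks matter, here is a small sketch with made-up strings standing in for the scraped text (the real values on the page may differ). The helper clean is hypothetical and only used for illustration; it collapses every run of whitespace, newlines included, into a single space.

```python
# Hypothetical scraped values; the actual text on the page may differ.
raw_name = "\nBauer Vapor X3.7 Ice Hockey Skates - Senior\n"
raw_price = "\n$249.99\n"


def clean(text):
    """Collapse all whitespace runs (including newlines) into single spaces."""
    return " ".join(text.split())


# Written as-is, the embedded newlines split one record over several lines.
broken_row = raw_name + "," + raw_price

# Cleaned first, with one explicit line break at the end of the record.
fixed_row = clean(raw_name) + "," + clean(raw_price) + "\n"

print(repr(broken_row))
print(repr(fixed_row))
```

Printing with repr() makes the hidden \n characters visible, which is a quick way to check what the scraper actually returned.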

CodePudding user response:

With Python 3 I would recommend opening the file with with open(filename, 'w') as f: and calling strip() on your texts before you write() them to the file.

write() does not add a line break for you, whichever mode the file is opened in, so you have to add one to each line you write.

Example
import requests
from bs4 import BeautifulSoup as Soup

webpage_response = requests.get('https://www.purehockey.com/c/ice-hockey-skates-senior?')

webpage = (webpage_response.content)
parser = Soup(webpage, 'html.parser')


filename = "gear1.csv"
headers = "Gear,Price\n"


with open(filename, 'w') as f:
    f.write(headers)

    for gear in parser.find_all("div", {"class": "details"}):
        gearnametext = gear.find("div", {"class": "name"}).text.strip()
        gearpricetext = gear.find("div", {"class": "price"}).text.strip()
        f.write(gearnametext + "," + gearpricetext + "\n")
Output
Gear,Price
Bauer Vapor X3.7 Ice Hockey Skates - Senior,$249.99
Bauer X-LP Ice Hockey Skates - Senior,$119.99
Bauer Vapor Hyperlite Ice Hockey Skates - Senior,$999.98 - $1149.98
CCM Jetspeed FT475 Ice Hockey Skates - Senior,$249.99
Bauer X-LP Ice Hockey Skates - Intermediate,$109.99

...
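As an alternative to joining strings by hand, the standard library's csv module can handle the line endings and the quoting. The sketch below is not part of the answer above: write_rows and the sample rows are made up for illustration, and an in-memory buffer stands in for the file. Quoting matters once a name or price contains a comma, which plain string concatenation would split into extra columns.

```python
import csv
import io


def write_rows(rows, fileobj):
    """Write a header plus (name, price) pairs as CSV, stripping whitespace."""
    writer = csv.writer(fileobj)
    writer.writerow(["Gear", "Price"])
    for name, price in rows:
        writer.writerow([name.strip(), price.strip()])


# Hypothetical rows; the second one contains commas on purpose.
rows = [
    ("\nBauer X-LP Ice Hockey Skates - Senior\n", " $119.99 "),
    ("Skates, Senior", "$1,149.98"),
]

# In-memory buffer for the demo. For a real file, the csv docs recommend
# opening it with: open("gear.csv", "w", newline="")
buffer = io.StringIO()
write_rows(rows, buffer)
csv_text = buffer.getvalue()

# Reading the text back shows each record as exactly two aligned columns.
parsed = list(csv.reader(io.StringIO(csv_text)))
for row in parsed:
    print(row)
```

In the scraping loop, the same idea would mean calling writer.writerow([gearnametext, gearpricetext]) instead of f.write(...).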
