How to web-scrape two items out of a table (with Python and BeautifulSoup)

Time:07-28

I want to scrape this webpage (caranddriver.com) and extract the trim lines and their prices from a table.

As I have just started coding, I would highly appreciate your input! Thanks in advance! :)

Desired output:

SE, $42,000 
SEL, $48,000 
Limited, $53,000 

Code as of now:

from bs4 import BeautifulSoup
import requests

#Inputs/URLs to scrape: 
URL = ('https://www.caranddriver.com/hyundai/ioniq-5')
(response := requests.get(URL)).raise_for_status()
soup = BeautifulSoup(response.text, 'lxml')
overview = soup.find()

for a in soup.find('g', class_='trims').find_all('foreignObject'):   
    trim = a.find('span', class_='css-afhlgr e1pdf2xh2').text
    msrp_trim = a.find('span', class_='css-4f1oub e1pdf2xh1').text
    print(trim, msrp_trim)

CodePudding user response:

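You can use soup.select to collect every foreignObject tag. The matches alternate between trim name and price, so pairing the even-indexed elements with the odd-indexed ones gives one row per trim.

# soup is the BeautifulSoup object built in the question's code
type_price = soup.select('foreignObject')
# Even indices hold trim names, odd indices hold prices
for t, p in zip(type_price[::2], type_price[1::2]):
    print(t.text, p.text, sep=', ')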

This gives the expected output, with "(est)" appended to each price:

SE, $42,000 (est)
SEL, $48,000 (est)
Limited, $53,000 (est)
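
The slicing approach relies on trim names and prices alternating in document order. Below is a minimal sketch of that pairing on plain strings, which also drops the "(est)" suffix so the result matches the desired output in the question. The `texts` list and the `pair_trims` helper are hypothetical stand-ins for illustration; on the real page the list would come from the `.text` of each matched element.

```python
# Hypothetical stand-in for [el.text for el in soup.select('foreignObject')]
texts = ["SE", "$42,000 (est)", "SEL", "$48,000 (est)", "Limited", "$53,000 (est)"]


def pair_trims(texts):
    """Pair even-indexed trim names with odd-indexed prices."""
    pairs = []
    for trim, price in zip(texts[::2], texts[1::2]):
        # Remove the "(est)" marker and any surrounding whitespace
        price = price.replace("(est)", "").strip()
        pairs.append((trim.strip(), price))
    return pairs


for trim, price in pair_trims(texts):
    print(trim, price, sep=", ")
```

If the page ever places an extra foreignObject element in between, the even/odd alignment shifts, so it is worth checking that the number of matches is even before pairing.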

CodePudding user response:

Try:

import requests
from bs4 import BeautifulSoup

url = "https://www.caranddriver.com/hyundai/ioniq-5"

soup = BeautifulSoup(requests.get(url).content, "html.parser")

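# One shared iterator over all spans inside foreignObject tags
data = iter(soup.select("foreignObject span"))
# zip(data, data) consumes two consecutive spans per step: trim, then price
for trim, msrp_trim in zip(data, data):
    print("{:<10} {}".format(trim.text, msrp_trim.text))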

Prints:

SE         $42,000 (est)
SEL        $48,000 (est)
Limited    $53,000 (est)
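
The `zip(data, data)` call works because both arguments are the same iterator, so every step pulls two consecutive items from it. A small sketch of the idiom on plain strings follows; `pairwise_chunks` and the `spans` list are hypothetical names used only for illustration, not part of BeautifulSoup.

```python
def pairwise_chunks(items):
    """Group a flat sequence into consecutive (first, second) tuples."""
    it = iter(items)
    # Both zip arguments advance the same iterator, two items per tuple
    return list(zip(it, it))


# Hypothetical stand-in for the text of each matched span
spans = ["SE", "$42,000 (est)", "SEL", "$48,000 (est)", "Limited", "$53,000 (est)"]

for trim, msrp in pairwise_chunks(spans):
    print("{:<10} {}".format(trim, msrp))
```

Note that a trailing item without a partner is dropped silently, so an odd number of spans would lose the last one without raising an error.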