How to get each element from this website (webscraping with python)

Time:07-26

I want to scrape this webpage (www.autocar.co.uk) and extract every automaker listed in its dropdown menu. Ideally this would be done with a for loop that collects all the entries.

I have only just started coding, so I would highly appreciate your input!

Desired output:

Make
Abarth
AC Cars
AC Schnitzer
Aiways
Allard
...
Zoyte

My code as of now:

from bs4 import BeautifulSoup
import requests


# URL to scrape
URL = 'https://www.autocar.co.uk/car-review/tesla/model-3/specs'
response = requests.get(URL)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'lxml')

# Locate the dropdown that lists the makes
oem_url = soup.find('select', class_='car-finder-make-chooser')

print(oem_url)

Output as of now:

<select  id="edit-make" name="make"><option selected="" value="0">Make</option><option value="1286">Abarth</option><option value="2456">AC Cars</option><option value="2104">AC Schnitzer</option><option value="2112">Aiways</option><option value="1250">Allard</option><option value="90">Alfa Romeo</option><option value="91">Alpina</option><option value="1502">Alpine</option><option value="92">Ariel</option><option value="94">Ascari</option><option value="95">Aston Martin</option><option value="93">Audi</option><option value="97">BAC</option><option value="96">Bentley</option><option value="2745">Bizzarrini</option><option value="98">BMW</option><option value="1758">Borgward</option><option value="99">Bowler</option><option value="100">Bugatti</option><option value="1149">BYD</option><option value="1776">Byton</option><option value="101">Cadillac</option><option value="102">Caparo</option><option value="103">Caterham</option><option value="2805">Caton</option><option value="1719">Changan Auto</option><option value="104">Chevrolet</option><option value="105">Chrysler</option><option value="106">Citroen</option><option value="1916">Cupra</option><option value="1965">De Tomaso</option><option value="108">Dacia</option><option value="1823">Dallara</option><option value="1160">David Brown</option><option value="2800">DeLorean</option><option value="109">Dodge</option><option value="2338">Donkervoort</option><option value="1275">DS</option><option value="2042">Dyson</option><option value="1158">Eagle</option><option value="2808">Electrogenic</option><option value="1066">Elemental</option><option value="110">Eterniti</option><option value="111">Ferrari</option><option value="112">Fiat</option><option value="113">Fisker</option><option value="114">Ford</option><option value="1166">Geely</option><option value="2485">Genesis</option><option value="115">Ginetta</option><option value="2336">Gordon Murray Automotive</option><option value="116">Great Wall</option><option 
value="2549">GTO Engineering</option><option value="117">Gumpert</option><option value="2762">Gunther Werks</option><option value="1309">Hennessey</option><option value="1956">Hispano Suiza</option><option value="118">Honda</option><option value="1152">Hongqi</option><option value="2465">Human Horizons</option><option value="119">Hyundai</option><option value="2194">Ineos</option><option value="120">Infiniti</option><option value="1259">Isuzu</option><option value="1144">ItalDesign</option><option value="121">Jaguar</option><option value="2168">Jannarelly</option><option value="2151">JCB</option><option value="122">Jeep</option><option value="1715">JIA</option><option value="1255">Ken Okuyama</option><option value="123">Kia</option><option value="2644">Kimera</option><option value="2641">Kingsley Cars</option><option value="124">Koenigsegg</option><option value="125">KTM</option><option value="126">Lada</option><option value="127">Lamborghini</option><option value="1302">Lancia</option><option value="128">Land Rover</option><option value="129">Lexus</option><option value="2821">Lightyear</option><option value="1739">Lincoln</option><option value="130">Lotus</option><option value="2458">Lucid</option><option value="1765">Lynk &amp; Co</option><option value="1373">Mahindra</option><option value="131">Marcos</option><option value="132">Maserati</option><option value="133">Maybach</option><option value="134">Mazda</option><option value="135">McLaren</option><option value="923">Mercedes-AMG</option><option value="136">Mercedes-Benz</option><option value="1167">Mercedes-Maybach</option><option value="137">MG Motor</option><option value="139">Mini</option><option value="138">Mia</option><option value="140">Mitsubishi</option><option value="2385">MK Sportscars</option><option value="141">Morgan</option><option value="1840">MS-RT</option><option value="2505">MST</option><option value="142">Murray</option><option value="1487">NextEV</option><option 
value="1934">Nio</option><option value="143">Nissan</option><option value="144">Noble</option><option value="1808">Oldsmobile</option><option value="1231">Opel</option><option value="145">Pagani</option><option value="146">Perodua</option><option value="147">Peugeot</option><option value="1816">Pininfarina</option><option value="2568">Praga</option><option value="2071">Polestar</option><option value="148">Porsche</option><option value="149">Proton</option><option value="150">Qoros</option><option value="2553">Radford</option><option value="151">Radical</option><option value="1919">Ram</option><option value="152">Renault</option><option value="2499">Revology</option><option value="2787">Revolution</option><option value="2482">Rimac</option><option value="2671">Riversimple</option><option value="2761">Rivian</option><option value="2514">Rodin</option><option value="1142">Roewe</option><option value="153">Rolls-Royce</option><option value="154">Saab</option><option value="155">Seat</option><option value="1147">Senova</option><option value="1305">Shelby</option><option value="1709">Sin</option><option value="156">Skoda</option><option value="157">Smart</option><option value="2788">Smit Oletha</option><option value="158">Spyker</option><option value="159">SRT</option><option value="160">Ssangyong</option><option value="161">SSC</option><option value="162">Subaru</option><option value="163">Suzuki</option><option value="164">Tata</option><option value="165">Tesla</option><option value="1239">Tiger</option><option value="166">Toniq</option><option value="2539">Touring Superleggera</option><option value="167">Toyota</option><option value="1798">Triumph</option><option value="168">Tushek</option><option value="1110">TVR</option><option value="2754">Twisted</option><option value="169">Vauxhall</option><option value="891">Vencer</option><option value="170">Veritas</option><option value="2730">Vinfast</option><option value="171">Volkswagen</option><option 
value="172">Volvo</option><option value="2757">Voyah</option><option value="173">Vuhl</option><option value="2623">Wells</option><option value="2791">Wiesmann</option><option value="174">Westfield</option><option value="2142">Xpeng</option><option value="2659">Zeal Motor</option><option value="2662">Zeekr</option><option value="175">Zenos</option><option value="176">Zenvo</option><option value="1226">Zolfe</option><option value="1235">Zoyte</option></select>
[Finished in 1.8s]

CodePudding user response:

The correct code for the current BeautifulSoup version would be:

import requests
from bs4 import BeautifulSoup

url = "http://www.autocar.co.uk/"
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
# Collect the text of every <option> in the make dropdown;
# [1:] drops the first entry, the "Make" placeholder
car_list = [x.text for x in soup.select_one('#edit-make').select('option')][1:]
print(car_list)

Result: ['Abarth', 'AC Cars', 'AC Schnitzer', 'Aiways', 'Allard', 'Alfa Romeo', 'Alpina', 'Alpine', 'Ariel', 'Ascari', 'Aston Martin', 'Audi', 'BAC',...]

Also, if you want to use it, the method is spelled find_all in bs4; findAll is only an older alias kept for backward compatibility. You can find the BeautifulSoup (bs4) documentation at https://www.crummy.com/software/BeautifulSoup/bs4/doc/
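If the numeric option values are needed as well as the names, the same selector can feed a dictionary. The following is only a minimal sketch: SAMPLE_HTML is a short excerpt copied from the output shown in the question, extract_makes is a hypothetical helper name, and treating value "0" as the placeholder is an assumption based on that excerpt. It parses a string rather than the live page, so it runs without a network connection.

```python
from bs4 import BeautifulSoup

# Short excerpt of the dropdown markup, taken from the output in the question.
SAMPLE_HTML = """
<select id="edit-make" name="make">
  <option selected="" value="0">Make</option>
  <option value="1286">Abarth</option>
  <option value="2456">AC Cars</option>
  <option value="1765">Lynk &amp; Co</option>
</select>
"""


def extract_makes(html):
    """Return {option value: make name}, skipping the 'Make' placeholder."""
    soup = BeautifulSoup(html, "html.parser")
    select = soup.select_one("#edit-make")
    if select is None:
        # The dropdown is missing from this page: return nothing instead of failing.
        return {}
    return {
        option["value"]: option.get_text(strip=True)
        for option in select.select("option")
        # Assumption: the placeholder entry is the one with value "0".
        if option.get("value") not in (None, "0")
    }


makes = extract_makes(SAMPLE_HTML)
for value, name in makes.items():
    print(value, name)
```

To use it on the real site, pass r.text from the requests call above instead of SAMPLE_HTML.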

CodePudding user response:

You can select the options with the soup.select method and a CSS selector:

options = soup.select('select.car-finder-make-chooser option')                      
for option in options:
    print(option.text)

This will give you the expected output:

Make
Abarth
AC Cars
AC Schnitzer
Aiways
Allard
Alfa Romeo
Alpina
Alpine
Ariel
Ascari
Aston Martin
Audi
BAC
Bentley
Bizzarrini
BMW
Borgward
Bowler
Bugatti
BYD
Byton
Cadillac
Caparo
Caterham
Caton
Changan Auto
Chevrolet
Chrysler
Citroen
Cupra
De Tomaso
Dacia
Dallara
David Brown
DeLorean
Dodge
Donkervoort
DS
Dyson
Eagle
Electrogenic
Elemental
Eterniti
Ferrari
Fiat
Fisker
Ford
Geely
Genesis
Ginetta
Gordon Murray Automotive
Great Wall
GTO Engineering
Gumpert
Gunther Werks
Hennessey
Hispano Suiza
Honda
Hongqi
Human Horizons
Hyundai
Ineos
Infiniti
Isuzu
ItalDesign
Jaguar
Jannarelly
JCB
Jeep
JIA
Ken Okuyama
Kia
Kimera
Kingsley Cars
Koenigsegg
KTM
Lada
Lamborghini
Lancia
Land Rover
Lexus
Lightyear
Lincoln
Lotus
Lucid
Lynk & Co
Mahindra
Marcos
Maserati
Maybach
Mazda
McLaren
Mercedes-AMG
Mercedes-Benz
Mercedes-Maybach
MG Motor
Mini
Mia
Mitsubishi
MK Sportscars
Morgan
MS-RT
MST
Murray
NextEV
Nio
Nissan
Noble
Oldsmobile
Opel
Pagani
Perodua
Peugeot
Pininfarina
Praga
Polestar
Porsche
Proton
Qoros
Radford
Radical
Ram
Renault
Revology
Revolution
Rimac
Riversimple
Rivian
Rodin
Roewe
Rolls-Royce
Saab
Seat
Senova
Shelby
Sin
Skoda
Smart
Smit Oletha
Spyker
SRT
Ssangyong
SSC
Subaru
Suzuki
Tata
Tesla
Tiger
Toniq
Touring Superleggera
Toyota
Triumph
Tushek
TVR
Twisted
Vauxhall
Vencer
Veritas
Vinfast
Volkswagen
Volvo
Voyah
Vuhl
Wells
Wiesmann
Westfield
Xpeng
Zeal Motor
Zeekr
Zenos
Zenvo
Zolfe
Zoyte

Alternatively, you can chain find and find_all:

options = soup.find('select', class_='car-finder-make-chooser').find_all('option')                      
for option in options:
    print(option.text)
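One caveat with the chained form: find returns None when no matching element exists, so calling find_all on the result then raises an AttributeError. A minimal sketch of a guarded version follows; option_texts is a hypothetical helper name, and the two HTML strings are made-up stand-ins for a page with and without the dropdown.

```python
from bs4 import BeautifulSoup


def option_texts(html, css_class="car-finder-make-chooser"):
    """Return the text of each <option> in the select, or [] if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    select = soup.find("select", class_=css_class)
    if select is None:
        # find() found nothing; avoid calling find_all on None.
        return []
    return [option.text for option in select.find_all("option")]


# Made-up inputs for illustration only.
with_menu = (
    '<select class="car-finder-make-chooser">'
    '<option value="0">Make</option>'
    '<option value="165">Tesla</option>'
    "</select>"
)
without_menu = "<p>No dropdown here</p>"

print(option_texts(with_menu))
print(option_texts(without_menu))
```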