Scraping a website that doesn't have specific tags with classes

Time:11-22

I am scraping a used car website. I've got the make, model, year, and miles, but I don't know how to get the other details because they are all in li tags as well. I've put all my code here:

from bs4 import BeautifulSoup
import requests
import pandas as pd
url = 'https://jammer.ie/used-cars'
response = requests.get(url)
response.status_code
soup = BeautifulSoup(response.content, 'html.parser')
soup
results = soup.find_all('div', {'class': 'span-9 right-col'})
len(results)
results[0].find('h6',{'class':'car-make'}).get_text()
results[0].find('p', {'class':'model'}).get_text()
results[0].find('p', {'class': 'year'}).get_text()
results[0].find('li').get_text().replace('\n', "")

I get the information I want from the above code, but the other li tags contain img and span tags. How can I get the information from each of the li tags? I am new to Python, so I would like the answer to be fairly simple and explained to me, please.

I tried using the img tag, but I don't think I used it right.

CodePudding user response:

To get all features into a dataframe you can do:

import requests
import pandas as pd
from bs4 import BeautifulSoup


url = "https://jammer.ie/used-cars"
soup = BeautifulSoup(requests.get(url).text, "html.parser")

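all_data = []
for car in soup.select(".car"):
    # Join the text nodes of the top block with "|" so they can be split apart.
    # This assumes the block holds exactly four pieces of text.
    info = car.select_one(".top-info").get_text(strip=True, separator="|")
    make, model, year, price = info.split("|")

    features = {}
    for feature in car.select(".car--features li"):
        # The icon file name (without path or extension) serves as the key,
        # e.g. ".../speed.png" -> "speed"; the span text is the value.
        k = feature.img["src"].split("/")[-1].split(".")[0]
        v = feature.span.text
        features[f"feature_{k}"] = v

    all_data.append(
        {"make": make, "model": model, "year": year, "price": price, **features}
    )

# Cars that lack a feature get NaN in that column.
df = pd.DataFrame(all_data)
# to_markdown() requires the optional "tabulate" package.
print(df.to_markdown(index=False))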

Prints:

| make | model | year | price | feature_speed | feature_engine | feature_transmission | feature_owner | feature_door-icon1 | feature_petrol5 | feature_paint | feature_hatchback |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| Ford | Fiesta | 2010 | €5,950 | 113144 miles | 1.4 litres | Manual | 4 previous owners | 5 doors | Diesel | Silver | Hatchback |
| Volkswagen | Polo | 2013 | Price on application | 41000 miles | 1.2 litres | Automatic | nan | 5 doors | Petrol | Blue | Hatchback |
| Volkswagen | Polo | 2015 | Price on application | 27000 miles | 1.2 litres | Automatic | nan | 5 doors | Petrol | Red | Hatchback |
| Audi | A1 | 2014 | Price on application | 45000 miles | 1.4 litres | Automatic | nan | 3 doors | Petrol | White | Hatchback |
| Audi | A3 | 2014 | Price on application | 79000 miles | 1.4 litres | Automatic | nan | 5 doors | Petrol | White | Hatchback |
| Audi | A3 | 2008 | €4,450 | 147890 miles | 1.6 litres | Automatic | 3 previous owners | 3 doors | Petrol | Black | Hatchback |
| SEAT | Alhambra | 2018 | €29,950 | 134000 miles | 2.0 litres | Manual | 2 previous owners | 5 doors | Diesel | White | MPV |
| Volkswagen | Jetta | 2014 | €8,950 | 138569 miles | 1.6 litres | Manual | 3 previous owners | 4 doors | Diesel | Grey | Saloon |
| Volkswagen | Beetle | 2014 | Price on application | 66379 miles | 1.2 litres | Automatic | 1 previous owners | 2 doors | Petrol | Black | Hatchback |
| Volvo | XC60 | 2019 | €44,950 | 38214 miles | 2.0 litres | Automatic | 1 previous owners | 5 doors | Diesel | Black | Estate |
| Toyota | Aqua | 2014 | Price on application | 67405 miles | 1.5 litres | Automatic | 1 previous owners | 5 doors | nan | White | Hatchback |
| Audi | A3 | 2014 | Price on application | 51182 miles | 1.4 litres | Automatic | 1 previous owners | 4 doors | Petrol | Black | Saloon |
| Volkswagen | Golf | 2014 | Price on application | 68066 miles | 1.2 litres | Automatic | 1 previous owners | 5 doors | Petrol | Blue | Hatchback |
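
Since the question asked for the li handling to be explained, here is a small offline sketch of just that step. It runs on a made-up HTML snippet, so everything in `html` is an assumption modeled on the selectors used above, and the real page's markup may differ. The `parse_features` helper and the `NICE_NAMES` mapping are hypothetical names added for illustration. The sketch also skips any li that has no img or no span, which the loop above would crash on, and it drops the `feature_` prefix to keep the result short.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, modeled on the selectors used in the answer above;
# the real page's HTML may differ.
html = """
<div class="car">
  <ul class="car--features">
    <li><img src="/images/icons/speed.png"><span>113144 miles</span></li>
    <li><img src="/images/icons/door-icon1.png"><span>5 doors</span></li>
    <li><span>no icon here</span></li>
  </ul>
</div>
"""

# Assumed mapping from icon file names to friendlier column names.
NICE_NAMES = {"speed": "mileage", "door-icon1": "doors"}


def parse_features(car):
    """Return {name: text} for every li that has both an icon and a span."""
    features = {}
    for li in car.select(".car--features li"):
        # li.img / li.span give the first matching tag inside the li, or None.
        if li.img is None or li.span is None or not li.img.get("src"):
            continue
        # "/images/icons/speed.png" -> "speed.png" -> "speed"
        key = li.img["src"].split("/")[-1].split(".")[0]
        features[NICE_NAMES.get(key, key)] = li.span.get_text(strip=True)
    return features


soup = BeautifulSoup(html, "html.parser")
features = parse_features(soup.select_one(".car"))
print(features)
```

The idea is that the li tags have no classes to tell them apart, so the icon's file name is used as the label instead. A mapping like `NICE_NAMES` is one way to turn icon names such as `door-icon1` or `petrol5` into readable column names before building the dataframe.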