I am scraping a used-car website. I've got the make, model, year, and miles, but I don't know how to get the other details because they are also inside li tags. Here is all my code:
from bs4 import BeautifulSoup
import requests
import pandas as pd
url = 'https://jammer.ie/used-cars'
response = requests.get(url)
response.status_code
soup = BeautifulSoup(response.content, 'html.parser')
soup
results = soup.find_all('div', {'class': 'span-9 right-col'})
len(results)
results[0].find('h6',{'class':'car-make'}).get_text()
results[0].find('p', {'class':'model'}).get_text()
results[0].find('p', {'class': 'year'}).get_text()
results[0].find('li').get_text().replace('\n', "")
The code above gets the information I want, but the other li tags contain img and span tags. How can I get the information from each of the li tags? I am new to Python, so I would like the answer to be fairly simple and explained, please.
I tried using the img tag but don't think I used it right.
CodePudding user response:
To get all features into a dataframe you can do:
import requests
import pandas as pd
from bs4 import BeautifulSoup
url = "https://jammer.ie/used-cars"
soup = BeautifulSoup(requests.get(url).text, "html.parser")
all_data = []
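for car in soup.select(".car"):
    # .top-info yields four text pieces: make, model, year and price.
    info = car.select_one(".top-info").get_text(strip=True, separator="|")
    make, model, year, price = info.split("|")

    # Each <li> pairs an icon (<img>) with a value (<span>); the icon's
    # file name, minus its extension, becomes the column name.
    features = {}
    for feature in car.select(".car--features li"):
        k = feature.img["src"].split("/")[-1].split(".")[0]
        v = feature.span.text
        features[f"feature_{k}"] = v

    all_data.append(
        {"make": make, "model": model, "year": year, "price": price, **features}
    )

df = pd.DataFrame(all_data)
# to_markdown needs the optional "tabulate" package to be installed.
print(df.to_markdown(index=False))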
Prints:
| make | model | year | price | feature_speed | feature_engine | feature_transmission | feature_owner | feature_door-icon1 | feature_petrol5 | feature_paint | feature_hatchback |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Ford | Fiesta | 2010 | €5,950 | 113144 miles | 1.4 litres | Manual | 4 previous owners | 5 doors | Diesel | Silver | Hatchback |
| Volkswagen | Polo | 2013 | Price on application | 41000 miles | 1.2 litres | Automatic | nan | 5 doors | Petrol | Blue | Hatchback |
| Volkswagen | Polo | 2015 | Price on application | 27000 miles | 1.2 litres | Automatic | nan | 5 doors | Petrol | Red | Hatchback |
| Audi | A1 | 2014 | Price on application | 45000 miles | 1.4 litres | Automatic | nan | 3 doors | Petrol | White | Hatchback |
| Audi | A3 | 2014 | Price on application | 79000 miles | 1.4 litres | Automatic | nan | 5 doors | Petrol | White | Hatchback |
| Audi | A3 | 2008 | €4,450 | 147890 miles | 1.6 litres | Automatic | 3 previous owners | 3 doors | Petrol | Black | Hatchback |
| SEAT | Alhambra | 2018 | €29,950 | 134000 miles | 2.0 litres | Manual | 2 previous owners | 5 doors | Diesel | White | MPV |
| Volkswagen | Jetta | 2014 | €8,950 | 138569 miles | 1.6 litres | Manual | 3 previous owners | 4 doors | Diesel | Grey | Saloon |
| Volkswagen | Beetle | 2014 | Price on application | 66379 miles | 1.2 litres | Automatic | 1 previous owners | 2 doors | Petrol | Black | Hatchback |
| Volvo | XC60 | 2019 | €44,950 | 38214 miles | 2.0 litres | Automatic | 1 previous owners | 5 doors | Diesel | Black | Estate |
| Toyota | Aqua | 2014 | Price on application | 67405 miles | 1.5 litres | Automatic | 1 previous owners | 5 doors | nan | White | Hatchback |
| Audi | A3 | 2014 | Price on application | 51182 miles | 1.4 litres | Automatic | 1 previous owners | 4 doors | Petrol | Black | Saloon |
| Volkswagen | Golf | 2014 | Price on application | 68066 miles | 1.2 litres | Automatic | 1 previous owners | 5 doors | Petrol | Blue | Hatchback |
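Since you asked for an explanation: the key step in the loop above is easier to follow on a small offline example. The sketch below is only an illustration. The HTML in SAMPLE_HTML is made up to match the selectors used above (the real page's markup may differ), and extract_features is a hypothetical helper name, not something from the site or from BeautifulSoup. It also skips any li that lacks an img or a span rather than raising an error, which is an extra safety check the code above does not have.

```python
from bs4 import BeautifulSoup

# Made-up markup modelled on the selectors used above; the real page may differ.
SAMPLE_HTML = """
<div class="car">
  <ul class="car--features">
    <li><img src="/images/icons/speed.png"><span>113144 miles</span></li>
    <li><img src="/images/icons/engine.png"><span>1.4 litres</span></li>
    <li><img src="/images/icons/door-icon1.png"><span>5 doors</span></li>
  </ul>
</div>
"""


def extract_features(car):
    """Hypothetical helper: map each icon's file name to its <span> text."""
    features = {}
    for li in car.select(".car--features li"):
        img = li.find("img")
        span = li.find("span")
        # Skip list items that do not follow the icon + value pattern.
        if img is None or span is None or not img.get("src"):
            continue
        # "/images/icons/speed.png" -> "speed.png" -> "speed"
        icon_name = img["src"].split("/")[-1].split(".")[0]
        features[f"feature_{icon_name}"] = span.get_text(strip=True)
    return features


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
car = soup.select_one(".car")
features = extract_features(car)
print(features)
```

The column names come straight from the icon file names, which is why the table has headers such as feature_door-icon1 and feature_petrol5. If you want friendlier names, you could rename the columns afterwards with df.rename(columns={...}).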