I am scraping a website, and the code gives me all the information I want. However, the scraped price includes the "€" symbol. I want the price as an int, without the "€" symbol, so that I can calculate the average car price per year. My attempt raises ValueError: invalid literal for int() with base 10: 'price', and the solutions from other questions on this website don't work for me. Year is also a string, so would it make sense to convert the year to an int as well so that I can do calculations with it?
import requests
import pandas as pd
from bs4 import BeautifulSoup

url = "https://jammer.ie/used-cars?page={}&per-page=12"
all_data = []

for page in range(1, 4):  # <-- increase number of pages here
    soup = BeautifulSoup(requests.get(url.format(page)).text, "html.parser")
    for car in soup.select(".car"):
        info = car.select_one(".top-info").get_text(strip=True, separator="|")
        info = info.split("|")
        if len(info) == 4:
            make, model, year, price = info
        else:
            make, year, price = info
            model = "N/A"
        dealer_name = car.select_one(".dealer-name h6").get_text(
            strip=True, separator=" "
        )
        address = car.select_one(".address").get_text(strip=True)
        features = {}
        for feature in car.select(".car--features li"):
            k = feature.img["src"].split("/")[-1].split(".")[0]
            v = feature.span.text
            features[f"feature_{k}"] = v
        all_data.append(
            {
                "make": make,
                "model": model,
                "year": year,
                "price": price,
                "dealer_name": dealer_name,
                "address": address,
                "url": "https://jammer.ie"
                + car.select_one("a[href*=vehicle]")["href"],
                **features,
            }
        )

df = pd.DataFrame(all_data)
# prints sample data to screen:
print(df.tail().to_markdown(index=False))
# saves all data to CSV
df.to_csv("data.csv", index=False)
I tried converting using:
df = pd.read_csv('data.csv', usecols= ['price','year'])
print(type("price"))
print(int("price"))
But this did not work for me. I also tried converting it to a float, which did not work either.
CodePudding user response:
You can define a custom function for that and apply it to a new or existing column, like so:
import pandas as pd

df = pd.DataFrame(
    {
        "col1": [1, 2, 2, 3, 4],
        "prices": ["1€", "2.2€", "5€", "66€", "999€"],
    }
)

# Use own function to create custom column.
# float() rather than int(), because int("2.2") raises ValueError.
def remove_currency_sign(price: str, sign: str = "€") -> float:
    return float(price.replace(sign, ""))

df["new_col"] = df["prices"].apply(remove_currency_sign)
print(df)
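Scraped prices are often messier than the sample above, so the same idea can be made more forgiving. The following is only a sketch: it assumes prices may contain thousands separators such as "€14,250" and that some listings may show non-numeric text instead of a number. The helper name parse_price and the sample values (including "POA") are hypothetical, not taken from the site.

```python
import math

import pandas as pd


def parse_price(text, sign="€"):
    """Return the price as a float, or NaN if the text is not a number."""
    # Drop the currency sign and thousands separators before converting.
    cleaned = str(text).replace(sign, "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        # Non-numeric text (hypothetical example: "POA") becomes NaN.
        return math.nan


# Hypothetical sample values for illustration only.
df = pd.DataFrame({"price": ["€14,250", "€1,999", "POA", "€7950"]})
df["price_num"] = df["price"].apply(parse_price)
print(df)
```

Note that the function has to be applied to the column values. Calling int("price"), as in the question, tries to convert the literal text "price" rather than the data in the column, which is why it raises the ValueError.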
CodePudding user response:
Once the data is in a pandas DataFrame, you can do:
#... your code
df = pd.DataFrame(all_data)
df["price"] = df["price"].str.replace(r"[€,]", "", regex=True)
df["price"] = pd.to_numeric(df["price"], errors="coerce")
df["year"] = pd.to_numeric(df["year"], errors="coerce")
print(df)
Prints:
make model year price dealer_name address url feature_speed feature_engine feature_transmission feature_door-icon1 feature_petrol5 feature_hatchback feature_owner feature_paint
0 BMW 5 Series 2012 14250.0 JOS Jack O Sullivan Cars Co. Wexford https://jammer.ie/vehicle/150168-bmw-5-series-2012 103000 miles 2.0 litres Automatic 0 doors Diesel Saloon NaN NaN
1 Citroen C3 2003 1999.0 Somerville Motors Co. Meath https://jammer.ie/vehicle/167272-citroen-c3-2003 74000 miles 1.4 litres Automatic 5 doors Petrol Hatchback 8 previous owners Grey
2 Volkswagen Touareg 2017 29958.0 Holden Motor Company Co. Dublin https://jammer.ie/vehicle/167271-volkswagen-touareg-2017 145000 miles 3.0 litres Automatic 5 doors Diesel SUV 3 previous owners Black
3 Audi A6 2010 7950.0 Roskeen Cars & Commercials Co. Cork https://jammer.ie/vehicle/167270-audi-a6-2010 174921 miles 2.0 litres Manual 5 doors Diesel Estate 1 previous owners Grey
4 Volkswagen Passat 2012 6999.0 Somerville Motors Co. Meath https://jammer.ie/vehicle/167269-volkswagen-passat-2012 150000 miles 1.6 litres Manual 4 doors Diesel Saloon 7 previous owners Silver
5 Vauxhall Insignia 2016 12499.0 ARTHUR AUTO SALES Co. Limerick https://jammer.ie/vehicle/167268-vauxhall-insignia-2016 127000 miles 2.0 litres Manual 5 doors Diesel Hatchback 1 previous owners Black
6 Audi A6 2022 NaN Audi Approved Plus Kilkenny Co. Kilkenny https://jammer.ie/vehicle/167267-audi-a6-2022 932 miles 2.0 litres Automatic NaN Diesel Saloon NaN Grey
7 Volkswagen Polo 2018 14999.0 O Neills Car Sales Co. Meath https://jammer.ie/vehicle/167266-volkswagen-polo-2018 73868 miles 1.0 litres Manual 5 doors Petrol Hatchback 1 previous owners White
8 Audi A1 2022 NaN Audi Approved Plus Kilkenny Co. Kilkenny https://jammer.ie/vehicle/167265-audi-a1-2022 932 miles 1.0 litres Manual NaN Petrol Hatchback NaN Grey
9 Volkswagen Golf 2014 24999.0 O Neills Car Sales Co. Meath https://jammer.ie/vehicle/167263-volkswagen-golf-2014 82523 miles 2.0 litres Manual 3 doors Petrol Hatchback 1 previous owners Black
10 Peugeot 5008 2014 13950.0 PRESTIGE AUTOS Co. Dublin https://jammer.ie/vehicle/167260-peugeot-5008-2014 47000 miles 1.6 litres Automatic 5 doors Petrol MPV 1 previous owners White
11 Audi A4 2014 14950.0 PRESTIGE AUTOS Co. Dublin https://jammer.ie/vehicle/167259-audi-a4-2014 120000 miles 2.0 litres Manual 4 doors Diesel Saloon 2 previous owners White
12 Volkswagen up! 2013 NaN PRESTIGE AUTOS Co. Dublin https://jammer.ie/vehicle/167257-volkswagen-up-2013 90101 miles 1.0 litres Manual 4 doors Petrol Hatchback 3 previous owners White
13 Ford Fiesta 2010 5950.0 Car Options Co. Dublin https://jammer.ie/vehicle/111682-ford-fiesta-2010 113144 miles 1.4 litres Manual 5 doors Diesel Hatchback 4 previous owners Silver
14 Vauxhall Insignia 2015 8950.0 PRESTIGE AUTOS Co. Dublin https://jammer.ie/vehicle/167256-vauxhall-insignia-2015 111849 miles 2.0 litres Manual 0 doors Diesel Hatchback 1 previous owners Black
15 Nissan X-Trail 2016 18950.0 PRESTIGE AUTOS Co. Dublin https://jammer.ie/vehicle/167255-nissan-x-trail-2016 83887 miles 1.6 litres Manual 4 doors Diesel MPV 2 previous owners White
16 Hyundai i10 2016 8950.0 PRESTIGE AUTOS Co. Dublin https://jammer.ie/vehicle/167254-hyundai-i10-2016 59031 miles 1.0 litres Manual 4 doors Petrol Hatchback 4 previous owners Silver
17 Peugeot 3008 2017 23950.0 PRESTIGE AUTOS Co. Dublin https://jammer.ie/vehicle/167253-peugeot-3008-2017 74566 miles 1.6 litres Manual 5 doors Diesel Hatchback 1 previous owners Black
18 Kia Sportage 2019 NaN Kirwan's Co. Wexford https://jammer.ie/vehicle/167252-kia-sportage-2019 56008 miles 1.6 litres Manual 5 doors Diesel MPV 2 previous owners Grey
19 Toyota Corolla 2007 2950.0 LPD Commercials Co. Dublin https://jammer.ie/vehicle/167251-toyota-corolla-2007 115000 miles 1.4 litres Manual 5 doors Petrol Hatchback 3 previous owners Blue
20 Peugeot Partner 2012 5950.0 LPD Commercials Co. Dublin https://jammer.ie/vehicle/167250-peugeot-partner-2012 118000 miles 1.6 litres Manual 5 doors Diesel MPV 3 previous owners Blue
21 Audi A6 2006 2950.0 LPD Commercials Co. Dublin https://jammer.ie/vehicle/167249-audi-a6-2006 130000 miles 2.0 litres Manual 6 doors Diesel Saloon 3 previous owners Black
22 Opel Insignia 2011 4950.0 Blue Diamond Cars Co. Cork https://jammer.ie/vehicle/167248-opel-insignia-2011 153000 miles 2.0 litres Manual 4 doors Diesel Saloon 6 previous owners Black
23 Hyundai Bayon 2023 26545.0 Doran Motors Co. Monaghan https://jammer.ie/vehicle/167247-hyundai-bayon-2023 NaN 1.2 litres Manual 5 doors Petrol SUV NaN Red
24 Volkswagen Jetta 2014 9950.0 Ballincollig Motor Company / Trident Co. Cork https://jammer.ie/vehicle/167246-volkswagen-jetta-2014 112000 miles 1.2 litres Manual 4 doors Petrol Saloon 2 previous owners Black
25 Lexus CT 2011 9950.0 Weirs Motors Co. Dublin https://jammer.ie/vehicle/167245-lexus-ct-2011 139210 miles 1.8 litres Automatic 5 doors NaN Hatchback 4 previous owners White
26 BMW 5 Series 2018 31995.0 Auto Vision Motor Company Co. Dublin https://jammer.ie/vehicle/159883-bmw-5-series-2018 78646 miles 2.0 litres Automatic 4 doors Hybrid Saloon 1 previous owners Grey
27 Kia N/A 2012 6950.0 Ballincollig Motor Company / Trident Co. Cork https://jammer.ie/vehicle/167244-kia-2012 115000 miles 1.6 litres Manual 3 doors Diesel Hatchback 1 previous owners Grey
28 Volkswagen Scirocco 2015 18950.0 Weirs Motors Co. Dublin https://jammer.ie/vehicle/167242-volkswagen-scirocco-2015 34928 miles 2.0 litres Manual 3 doors Diesel Coupe NaN Black
29 Volkswagen Fox 2010 4800.0 Pat & Jason Ryan Co. Waterford https://jammer.ie/vehicle/167241-volkswagen-fox-2010 150095 miles 1.2 litres Manual 3 doors Petrol Hatchback 3 previous owners Grey
30 Peugeot 3008 2019 30950.0 Gowan Motors Co. Dublin https://jammer.ie/vehicle/167240-peugeot-3008-2019 20989 miles 1.5 litres Automatic 5 doors Diesel SUV 1 previous owners Grey
31 Volkswagen Beetle 2017 20950.0 Grange Road Motors Co. Dublin https://jammer.ie/vehicle/167239-volkswagen-beetle-2017 39023 miles 1.2 litres Automatic 3 doors Petrol Hatchback 1 previous owners Red
32 Mercedes-Benz E-Class 2012 13950.0 Car Options Co. Dublin https://jammer.ie/vehicle/167238-mercedes-benz-e-class-2012 119487 miles 2.1 litres Automatic 4 doors Diesel Saloon 2 previous owners Silver
33 Kia Optima 2016 16500.0 BG Motors Ltd Co. Kerry https://jammer.ie/vehicle/167237-kia-optima-2016 86994 miles 1.7 litres Manual 5 doors Diesel Estate 2 previous owners Black
34 Volkswagen Beetle 2007 4950.0 Moloney Motors Co. Dublin https://jammer.ie/vehicle/167236-volkswagen-beetle-2007 103150 miles 1.4 litres Manual 2 doors Petrol Convertible 6 previous owners Blue
35 BMW 5 Series 2019 NaN Cieran McConnon Car Sales Co. Monaghan https://jammer.ie/vehicle/167235-bmw-5-series-2019 34500 miles 2.0 litres Automatic 4 doors Diesel Saloon 1 previous owners Blue
36 Opel Corsa 2011 6999.0 Michael Tynan Cars Co. Dublin https://jammer.ie/vehicle/167234-opel-corsa-2011 59032 miles 1.2 litres Manual 5 doors Petrol Hatchback 3 previous owners Black
37 Ford Focus 2018 21999.0 Michael Tynan Cars Co. Dublin https://jammer.ie/vehicle/167233-ford-focus-2018 31691 miles 1.5 litres Manual 4 doors Diesel Hatchback 2 previous owners Blue
38 Volkswagen Polo 2021 22222.0 Michael Tynan Cars Co. Dublin https://jammer.ie/vehicle/167232-volkswagen-polo-2021 10573 miles 1.0 litres Manual 5 doors Petrol Hatchback 1 previous owners Red
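With both columns numeric, the average price per year that the question asks about could be computed with groupby. This is a minimal sketch that assumes the "year" and "price" column names from the scraper. The sample rows, including the non-numeric "POA" price, are made up for illustration. Converting the year to the nullable "Int64" dtype is one option if whole-number years are preferred over floats.

```python
import pandas as pd

# Hypothetical rows using the same column names as the scraped data.
df = pd.DataFrame(
    {
        "year": ["2012", "2012", "2014", "2014", "2022"],
        "price": ["€14,250", "€6,999", "€24,999", "€13,950", "POA"],
    }
)

# Strip the currency sign and commas; unparsable prices become NaN.
df["price"] = pd.to_numeric(
    df["price"].str.replace(r"[€,]", "", regex=True), errors="coerce"
)
# Nullable integer dtype keeps years as whole numbers even if some are missing.
df["year"] = pd.to_numeric(df["year"], errors="coerce").astype("Int64")

# mean() skips NaN prices by default, so missing prices do not distort the average.
avg_price_per_year = df.groupby("year")["price"].mean()
print(avg_price_per_year)
```

A year whose listings all lack a numeric price (2022 in this made-up sample) ends up with NaN as its average; calling dropna() on the result would remove such years.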