I'm scraping a website that lists yachts, and I want to get specific data points for each yacht. I'm using Beautiful Soup and Selenium.
When I use:
for i in soup.find_all('li', {'class': 'col-xs-12 col-sm-6 col-xl-4 padding10'}):
    print(i)
The following is returned (one for each yacht listing, so what I'm showing here is just one of many):
<li class="col-xs-12 col-sm-6 col-xl-4 padding10" data-bareboat_icon="flaticon-pilot" data-bareboat_text="Captained" data-boat_builder="Cruiser Yacht" data-boat_headline="Bachelorette! Family celebration! Fun and Adventure Awaits ! 40' Cruisers Yacht." data-boat_id="7907" data-boat_model="Cruiser Yacht" data-boat_year="2007" data-guests="12" data-latitude="25.790300369263" data-length="40" data-location="Miami" data-longitude="-80.184196472168" data-picurl="https://d18mr9iuob0gar.cloudfront.net/media/boats/2020/02/thumb-rental-Motor-boat-Cruiser_Yacht-40feet-bMiami-bFL_XWTaTez.jpg" data-price="800" data-url="/boats/Miami_FL/Motor/rental_boat_7907/" id="boatThumb27907">
From the data above, I want to scrape the 'data-price', 'data-boat_year', 'data-guests', and a couple of other items. I'm having trouble scraping these particular items. For example, to get the price ('data-price'), I try the following:
for i in soup.find_all('li', {'class': 'col-xs-12 col-sm-6 col-xl-4 padding10'}):
    price = i.find('data-price')
    print(price)
But that returns "None". What am I doing wrong? Everything I try returns "None".
CodePudding user response:
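find() looks for a child tag by name. However, data-price and the other values you want are attributes of the li tag itself, not tags, so i.find('data-price') finds nothing and returns None.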
You can access them as follows:
for i in soup.find_all("li", {"class": "col-xs-12 col-sm-6 col-xl-4 padding10"}):
    print(i["data-price"])
    print(i["data-boat_year"])
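If some listings might be missing an attribute, i["data-price"] raises a KeyError, whereas i.get("data-price") returns None instead. Here is a minimal, self-contained sketch of that variation. The sample HTML is a shortened, hypothetical stand-in for the real page (the second li and its values are made up to show a missing attribute), and the boats list is just an illustrative way to collect the results:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, shortened from the listing in the question.
# The second <li> is invented and deliberately lacks data-price and data-guests.
html = """
<ul>
  <li class="col-xs-12 col-sm-6 col-xl-4 padding10" data-boat_id="7907"
      data-boat_year="2007" data-guests="12" data-price="800"></li>
  <li class="col-xs-12 col-sm-6 col-xl-4 padding10" data-boat_id="0000"
      data-boat_year="2015"></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

boats = []
# select() takes a CSS selector; this one matches <li> tags carrying both classes.
for li in soup.select("li.col-xs-12.padding10"):
    boats.append({
        # .get() returns None when the attribute is absent instead of raising.
        "price": li.get("data-price"),
        "year": li.get("data-boat_year"),
        "guests": li.get("data-guests"),
    })

print(boats)
```

Note that attribute values come back as strings, so convert them (for example with int()) before doing arithmetic on the price or guest count.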