How to access specific item in <li> while web scraping with BeautifulSoup?

Time:09-29

I'm scraping a website that lists yachts, and I want to get specific data points for each yacht. I'm using Beautiful Soup and Selenium.

When I use:

for i in soup.find_all('li', {'class': 'col-xs-12 col-sm-6 col-xl-4 padding10'}):
    print(i)

The following is returned (one for each yacht listing, so what I'm showing here is just one of many):

<li class="col-xs-12 col-sm-6 col-xl-4 padding10" data-bareboat_icon="flaticon-pilot" data-bareboat_text="Captained" data-boat_builder="Cruiser Yacht" data-boat_headline="Bachelorette! Family celebration! Fun and Adventure Awaits ! 40' Cruisers Yacht." data-boat_id="7907" data-boat_model="Cruiser Yacht" data-boat_year="2007" data-guests="12" data-latitude="25.790300369263" data-length="40" data-location="Miami" data-longitude="-80.184196472168" data-picurl="https://d18mr9iuob0gar.cloudfront.net/media/boats/2020/02/thumb-rental-Motor-boat-Cruiser_Yacht-40feet-bMiami-bFL_XWTaTez.jpg" data-price="800" data-url="/boats/Miami_FL/Motor/rental_boat_7907/" id="boatThumb27907">

From the data above, I want to scrape the 'data-price', 'data-boat_year', 'data-guests', and a couple of other items. I'm having trouble scraping these particular items. For example, to get the price ('data-price'), I try the following:

for i in soup.find_all('li', {'class': 'col-xs-12 col-sm-6 col-xl-4 padding10'}):
    price = i.find('data-price')
    print(price)

But that returns "None". What am I doing wrong? Everything I try returns "None".

CodePudding user response:

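find() looks for a tag by name, but the price and the other values you want are attributes of the <li> tag itself, not child tags. Because there is no tag named data-price inside the <li>, i.find('data-price') returns None.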
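To make the difference concrete, here is a minimal sketch that runs both lookups against a trimmed-down copy of the listing from the question. The shortened HTML string and the variable names are assumptions for illustration, and the built-in "html.parser" is used so no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of one listing from the question (most attributes omitted).
html = '<ul><li class="padding10" data-price="800" data-boat_year="2007"></li></ul>'
soup = BeautifulSoup(html, "html.parser")
li = soup.find("li")

# find() searches for a descendant *tag* named <data-price>; none exists.
tag_lookup = li.find("data-price")

# Subscripting the tag reads the attribute value instead.
attr_lookup = li["data-price"]

print(tag_lookup)   # None
print(attr_lookup)  # 800
```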

You can access them as follows:

for i in soup.find_all("li", {"class": "col-xs-12 col-sm-6 col-xl-4 padding10"}):
    print(i["data-price"])
    print(i["data-boat_year"])
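Subscripting with i["data-price"] raises a KeyError if a listing happens to lack that attribute, and every attribute value comes back as a string. If that matters for your data, a sketch along the following lines may help: it uses Tag.get(), which returns None for a missing attribute, and converts the values to int. The second <li> in the sample markup is a made-up listing without data-guests, and the list of fields is only an example. Note that class_="padding10" matches any <li> carrying that one class, which is a looser match than the full class string used above.

```python
from bs4 import BeautifulSoup

# Sample markup: the first <li> is trimmed from the question; the second is
# a made-up listing with no data-guests attribute, to show the missing case.
html = """
<ul>
  <li class="col-xs-12 col-sm-6 col-xl-4 padding10" data-boat_id="7907"
      data-boat_year="2007" data-guests="12" data-price="800"></li>
  <li class="col-xs-12 col-sm-6 col-xl-4 padding10" data-boat_id="1234"
      data-boat_year="2015" data-price="950"></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# Attributes to collect; all of them hold whole numbers in this sample.
fields = ["data-boat_id", "data-boat_year", "data-guests", "data-price"]

boats = []
for li in soup.find_all("li", class_="padding10"):
    row = {}
    for name in fields:
        value = li.get(name)  # None when the attribute is missing
        row[name] = int(value) if value is not None else None
    boats.append(row)

for row in boats:
    print(row)
```

If you want every attribute of a listing at once, i.attrs gives you the whole set as a dictionary.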