Python - BeautifulSoup - looping over div tags with the same class name

I'm wondering how to scrape information from a website where multiple elements share the same identifiers; I want to scrape price data from them. The issue I'm having is that when I loop through each div and print() the result, each price appears multiple times in the console. I assume this is because the div I'm locating encapsulates multiple elements with the same tag and class name.

HTML of page

GraphicPrice = soup.findAll('div', class_='col')

for price in GraphicPrice:
    prices = price.find('span', class_='price__amount')
    if prices is None:
        pass
    else:
        print(prices.text)

Output:

£859.99
£859.99
£1,049.99
£1,049.99
£829.99
£829.99
£899.99
£899.99
£999.95
£999.95
£999.95
£999.95

What I want is to eliminate the duplicate information and understand how to refactor my code to prevent this from happening.

Any help would be appreciated. (I'm still learning) :)

CodePudding user response:

First, you scrape all div tags with the "col" class:

soup.findAll('div', class_='col')

Each of these div tags contains one outer span, and that span has two other span tags nested inside it.

So, if you write code like this:

price.find('span', class_='price__amount')

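find() returns only the first matching "price__amount" span per div, so each printed line comes from a separate "col" div. A price repeats when more than one "col" div contains it (for example, when the divs are nested). "col" is therefore the wrong class to search on.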

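If you want the inner span tag, select it directly through its parent span. Note that findAll() returns a list, so .find() cannot be chained onto it; a CSS selector does the same job in one call:

soup.select('span.price--sale--colored span.price__amount')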
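
A minimal sketch of why the duplicates can appear and how selecting the price span directly avoids them. The HTML here is invented for illustration, since the real page markup is not shown; it assumes each product's "col" div is nested inside another "col" div, which is one layout that would double the output. The variable names duplicated and unique are also just for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an outer "col" div wraps an inner "col" div per product.
html = """
<div class="col"><div class="col">
  <span class="price--sale--colored"><span class="price__amount">£859.99</span></span>
</div></div>
<div class="col"><div class="col">
  <span class="price--sale--colored"><span class="price__amount">£1,049.99</span></span>
</div></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Looping over every "col" div reaches each price twice:
# once through the outer div and once through the inner one.
duplicated = []
for div in soup.find_all("div", class_="col"):
    span = div.find("span", class_="price__amount")
    if span is not None:
        duplicated.append(span.text)

# Selecting the price spans themselves visits each element exactly once.
unique = [
    span.text
    for span in soup.select("span.price--sale--colored span.price__amount")
]

print(duplicated)
print(unique)
```

If the real page nests its containers differently, the selector would need adjusting, but the idea stays the same: match the element that holds the data rather than a container that may match more than once.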

CodePudding user response:

Use .select('span') on each selected element to get the span tags inside it, like this:

GraphicPrice = soup.findAll('div', class_='col')

for price in GraphicPrice:
    prices = price.select('span')
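
On its own, select('span') returns every span inside each div, so the duplicates would still show up if the divs overlap. A possible refinement, sketched below under the same assumption of nested "col" divs (the HTML is made up for the example), is to narrow the selector to the price span and skip any element that has already been handled. Elements are tracked by id() rather than by text, so two different products with the same price are both kept.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two different products that happen to share a price.
html = """
<div class="col"><div class="col">
  <span class="price__amount">£999.95</span>
</div></div>
<div class="col"><div class="col">
  <span class="price__amount">£999.95</span>
</div></div>
"""

soup = BeautifulSoup(html, "html.parser")

seen = set()   # identities of the span elements already processed
prices = []

for div in soup.find_all("div", class_="col"):
    for span in div.select("span.price__amount"):
        # The same element is reached from both the outer and inner div;
        # skip it the second time.
        if id(span) in seen:
            continue
        seen.add(id(span))
        prices.append(span.get_text(strip=True))

print(prices)
```

Comparing the tags themselves (or their text) would merge the two products into one, because they look identical; comparing identities only removes repeats of the very same element.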