How to extract attribute inside class (?) with beautifulsoup?

Time:04-28

I've been trying all day to get the value inside 'data-tippy-content' from:

<span class="opacity-70">
  <span  data-tippy-content="Niacinamide"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/niacinamide-def098d8bdadc1b0e2f848d8e77e6ca5d744099553ff816b09d4f3521abe0bc0.png"></span>
  <span  data-tippy-content="Good for Oily Skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_oil-3587634ebcb8ef7c01c9959a4cbcbc2d85f2990a9216d24ec1081f1a931c4264.png"></span>
  <span  data-tippy-content="Helps reduce Skin Redness"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_redness-8684f0660c6620202a754fd9bd20a0378f89d43fe2cb0fdc19a4b26af1f0b9b6.png"></span><span  data-tippy-content="Helps fight Acne"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_acne-82d0b5bc5978478dd67def5e721767e35db9523db64e7756343c95bfb99f842f.png"></span>
  <span  data-tippy-content="Helps brighten skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_brightening-ea550c7d84d6517a76f19bab645a747913a49fe0a565f177cfa1b73f20a89d21.png"></span>
</span>

I've tried a lot of things, but nothing has worked so far.

I have the following code (the problem is in the 'effect' line):

items = soup.findAll('div', 'relative flex flex-col min-w-0 rounded-2xl break-words border-2 border-blue-gray-100 h-full lg:h-72')
ef_list = []
for i in items:
    name = i.find('h4', 'mb-1 leading-tight text-blue-gray-700 group-hover:underline font-header text-xl font-bold hover:underline whitespace-normal overflow-hidden').text.strip()
    function  = i.find('p', 'mb-0 text-sm').text.strip()
    effect = i.find('span', 'opacity-70').find('span')['data-tippy-content']
    print(effect)
    data.append([name, function,effect])

I expect the output to be something like:

"Niacinamide, Good for Oily Skin, Helps reduce Skin Redness, Helps brighten skin"

but instead:

Traceback (most recent call last):
  File "d:\Scrape\scrape.py", line 36, in <module>
    print(effect['data-tippy-content'])
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\bs4\element.py", line 1519, in __getitem__
    return self.attrs[key]
KeyError: 'data-tippy-content'

I've no idea how to get those "data-tippy-content" (I'm new at this)


CodePudding user response:

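The value you want is stored in the data-tippy-content attribute of each inner span, so you can call get('data-tippy-content') on every matching span to retrieve it. Your code only looks at the first inner span, because find() returns a single element. Also note that get() returns None when the attribute is missing, whereas indexing with ['data-tippy-content'] raises a KeyError.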

html = '''
<span class="opacity-70">
  <span  data-tippy-content="Niacinamide"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/niacinamide-def098d8bdadc1b0e2f848d8e77e6ca5d744099553ff816b09d4f3521abe0bc0.png"></span>
  <span  data-tippy-content="Good for Oily Skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_oil-3587634ebcb8ef7c01c9959a4cbcbc2d85f2990a9216d24ec1081f1a931c4264.png"></span>
  <span  data-tippy-content="Helps reduce Skin Redness"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_redness-8684f0660c6620202a754fd9bd20a0378f89d43fe2cb0fdc19a4b26af1f0b9b6.png"></span><span  data-tippy-content="Helps fight Acne"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_acne-82d0b5bc5978478dd67def5e721767e35db9523db64e7756343c95bfb99f842f.png"></span>
  <span  data-tippy-content="Helps brighten skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_brightening-ea550c7d84d6517a76f19bab645a747913a49fe0a565f177cfa1b73f20a89d21.png"></span>
</span>

'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(html,'html.parser')

txt = [x.get('data-tippy-content') for x in soup.select('.opacity-70 span')]
print(txt)

Output:

['Niacinamide', 'Good for Oily Skin', 'Helps reduce Skin Redness', 'Helps fight Acne', 'Helps brighten skin']
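
If you want one comma-separated string per product, as in the expected output from the question, the same idea can be applied inside the loop. The following is only a rough sketch: the markup is a simplified, hypothetical stand-in for the real page (the div.card class and the bare h4 are assumptions, not the site's actual Tailwind classes), and the attribute selector span[data-tippy-content] is used so that spans without the attribute are skipped.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; the real page uses long Tailwind class lists.
html = '''
<div class="card">
  <h4>Product A</h4>
  <span class="opacity-70">
    <span data-tippy-content="Niacinamide"><img src="a.png"></span>
    <span data-tippy-content="Good for Oily Skin"><img src="b.png"></span>
    <span><img src="c.png"></span>
  </span>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

data = []
for item in soup.select('div.card'):
    name = item.find('h4').text.strip()
    # Keep only inner spans that actually carry the attribute, so indexing
    # with ['data-tippy-content'] cannot raise KeyError here.
    spans = item.select('.opacity-70 span[data-tippy-content]')
    effect = ', '.join(s['data-tippy-content'] for s in spans)
    data.append([name, effect])

print(data)
# [['Product A', 'Niacinamide, Good for Oily Skin']]
```

In your own script you would swap div.card and the bare h4 for the selectors you already use; the part that matters is collecting all matching spans and joining their attribute values.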