I've been trying all day to get the value inside 'data-tippy-content' from:
<span class="opacity-70">
<span data-tippy-content="Niacinamide"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/niacinamide-def098d8bdadc1b0e2f848d8e77e6ca5d744099553ff816b09d4f3521abe0bc0.png"></span>
<span data-tippy-content="Good for Oily Skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_oil-3587634ebcb8ef7c01c9959a4cbcbc2d85f2990a9216d24ec1081f1a931c4264.png"></span>
<span data-tippy-content="Helps reduce Skin Redness"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_redness-8684f0660c6620202a754fd9bd20a0378f89d43fe2cb0fdc19a4b26af1f0b9b6.png"></span><span data-tippy-content="Helps fight Acne"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_acne-82d0b5bc5978478dd67def5e721767e35db9523db64e7756343c95bfb99f842f.png"></span>
<span data-tippy-content="Helps brighten skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_brightening-ea550c7d84d6517a76f19bab645a747913a49fe0a565f177cfa1b73f20a89d21.png"></span>
</span>
I've tried a lot of things, but nothing has worked so far.
I have the following code (the problem is in the 'effect' line):
items = soup.findAll('div', 'relative flex flex-col min-w-0 rounded-2xl break-words border-2 border-blue-gray-100 h-full lg:h-72')
ef_list = []
for i in items:
    name = i.find('h4', 'mb-1 leading-tight text-blue-gray-700 group-hover:underline font-header text-xl font-bold hover:underline whitespace-normal overflow-hidden').text.strip()
    function = i.find('p', 'mb-0 text-sm').text.strip()
    effect = i.find('span', 'opacity-70').find('span')['data-tippy-content']
    print(effect)
    data.append([name, function, effect])
I expect the output to be something like:
"Niacinamide, Good for Oily Skin, Helps reduce Skin Redness, Helps brighten skin"
but instead:
Traceback (most recent call last):
  File "d:\Scrape\scrape.py", line 36, in <module>
    print(effect['data-tippy-content'])
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\bs4\element.py", line 1519, in __getitem__
    return self.attrs[key]
KeyError: 'data-tippy-content'
I have no idea how to get those "data-tippy-content" values (I'm new to this).
CodePudding user response:
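The values you want are stored in the data-tippy-content attribute of each nested span.
find('span') returns only the first match, so select every nested span and call get('data-tippy-content') on each one.
Unlike indexing with ['data-tippy-content'], get() returns None instead of raising KeyError when a tag lacks the attribute.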
html = '''
<span class="opacity-70">
<span data-tippy-content="Niacinamide"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/niacinamide-def098d8bdadc1b0e2f848d8e77e6ca5d744099553ff816b09d4f3521abe0bc0.png"></span>
<span data-tippy-content="Good for Oily Skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_oil-3587634ebcb8ef7c01c9959a4cbcbc2d85f2990a9216d24ec1081f1a931c4264.png"></span>
<span data-tippy-content="Helps reduce Skin Redness"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_redness-8684f0660c6620202a754fd9bd20a0378f89d43fe2cb0fdc19a4b26af1f0b9b6.png"></span><span data-tippy-content="Helps fight Acne"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_acne-82d0b5bc5978478dd67def5e721767e35db9523db64e7756343c95bfb99f842f.png"></span>
<span data-tippy-content="Helps brighten skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_brightening-ea550c7d84d6517a76f19bab645a747913a49fe0a565f177cfa1b73f20a89d21.png"></span>
</span>
'''
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
# Read the attribute from every span nested inside the .opacity-70 element
txt = [x.get('data-tippy-content') for x in soup.select('.opacity-70 span')]
print(txt)
Output:
['Niacinamide', 'Good for Oily Skin', 'Helps reduce Skin Redness', 'Helps fight Acne', 'Helps brighten skin']
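Since the question asks for one comma-separated string per item rather than a list, the same idea could be applied inside the loop roughly as follows. This is only a sketch: the trimmed-down markup, the "card" class, and the helper name extract_effects are made up for illustration, and the real page will need the longer class names from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened markup standing in for one product card.
# The "card" class is an assumption; "opacity-70" comes from the question's code.
html = '''
<div class="card">
  <span class="opacity-70">
    <span data-tippy-content="Niacinamide"><img src="a.png"></span>
    <span data-tippy-content="Good for Oily Skin"><img src="b.png"></span>
    <span><img src="c.png"></span>
  </span>
</div>
'''

def extract_effects(card):
    """Return the tooltip texts found in one card as a comma-separated string."""
    # The attribute selector matches only spans that actually carry
    # data-tippy-content, so indexing below cannot raise KeyError.
    spans = card.select('span.opacity-70 span[data-tippy-content]')
    return ', '.join(s['data-tippy-content'] for s in spans)

soup = BeautifulSoup(html, 'html.parser')
for card in soup.select('div.card'):
    print(extract_effects(card))
```

In the original loop, this would amount to replacing the 'effect' line with a call such as effect = extract_effects(i), assuming each item contains a span with the opacity-70 class.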