Why soup.find_all("a", class_="title") leads to empty ResultSet?

Time:01-22

from bs4 import BeautifulSoup as bs

import requests
import pandas as pd

url= "https://www.flipkart.com/health-personal-care-appliances/~cs-zmpfqx4vyj/pr?sid=zlw&collection-tab-name=Hair Dryers&bu=SHOPSY&hpid=ykaudr-vM9cXBQ-KYvhUJap7_Hsxr70nj65vMAAFKlc=&ctx=eyJjYXJkQ29udGV4dCI6eyJhdHRyaWJ1dGVzIjp7InZhbHVlQ2FsbG91dCI6eyJtdWx0aVZhbHVlZEF0dHJpYnV0ZSI6eyJrZXkiOiJ2YWx1ZUNhbGxvdXQiLCJpbmZlcmVuY2VUeXBlIjoiVkFMVUVfQ0FMTE9VVCIsInZhbHVlcyI6WyJGcm9tIOKCuTQ5OSJdLCJ2YWx1ZVR5cGUiOiJNVUxUSV9WQUxVRUQifX0sImhlcm9QaWQiOnsic2luZ2xlVmFsdWVBdHRyaWJ1dGUiOnsia2V5IjoiaGVyb1BpZCIsImluZmVyZW5jZVR5cGUiOiJQSUQiLCJ2YWx1ZSI6IkhEUkVVR04zNFhaTUJQNk4iLCJ2YWx1ZVR5cGUiOiJTSU5HTEVfVkFMVUVEIn19LCJ0aXRsZSI6eyJtdWx0aVZhbHVlZEF0dHJpYnV0ZSI6eyJrZXkiOiJ0aXRsZSIsImluZmVyZW5jZVR5cGUiOiJUSVRMRSIsInZhbHVlcyI6WyJCZXN0IG9mIEhhaXIgRHJ5ZXJzIl0sInZhbHVlVHlwZSI6Ik1VTFRJX1ZBTFVFRCJ9fX19fQ==&fm=neo/merchandising&iid=M_e8e2c26a-bc30-4b33-9b6c-10da733feccc_3.VXQGYLXW75FQ&ssid=kf3v0o2x740000001674365574944&otracker=hp_omu_Best+of+Electronics_5_3.dealCard.OMU_VXQGYLXW75FQ_3&otracker1=hp_omu_PINNED_neo/merchandising_Best+of+Electronics_NA_dealCard_cc_5_NA_view-all_3&cid=VXQGYLXW75FQ"

link = requests.get(url)

soup = bs(link.content, "html.parser")

soup.find_all("a","title")

When I run this code, soup.find_all("a","title") produces no results; it just shows []. However, if I try soup.find_all("title"), it shows only the first product name. I'm a beginner and can't work out what is happening here. Could anybody help me understand this?

CodePudding user response:

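This happens because the second positional argument of find_all() is matched against attribute values, not attribute names. A bare string in that position is treated as a CSS class, so soup.find_all("a","title") looks for <a class="title"> rather than for <a> tags that have a title attribute. As an alternative, you can simply iterate over the list of <a> tags and print the title of each one that has a title attribute.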

from bs4 import BeautifulSoup as bs
import requests

url= "https://www.flipkart.com/health-personal-care-appliances/~cs-zmpfqx4vyj/pr?sid=zlw&collection-tab-name=Hair Dryers&bu=SHOPSY&hpid=ykaudr-vM9cXBQ-KYvhUJap7_Hsxr70nj65vMAAFKlc=&ctx=eyJjYXJkQ29udGV4dCI6eyJhdHRyaWJ1dGVzIjp7InZhbHVlQ2FsbG91dCI6eyJtdWx0aVZhbHVlZEF0dHJpYnV0ZSI6eyJrZXkiOiJ2YWx1ZUNhbGxvdXQiLCJpbmZlcmVuY2VUeXBlIjoiVkFMVUVfQ0FMTE9VVCIsInZhbHVlcyI6WyJGcm9tIOKCuTQ5OSJdLCJ2YWx1ZVR5cGUiOiJNVUxUSV9WQUxVRUQifX0sImhlcm9QaWQiOnsic2luZ2xlVmFsdWVBdHRyaWJ1dGUiOnsia2V5IjoiaGVyb1BpZCIsImluZmVyZW5jZVR5cGUiOiJQSUQiLCJ2YWx1ZSI6IkhEUkVVR04zNFhaTUJQNk4iLCJ2YWx1ZVR5cGUiOiJTSU5HTEVfVkFMVUVEIn19LCJ0aXRsZSI6eyJtdWx0aVZhbHVlZEF0dHJpYnV0ZSI6eyJrZXkiOiJ0aXRsZSIsImluZmVyZW5jZVR5cGUiOiJUSVRMRSIsInZhbHVlcyI6WyJCZXN0IG9mIEhhaXIgRHJ5ZXJzIl0sInZhbHVlVHlwZSI6Ik1VTFRJX1ZBTFVFRCJ9fX19fQ==&fm=neo/merchandising&iid=M_e8e2c26a-bc30-4b33-9b6c-10da733feccc_3.VXQGYLXW75FQ&ssid=kf3v0o2x740000001674365574944&otracker=hp_omu_Best+of+Electronics_5_3.dealCard.OMU_VXQGYLXW75FQ_3&otracker1=hp_omu_PINNED_neo/merchandising_Best+of+Electronics_NA_dealCard_cc_5_NA_view-all_3&cid=VXQGYLXW75FQ"

link = requests.get(url)
soup = bs(link.content, "html.parser")

for i in soup.find_all("a"):
    if i.get("title"):
        print(i["title"])

OUTPUT

Health & Personal Care Appliances
Personal Care Appliances
PHILIPS HP8100/46 Hair Dryer
NOVA NHP 8100/05 Hair Dryer
HAVELLS HD2222 Hair Dryer
HAVELLS HD1901 Hair Dryer
HAVELLS HD3181 1600 W Cool Shot Hair Dryer Hair Dryer
HAVELLS HD3151 Hair Dryer
PHILIPS HP8100/60 Hair Dryer
Choaba Hair Dryer (CHAOBA 2800) 2000 Watts for Hair Styling with Cool and Hot Air Flow Option (Black) Hair Dryer
Syska HD1600 Hair Dryer
Syska HD1010 Hair Dryer
PHILIPS HP8643/46 Hair Straightener   Hair Dryer
PHILIPS HP8120/00 Hair Dryer
VEGA VHDH-20N Hair Dryer
PHILIPS BHC017/00 Hair Dryer
realme RMH2015 Hair Dryer
PHILIPS BHD356/10 2100W Thermoprotect AirFlower Advanced Care 6 Heat & Speed Settings (Black) Hair Dryer
VEGA U-Style 1600 Foldable Hair Dryer For Men & Women With Cool Shot Button(VHDH-24) Hair Dryer
VEGA VHDP-02 Hair Dryer
HAVELLS hd3276 Hair Dryer
VEGA VHDH-19 Hair Dryer
Revlon RVDR5229IN Hair Dryer
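
To make the difference concrete without hitting the live site, here is a minimal offline sketch. The HTML below is made up for illustration and is not Flipkart's real markup; it only assumes a page with a <title> element, one link carrying class="title", and one link carrying a title attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only (not the real page).
html = """
<html><head><title>Hair Dryers</title></head><body>
<a class="title" href="/x">class link</a>
<a title="PHILIPS HP8100/46 Hair Dryer" href="/p1">attr link</a>
<a href="/plain">plain link</a>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# A string as the second positional argument is matched against the class.
by_class = soup.find_all("a", "title")
# A keyword argument filters on the attribute of that name.
by_attr = soup.find_all("a", title=True)
# A single string searches by tag name, so this finds the <title> element.
page_title = soup.find_all("title")

print([a.get_text() for a in by_class])
print([a["title"] for a in by_attr])
print([t.get_text() for t in page_title])
```

This would also explain the second observation in the question: soup.find_all("title") searches for tags named title, so on a real page it most likely returns the single <title> element from the page head rather than any product link.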

CodePudding user response:

Use it as a keyword argument for the attribute in this way:

soup.find_all('a',title=True)

but this will also extract other <a> tags that have a title attribute, such as the category links.

A more specific and shorter way to select your elements is a CSS selector. This one selects only those <a> elements that have a title attribute and immediately follow a sibling <a>:

for i in soup.select('a + a[title]'):
    print(i.get('title'))

Example

from bs4 import BeautifulSoup as bs
import requests

url= "https://www.flipkart.com/health-personal-care-appliances/~cs-zmpfqx4vyj/pr?sid=zlw&collection-tab-name=Hair Dryers&bu=SHOPSY&hpid=ykaudr-vM9cXBQ-KYvhUJap7_Hsxr70nj65vMAAFKlc=&ctx=eyJjYXJkQ29udGV4dCI6eyJhdHRyaWJ1dGVzIjp7InZhbHVlQ2FsbG91dCI6eyJtdWx0aVZhbHVlZEF0dHJpYnV0ZSI6eyJrZXkiOiJ2YWx1ZUNhbGxvdXQiLCJpbmZlcmVuY2VUeXBlIjoiVkFMVUVfQ0FMTE9VVCIsInZhbHVlcyI6WyJGcm9tIOKCuTQ5OSJdLCJ2YWx1ZVR5cGUiOiJNVUxUSV9WQUxVRUQifX0sImhlcm9QaWQiOnsic2luZ2xlVmFsdWVBdHRyaWJ1dGUiOnsia2V5IjoiaGVyb1BpZCIsImluZmVyZW5jZVR5cGUiOiJQSUQiLCJ2YWx1ZSI6IkhEUkVVR04zNFhaTUJQNk4iLCJ2YWx1ZVR5cGUiOiJTSU5HTEVfVkFMVUVEIn19LCJ0aXRsZSI6eyJtdWx0aVZhbHVlZEF0dHJpYnV0ZSI6eyJrZXkiOiJ0aXRsZSIsImluZmVyZW5jZVR5cGUiOiJUSVRMRSIsInZhbHVlcyI6WyJCZXN0IG9mIEhhaXIgRHJ5ZXJzIl0sInZhbHVlVHlwZSI6Ik1VTFRJX1ZBTFVFRCJ9fX19fQ==&fm=neo/merchandising&iid=M_e8e2c26a-bc30-4b33-9b6c-10da733feccc_3.VXQGYLXW75FQ&ssid=kf3v0o2x740000001674365574944&otracker=hp_omu_Best+of+Electronics_5_3.dealCard.OMU_VXQGYLXW75FQ_3&otracker1=hp_omu_PINNED_neo/merchandising_Best+of+Electronics_NA_dealCard_cc_5_NA_view-all_3&cid=VXQGYLXW75FQ"
soup = bs(requests.get(url).content, "html.parser")

for i in soup.select('a + a[title]'):
    print(i.get('title'))

Output

PHILIPS HP8100/46 Hair Dryer
NOVA NHP 8100/05 Hair Dryer
HAVELLS HD2222 Hair Dryer
HAVELLS HD1901 Hair Dryer
HAVELLS HD3181 1600 W Cool Shot Hair Dryer Hair Dryer
PHILIPS HP8100/60 Hair Dryer
Syska HD1600 Hair Dryer
Syska HD1010 Hair Dryer
...
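
For comparison, the two approaches can be tried side by side on a small inline document. This is only a sketch: the markup is hypothetical and simply assumes that each product is rendered as an image link followed by a titled text link, while category links stand alone.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: category link alone, product links in pairs.
html = """
<div>
  <a href="/cat" title="Personal Care Appliances">Category</a>
</div>
<div>
  <a href="/p1"><img alt=""></a>
  <a href="/p1" title="PHILIPS HP8100/46 Hair Dryer">PHILIPS HP8100/46</a>
</div>
<div>
  <a href="/p2"><img alt=""></a>
  <a href="/p2" title="NOVA NHP 8100/05 Hair Dryer">NOVA NHP 8100/05</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Every <a> that has a title attribute, category link included.
all_titled = [a["title"] for a in soup.find_all("a", title=True)]

# Only titled <a> tags that directly follow another <a> with the same parent.
product_titles = [a["title"] for a in soup.select("a + a[title]")]

print(all_titled)
print(product_titles)
```

The adjacent sibling combinator (+) only works here because of the assumed pairing of links; if the real page structure differs or changes, the selector would need to be adjusted.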