Why does BS4 not find an element by its class even though it is present in the HTML?

Time:09-03

I want to scrape the information from all the cards on this page.

My approach :

import pandas as pd 
import numpy as np
import requests
import json
from bs4 import BeautifulSoup
url1 = "https://zerotomastery.io/testimonials/"
res = requests.get(url1)
blog_data = []
if (res.status_code == 200):
    page = BeautifulSoup(res.content , "html.parser")
    print(page.find("div" , {"class" : "divcomponent__Div-sc-hnfdyq-0 base-cardstyles__BaseCard-sc-1eokxla-0 testimonial-cardstyles__TestimonialCard-sc-137v3r9-0  dRXcRh ipQTEw"}))


As you can clearly see, the class is present.

CodePudding user response:

You are searching for a very specific class string which appears to be dynamically generated. A better option is to match on a more general substring found in those classes, such as "TestimonialCard".

import pandas as pd 
import numpy as np
import requests
import json
from bs4 import BeautifulSoup
import re

url1 = "https://zerotomastery.io/testimonials/"
res = requests.get(url1)
rows = []
if (res.status_code == 200):
    page = BeautifulSoup(res.content , "html.parser")
    testCards = page.find_all("div" , {"class" : re.compile('.*TestimonialCard.*')})

    for card in testCards:
        name = card.find('h2').text
        job = card.find('span').text
        company = card.find('img', {'class':re.compile('.*CompanyImage.*')})['alt']
        test = card.find('p').text

        row = {
            'name':name,
            'job':job,
            'company':company,
            'testimonial':test}
        
        rows.append(row)

I simply didn't have time to search through the nested json to pull out the part you were asking for, but it's somewhere in there.

Output:

print(df)
                          name  ...                                        testimonial
0                   Olga Fomin  ...  I was asked a lot of security questions at my ...
1                   Justin Lin  ...  Andrei is one of the best teachers & his cours...
2                  Karan Verma  ...  Andrei’s course helped me to ace my Frontend E...
3                Damon Clemons  ...  I want to thank Andrei, Yihua and the entire Z...
4                    Adil Asif  ...  A year ago I couldn't write an app or put up a...
5               Haidarali Shah  ...  I have landed myself a job at UNIQLO thanks to...
6                  Adam Szwaba  ...  I GOT HIRED! Thanks to Andrei Neagoie and ever...
7              Methkal Khalawi  ...  Glad to tell you I got a job at Google Cloud t...
8                Caroline Chan  ...  I can't recommend Andrei's courses and the ZTM...
9               Aradhya Bansal  ...  ZTM was the key stepping stone to building my ...
10                 Joy Goh-Mah  ...  I’ve been offered my first Web Developer job w...
11              Faiz-ur Rahman  ...  I just started at Blizzard as an Associate Sof...
12                Jonathan Sou  ...  Without a doubt in my mind, taking Andrei’s co...
13                   Anca Toma  ...  It’s only my 3rd day as a Software Developer b...
14             Răzvan Cîrlugea  ...  All because of Andrei and his courses, I’ve be...
15                 Ruben Marin  ...  Thanks to @AndreiNeagoie's course, I was able ...
16                Sheel Parekh  ...  I am largely grateful to the ZTM community and...
17          Dajana Stojchevska  ...  Andrei, I want to thank you from the bottom of...
18                  Rafay Syed  ...  After going through Andrei’s course on studyin...
19              Zans Litinskis  ...  In 2018, I landed my first dev job after takin...
20            Chandler Baskins  ...  After 7 months & 27 days of long nights, sacri...
21           Leonardo Escudero  ...  Thank you for creating this amazing platform! ...
22              Swagath Shetty  ...  Just got my first job as a junior software dev...
23             Kristian Rykkje  ...  After just 4 months non-stop working with your...
24               Tyler Sustare  ...  Thank you! I wanted to tell you that within we...
25          Gazi Md. Shahnewaz  ...  Andrei’s courses helped me to not only land my...
26                Andrew Price  ...  I'm a Software Engineer! Unemployed just after...
27                Ankit Salian  ...  I recently switched jobs due to the Covid-19 p...
28               Gurprit Singh  ...  Can't thank Andrei & Yihua enough. Their React...
29             Mauro Rodrigues  ...  No college degree. No programming experience. ...
30              Gaëtan Herfray  ...  Are the ZTM coding interview prep courses wort...
31            Riccardo Colombo  ...  I actually got a job! Thanks Zero To Mastery! ...
32                David Nanyan  ...  I started your course in 2018. Thanks to you, ...
33            Nicolas Giaconia  ...  I used to read people’s success stories thinki...
34                  Umer Azhar  ...  Andrei - I got hired as a Software Developer! ...
35               Gabriel Petre  ...  Only 4 months after I finished Andrei's course...
36                 David Nowak  ...  Without a doubt, several ZTM courses played a ...
37         Alessandro Lamberti  ...  Andrei and Daniel gave me everything I needed ...
38               Catalin Tugui  ...  1 year into programming after starting from ze...
39            Jasim Zainudheen  ...  I landed in Munich, Germany with a job offer a...
40               Ferenc Gulyás  ...  I wanted to advance my career but was missing ...
41               Igor N Houwat  ...  Andrei's web dev courses were a major reason I...
42               Carlos Guinto  ...  ZTM courses helped me land my 1st developer jo...
43           Ben Smitthimedhin  ...  I can honestly say that ZTM courses built the ...
44                 Shaine Tsou  ...  I got an offer as a software engineer in just ...
45                 Eshan Raina  ...  Can’t stress enough how helpful ZTM’s Coding I...
46              Riel St. Amand  ...  I've been a dev for 2 years. I was just promot...
47                  Chris Sean  ...  I believe nothing beats @zerotomasteryio when ...
48  Chakradhar Reddy Yerragudi  ...  I would like to thank Andrei for his awesome c...
49            Juan Pablo Rubio  ...  2 months of ZTM   2 months working on a person...
50                    Theo Tam  ...  Two years ago, my family had financial difficu...
51            Ashish Agnihotri  ...  I am so glad that I got my first job by studyi...
52               Payton Pierce  ...  Andrei and ZTM are 100% responsible for where ...
53                Jiel Selmani  ...  I'm a proud lifetime member of ZTM. It has lit...

[54 rows x 4 columns]
            
The DataFrame printed above is created from the collected rows after the loop:

df = pd.DataFrame(rows)
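
One caveat about the loop above: card.find(...) returns None when a card lacks the element, so .text or ['alt'] would raise an error on such a card. Below is a minimal offline sketch of a more defensive version. The HTML snippet, its class names, and the helpers text_or_none and parse_cards are made up for illustration and are not taken from the real page.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for the real page.
HTML = """
<div class="base__BaseCard-sc-1 cards__TestimonialCard-sc-2 aaa bbb">
  <h2>Ann Example</h2><span>Engineer</span>
  <img class="img__CompanyImage-sc-3" alt="Acme">
  <p>Great course.</p>
</div>
<div class="cards__TestimonialCard-sc-2 ccc">
  <h2>Bob Example</h2>
  <p>No job title or company logo here.</p>
</div>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None if the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


def parse_cards(html):
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # The regex is searched within the class values, so a substring is enough.
    for card in soup.find_all("div", class_=re.compile("TestimonialCard")):
        logo = card.find("img", class_=re.compile("CompanyImage"))
        rows.append({
            "name": text_or_none(card.find("h2")),
            "job": text_or_none(card.find("span")),
            "company": logo.get("alt") if logo is not None else None,
            "testimonial": text_or_none(card.find("p")),
        })
    return rows


rows = parse_cards(HTML)
```

Cards with missing pieces then produce None values instead of stopping the scrape, and pd.DataFrame(rows) shows them as missing entries.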

CodePudding user response:

To get the name, title and text from the cards, you can use the following example:

import requests
from bs4 import BeautifulSoup


url = "https://zerotomastery.io/testimonials/"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

for h2 in soup.select("h2")[1:]:
    name = h2.text
    title = h2.find_next("span").text
    text = h2.find_next("p").text
    print(name)
    print(title)
    print(text)
    print("-" * 80)

Prints:


...

Payton Pierce
React Developer
Andrei and ZTM are 100% responsible for where I am today - 2 years into working professionally as a developer with zero prior experience. I recommend ZTM to everyone I meet who shows an interest in getting into tech. It's the best resource out there!
--------------------------------------------------------------------------------
Jiel Selmani
Software Engineer
I'm a proud lifetime member of ZTM. It has literally changed the trajectory of my life. The projects I was able to build from what I learned in ZTM courses is what got me my job. Thank you Andrei, Adam, & Yihua for helping me standout as the #1 candidate.
--------------------------------------------------------------------------------

CodePudding user response:

"As you can clearly see that the class is present, can you please tell me why my code is not working?"

The class is present for sure, but your code is not working because there is a typo in your class string: an additional whitespace between 137v3r9-0 and dRXcRh.
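
How Beautiful Soup treats a search string that contains several classes can be checked offline. The sketch below uses made-up markup and class names (assumptions, not the real page). It illustrates that such a string is compared with the class attribute as a whole, so the order of the classes matters, while a single class name or a CSS selector matches regardless of the other classes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the shape of the class attribute matters here.
html = '<div class="card-0 dRXcRh ipQTEw">hello</div>'
soup = BeautifulSoup(html, "html.parser")

# The full class string, written exactly as in the markup.
exact = soup.find_all("div", {"class": "card-0 dRXcRh ipQTEw"})

# The same classes in a different order: treated as a different string.
reordered = soup.find_all("div", {"class": "dRXcRh card-0 ipQTEw"})

# A single class name matches no matter which other classes are present.
single = soup.find_all("div", class_="dRXcRh")

# A CSS selector requires both classes but ignores their order.
by_css = soup.select("div.ipQTEw.card-0")

print(len(exact), len(reordered), len(single), len(by_css))
```

This is why a long, generated class string copied from the browser is fragile as a search key.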


It is a good strategy to avoid dynamic classes for element selection and use more static things like id or HTML structure.

Select your cards based on their HTML structure with a CSS selector:

soup.select('div:has(>h2 span)')

Iterate over your ResultSet and pick out the information, again just by structure, appending it as a dict to your list:

for card in soup.select('div:has(>h2 span)'):
    data.append({
        'name':card.h2.text,
        'title': card.span.text,
        'text': card.p.text,
        'url': card.a.get('href')
    })

Finally, create a DataFrame from your list:

pd.DataFrame(data)

Example

import requests
import pandas as pd
from bs4 import BeautifulSoup

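# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(requests.get('https://zerotomastery.io/testimonials/').content, 'html.parser')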
data = []

for card in soup.select('div:has(>h2 span)'):
    data.append({
        'name':card.h2.text,
        'title': card.span.text,
        'text': card.p.text,
        'url': card.a.get('href')
    })

pd.DataFrame(data)

Output

name title text url
0 Olga Fomin Software Engineer I was asked a lot of security questions at my interview with Tesla and I was able to answer them because of Andrei’s course: Complete Junior to Senior Web Developer Roadmap. I would recommend his courses to anyone who wants to learn web dev inside and out. https://www.linkedin.com/in/olgafomin
1 Justin Lin Software Engineer Andrei is one of the best teachers & his courses were a big reason I was able to get internships at both JP Morgan and Amazon. 2020 UPDATE: I just got an offer from Amazon as a full-time engineer. I couldn't have done it without the foundations from ZTM. https://www.linkedin.com/in/justinlinw/
2 Karan Verma Software Engineer Andrei’s course helped me to ace my Frontend Engineer interviews at places like Uber, Amazon India and Gojek. Can't thank you and the ZTM community enough! https://www.linkedin.com/in/karanisverma
3 Damon Clemons Software Engineer I want to thank Andrei, Yihua and the entire ZTM community for building such an amazing platform for people like me. I went from not believing much in myself to having the foundation to be a great engineer and JUST got an offer from Google! Thank you! https://www.linkedin.com/in/damon-clemons-1b63426a/
4 Adil Asif Senior Software Engineer A year ago I couldn't write an app or put up a website. Now, I've started a new career as a Web Developer thanks to you, your courses, your advice, and your posts. Thank you! https://www.linkedin.com/in/adilaasif/

...
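
The structural approach can also be tried without a network request. This is a minimal sketch on invented markup in which each card is a div with an h2 as a direct child. The real page's structure may differ, and the selector here is the simpler div:has(> h2) rather than the one used above. The link lookup is guarded because card.a is None when a card has no link.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two cards inside a wrapper, the second without a link.
html = """
<section>
  <div id="wrapper">
    <div class="x1"><h2>Ann Example</h2><span>Engineer</span>
      <p>Great course.</p><a href="https://example.com/ann">profile</a></div>
    <div class="y2"><h2>Bob Example</h2><span>Developer</span>
      <p>Got hired.</p></div>
  </div>
</section>
"""
soup = BeautifulSoup(html, "html.parser")

data = []
# Only divs with an h2 as a direct child match, so the wrapper is skipped.
for card in soup.select("div:has(> h2)"):
    link = card.find("a")  # None if the card has no link
    data.append({
        "name": card.h2.get_text(strip=True),
        "title": card.span.get_text(strip=True),
        "text": card.p.get_text(strip=True),
        "url": link.get("href") if link is not None else None,
    })

print(data)
```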
