Unable to pull text from an element using Beautiful Soup in Python

Time:11-08

I am trying to pull the text of an element on the Trustpilot website using the following script, but all it prints is a series of 'None' values.

import requests
from bs4 import BeautifulSoup

url = "https://uk.trustpilot.com/review/rockar.com"
try_url = requests.get(url)

soup = BeautifulSoup(try_url.content, 'html.parser')
print(try_url.content)

# Look for an <h2> inside each review container; this prints None for every match
for h in soup.find_all('div', {'class': 'styles_reviewContent__3TSDf'}):
    hdln = h.find("h2")
    print(hdln)

What is the way around this? Am I looking at the wrong selector?


CodePudding user response:

As @diggusbickus pointed out, the review data is embedded in the page as JSON inside a script tag, so you can get the reviews this way:

import json

# soup is the BeautifulSoup object built in the question
data = json.loads(soup.find('script', type='application/json').string)
reviews = data["props"]["pageProps"]["reviews"]

sample_reply = reviews[0]["reply"]

The value of sample_reply is:

{'message': "Thank you so much for your kind words, Fi! It's great to hear Shah was fantastic and offer a personal service to your car buying journey. Thank you for taking the time to leave us a great review! We hope you love your new vehicle! Thanks again for choosing Rockar :-)",
'publishedDate': '2021-11-04T12:35:25.401Z',
'updatedDate': '2021-11-04T12:35:34.948Z'}
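
To go from one sample to every reply, something along these lines might work. It is only a sketch: the key names ("props", "pageProps", "reviews", "reply", "message") are the ones shown above, while the helper name extract_replies, the sample payload, and the assumption that a review without a reply has a missing or None "reply" entry are all made up for illustration and not confirmed against the live page.

```python
import json


def extract_replies(json_text):
    """Return the reply message for each review, or None where there is no reply."""
    data = json.loads(json_text)
    # Use .get() so a page without the expected keys yields an empty list
    reviews = data.get("props", {}).get("pageProps", {}).get("reviews", [])
    messages = []
    for review in reviews:
        reply = review.get("reply")  # assumed to be missing or None if unanswered
        messages.append(reply["message"] if reply else None)
    return messages


# Hypothetical payload shaped like the structure in the answer (not real site data)
sample = json.dumps({
    "props": {
        "pageProps": {
            "reviews": [
                {"reply": {"message": "Thank you for the review!"}},
                {"reply": None},
            ]
        }
    }
})

print(extract_replies(sample))
```

With the real page you would pass in the string from soup.find('script', type='application/json').string instead of the sample payload.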