How do I find and store dynamically loaded elements with python selenium?

Time:10-30

I am trying to scrape the usernames from the "Followers" button list on this profile, using Python Selenium. I am not able to do this for 2 reasons:

  1. I can't scroll the list using driver.execute_script("window.scrollTo(0, document.body.scrollHeight);") because the list has 2 scrollbars (I don't know why it has 2). If I try to scroll, it scrolls the profile page and not the actual list.
  2. Even if I manage to scroll the list, how am I supposed to store the usernames? The users are dynamically loaded, and for some reason the class attribute looks like this: class='st--c-PJLV st--c-dhzjXW st--c-edagZx'

I've tried several ways of solving this but haven't been able to get the result I want, so any help is appreciated. Here is one snippet I tried, which raised an error:

scrollElem = driver.find_elements(By.XPATH, "//div[@class='st--c-PJLV st--c-dhzjXW st--c-edagZx']/a")
followernumber = 2000
scrollElem[len(scrollElem)-1].location_once_scrolled_into_view
for i in range(0,followernumber):
    new = len(scrollElem) + i
    newname = driver.find_element(By.XPATH, "(//div[@class='st--c-PJLV st--c-dhzjXW st--c-edagZx']/a)[%i]"%new)
    print(newname.text, i)
    newname.location_once_scrolled_into_view
    time.sleep(1)

Got the error: selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"(//div[@class='st--c-PJLV st--c-dhzjXW st--c-edagZx']/a)[47]"}

I also tried scrolling to the bottom of the list with the function below, intending to store the elements as they load, but that didn't work either:

def scrollDown():
    last_height = driver.execute_script("return document.body.scrollHeight")
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(SCROLL_PAUSE_TIME)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break
        last_height = new_height

This scrolled the profile page, not the list of followers.

I would appreciate any help as I'm new to web-scraping!
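
On the scrolling problem itself: window.scrollTo moves the page, not a nested scrollable element, which would explain why the profile page scrolls instead of the list. Below is a minimal sketch of scrolling an inner container instead. It assumes the followers list lives in its own scrollable element that you have already located. The helper name scroll_container_to_end and its parameters are illustrative, not part of Selenium; only driver.execute_script with arguments[0] is a real WebDriver call.

```python
import time


def scroll_container_to_end(driver, container, pause=1.0, max_rounds=200):
    """Scroll `container` until its scrollHeight stops growing.

    `driver` is a Selenium WebDriver and `container` a WebElement for the
    scrollable list (how to locate it depends on the page; not shown here).
    Returns the number of scroll steps performed.
    """
    last_height = driver.execute_script(
        "return arguments[0].scrollHeight", container
    )
    rounds = 0
    while rounds < max_rounds:
        # Scroll the element itself rather than the window.
        driver.execute_script(
            "arguments[0].scrollTop = arguments[0].scrollHeight", container
        )
        rounds += 1
        time.sleep(pause)  # give lazily loaded rows time to arrive
        new_height = driver.execute_script(
            "return arguments[0].scrollHeight", container
        )
        if new_height == last_height:
            break  # nothing new was loaded, assume the end of the list
        last_height = new_height
    return rounds
```

Once this returns, a single driver.find_elements call can collect every loaded row, and the names can be stored with a list comprehension over element.text. The max_rounds cap is only a guard against lists that never stop growing.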

CodePudding user response:

Try using the requests module to fetch all the follower names of that profile directly from the GraphQL endpoint:

import requests

link = 'https://hasura2.foundation.app/v1/graphql'
payload = {"query":"query userFollowersQuery($publicKey: String!, $currentUserPublicKey: String!, $offset: Int!, $limit: Int!) {\n  follows: follow(\n    where: {followedUser: {_eq: $publicKey}, isFollowing: {_eq: true}}\n    offset: $offset\n    limit: $limit\n  ) {\n    id\n    user: userByFollowingUser {\n      name\n      username\n      profileImageUrl\n      userIndex\n      publicKey\n      follows(where: {user: {_eq: $currentUserPublicKey}, isFollowing: {_eq: true}}) {\n        createdAt\n        isFollowing\n      }\n    }\n  }\n}\n","variables":{"currentUserPublicKey":"","publicKey":"0xF74d1224931AFa9cf12D06092c1eb1818D1E255C","offset":0,"limit":48},"operationName":"userFollowersQuery"}

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'

    while True:
        resp = s.post(link, json=payload)
        follows = resp.json()['data']['follows']
        # An empty page means every follower has been fetched.
        if not follows:
            break
        for item in follows:
            print(item['user']['username'])

        # Advance to the next page of 48 results.
        payload['variables']['offset'] += 48
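
The paging loop above generalizes to any offset/limit endpoint. Here is a minimal sketch, assuming a hypothetical fetch_page(offset, limit) callable that returns one page of results as a list. Neither iter_pages nor fetch_page is part of requests or of the code above. Collecting the results into a list also covers the question of how to store the usernames.

```python
def iter_pages(fetch_page, limit=48, start=0):
    """Yield items from an offset-paginated source until a page comes back empty.

    `fetch_page(offset, limit)` is a caller-supplied (hypothetical) function
    that returns a list of items for the requested window.
    """
    offset = start
    while True:
        page = fetch_page(offset, limit)
        if not page:
            break  # an empty page marks the end of the data
        yield from page
        offset += limit  # move the window forward by one page


# Possible wiring to the request shown earlier (assumes `s`, `link` and
# `payload` exist as defined there):
#
# def fetch_page(offset, limit):
#     payload['variables'].update(offset=offset, limit=limit)
#     return s.post(link, json=payload).json()['data']['follows']
#
# usernames = [item['user']['username'] for item in iter_pages(fetch_page)]
```

Keeping the HTTP call behind fetch_page makes the loop easy to test with fake data and to reuse for other paginated queries.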