Selenium only finds a few elements

Time:04-11

I want to build a recommendation system for webtoons, so I am collecting webtoon data. Currently, I have written code to scrape the URLs of the toons on the Kakao Webtoon page.

from selenium import webdriver

def extract_from_page(page_link):
    links = []
    driver = webdriver.Chrome()
    driver.get(page_link)
    
    elems = driver.find_elements_by_css_selector(".h-full.relative")
    for elem in elems:
        link = elem.get_attribute('href')
        if link:
            links.append({'id': int(link.split('/')[-1]), 'link': link})
    
    print(len(links))
    return links

This code works on the weekly pages (https://webtoon.kakao.com/original-webtoon, https://webtoon.kakao.com/original-novel).

However, on the page that shows finished toons (https://webtoon.kakao.com/original-webtoon?tab=complete), it only receives 13 URLs, for the 13 webtoons at the top of the page.

I found a similar post ("web scraping gives only first 4 elements on a page") and added scrolling, but nothing changed.

I would appreciate it if you could tell me the cause and solution.
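As an aside, the id extraction in the code above (`int(link.split('/')[-1])`) relies on the numeric id being the last path segment of the URL. A small helper (hypothetical, not part of the original code) makes that assumption explicit and fails loudly when a link does not match:

```python
def parse_webtoon_link(link):
    """Parse a Kakao Webtoon content URL into an {'id', 'link'} record.

    Assumes the numeric id is the last path segment, as in the
    question's own int(link.split('/')[-1]).
    """
    last = link.rstrip('/').split('/')[-1]
    if not last.isdigit():
        # e.g. a navigation link such as /original-webtoon slipped through
        raise ValueError(f"no numeric id in {link!r}")
    return {'id': int(last), 'link': link}
```

Filtering out non-matching hrefs this way avoids a bare `ValueError` from `int()` when the selector accidentally picks up a navigation link.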

CodePudding user response:

Try like below. The completed tab lazy-loads more items as you scroll, so you have to scroll and re-collect the elements in a loop.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://webtoon.kakao.com/original-webtoon?tab=complete")

wait = WebDriverWait(driver, 30)

j = 1
for i in range(5):
    # Wait for the elements to load/appear
    wait.until(EC.presence_of_all_elements_located((By.XPATH, "//a[contains(@href,'content')]")))
    # Get all the elements which contain an href value
    links = driver.find_elements(By.XPATH, "//a[contains(@href,'content')]")
    # Iterate to print the links
    for link in links:
        print(f"{j} : {link.get_attribute('href')}")
        j += 1
    # Scroll to the last element of the list, triggering the next lazy load
    driver.execute_script("arguments[0].scrollIntoView(true);", links[-1])

Output:

1 : https://webtoon.kakao.com/content/밤의-향/1532
2 : https://webtoon.kakao.com/content/브레이커2/596
3 : https://webtoon.kakao.com/content/토이-콤플렉스/1683
...
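A fixed `range(5)` may under- or over-shoot the number of scrolls a page actually needs. The scroll-and-recollect idea can be factored into a loop that stops once a round adds no new links (a sketch; `fetch_links` and `scroll_to_last` are hypothetical callables standing in for the Selenium calls above):

```python
def collect_until_stable(fetch_links, scroll_to_last, max_rounds=50):
    """Repeatedly collect links, scrolling to the last one each round,
    until a round adds nothing new (or max_rounds is reached)."""
    seen = set()
    for _ in range(max_rounds):
        links = fetch_links()          # e.g. hrefs of the currently loaded anchors
        before = len(seen)
        seen.update(links)
        if len(seen) == before:        # nothing new loaded -> done
            break
        if links:
            scroll_to_last(links[-1])  # trigger lazy loading of the next batch
    return sorted(seen)
```

With Selenium, `fetch_links` would return the hrefs of the current `//a[contains(@href,'content')]` elements and `scroll_to_last` would call `scrollIntoView` on the last one, as in the answer's code.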