python beautifulsoup4 selenium ChromeDriverManager [Webpage crawling is not working until the end]

Time:10-15

I'm trying to get the list of coin names from the webpage. I tried BeautifulSoup, but it didn't work for some reason. I also tried Selenium, but that isn't working either.

What is the problem with this website? (I've read that it is a JavaScript and DOM issue, but I don't clearly understand it.) Could I get some help retrieving the full list from the page? (I'm using ChromeDriverManager to avoid some driver errors.)

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup
from selenium.webdriver.common.keys import Keys
import time

options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-logging"])
driver = webdriver.Chrome(ChromeDriverManager().install(),options=options)

html = driver.get('https://coinmarketcap.com/')
html = driver.page_source

driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
driver.maximize_window()
driver.implicitly_wait(10)

soup = BeautifulSoup(html, 'html.parser')

status_today = soup.find_all('div',{'class':'sc-16r8icm-0 escjiH'},'href')

for x in status_today:
    print('x.a[href]=',x.a['href'])

The result contains only 10 lines, but the page lists 100 coins:

x.a[href]= /currencies/bitcoin/
x.a[href]= /currencies/ethereum/
x.a[href]= /currencies/binance-coin/
x.a[href]= /currencies/cardano/
x.a[href]= /currencies/tether/
x.a[href]= /currencies/xrp/
x.a[href]= /currencies/solana/
x.a[href]= /currencies/polkadot-new/
x.a[href]= /currencies/usd-coin/
x.a[href]= /currencies/dogecoin/

CodePudding user response:

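You need to scroll to each row before you can extract the href from its anchor tag. The page appears to fill in the remaining rows only as they scroll into view. In your code, driver.page_source is also read before the scroll runs, so the scroll has no effect on the HTML that BeautifulSoup parses.

Also use explicit waits (WebDriverWait), so that each row is located only once it is actually visible.

The XPath used is //tbody//tr, indexed by row number: (//tbody//tr)[j].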

Code :

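# driver_path: path to your chromedriver executable
driver = webdriver.Chrome(driver_path)
driver.maximize_window()
driver.implicitly_wait(30)
wait = WebDriverWait(driver, 30)

driver.get("https://coinmarketcap.com/")

j = 1
while True:
    try:
        # Wait for the j-th table row (XPath indexes start at 1)
        row = wait.until(EC.visibility_of_element_located((By.XPATH, f"(//tbody//tr)[{j}]")))
        # Scroll the row into view so the page renders its content
        driver.execute_script("arguments[0].scrollIntoView(true);", row)
        # find_element(By.XPATH, ...) replaces the deprecated find_element_by_xpath
        href = row.find_element(By.XPATH, ".//descendant::div[@class='sc-16r8icm-0 escjiH']//a").get_attribute('href')
        print(href)
        j = j + 1
    except Exception:
        # Stop when no further row (or link) is found within the timeout
        break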

Imports :

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Output :

https://coinmarketcap.com/currencies/bitcoin/
https://coinmarketcap.com/currencies/ethereum/
https://coinmarketcap.com/currencies/binance-coin/
https://coinmarketcap.com/currencies/cardano/
https://coinmarketcap.com/currencies/tether/
https://coinmarketcap.com/currencies/xrp/
https://coinmarketcap.com/currencies/solana/
https://coinmarketcap.com/currencies/polkadot-new/
https://coinmarketcap.com/currencies/usd-coin/
https://coinmarketcap.com/currencies/dogecoin/
https://coinmarketcap.com/currencies/terra-luna/
https://coinmarketcap.com/currencies/uniswap/
https://coinmarketcap.com/currencies/binance-usd/
https://coinmarketcap.com/currencies/avalanche/
https://coinmarketcap.com/currencies/litecoin/
https://coinmarketcap.com/currencies/wrapped-bitcoin/
https://coinmarketcap.com/currencies/shiba-inu/
https://coinmarketcap.com/currencies/chainlink/
https://coinmarketcap.com/currencies/bitcoin-cash/
https://coinmarketcap.com/currencies/algorand/
https://coinmarketcap.com/currencies/polygon/
https://coinmarketcap.com/currencies/stellar/
https://coinmarketcap.com/currencies/filecoin/
https://coinmarketcap.com/currencies/cosmos/
https://coinmarketcap.com/currencies/internet-computer/
https://coinmarketcap.com/currencies/axie-infinity/
https://coinmarketcap.com/currencies/vechain/
https://coinmarketcap.com/currencies/ethereum-classic/
https://coinmarketcap.com/currencies/tron/
https://coinmarketcap.com/currencies/multi-collateral-dai/
https://coinmarketcap.com/currencies/ftx-token/
https://coinmarketcap.com/currencies/tezos/
https://coinmarketcap.com/currencies/theta/
https://coinmarketcap.com/currencies/bitcoin-bep2/
https://coinmarketcap.com/currencies/fantom/
https://coinmarketcap.com/currencies/hedera/
https://coinmarketcap.com/currencies/monero/
https://coinmarketcap.com/currencies/pancakeswap/
https://coinmarketcap.com/currencies/crypto-com-coin/
https://coinmarketcap.com/currencies/elrond-egld/
https://coinmarketcap.com/currencies/eos/
https://coinmarketcap.com/currencies/ecash/
https://coinmarketcap.com/currencies/klaytn/
https://coinmarketcap.com/currencies/aave/
https://coinmarketcap.com/currencies/iota/
https://coinmarketcap.com/currencies/near-protocol/
https://coinmarketcap.com/currencies/quant/
https://coinmarketcap.com/currencies/bitcoin-sv/
https://coinmarketcap.com/currencies/the-graph/
https://coinmarketcap.com/currencies/neo/
https://coinmarketcap.com/currencies/waves/
https://coinmarketcap.com/currencies/stacks/
https://coinmarketcap.com/currencies/kusama/
https://coinmarketcap.com/currencies/terrausd/
https://coinmarketcap.com/currencies/harmony/
https://coinmarketcap.com/currencies/unus-sed-leo/
https://coinmarketcap.com/currencies/bittorrent/
https://coinmarketcap.com/currencies/maker/
https://coinmarketcap.com/currencies/omg/
https://coinmarketcap.com/currencies/amp/
https://coinmarketcap.com/currencies/helium/
https://coinmarketcap.com/currencies/celo/
https://coinmarketcap.com/currencies/dash/
https://coinmarketcap.com/currencies/chiliz/
https://coinmarketcap.com/currencies/arweave/
https://coinmarketcap.com/currencies/compound/
https://coinmarketcap.com/currencies/decred/
https://coinmarketcap.com/currencies/thorchain/
https://coinmarketcap.com/currencies/revain/
https://coinmarketcap.com/currencies/holo/
https://coinmarketcap.com/currencies/nem/
https://coinmarketcap.com/currencies/theta-fuel/
https://coinmarketcap.com/currencies/zcash/
https://coinmarketcap.com/currencies/xinfin/
https://coinmarketcap.com/currencies/icon/
https://coinmarketcap.com/currencies/decentraland/
https://coinmarketcap.com/currencies/celsius/
https://coinmarketcap.com/currencies/qtum/
https://coinmarketcap.com/currencies/trueusd/
https://coinmarketcap.com/currencies/enjin-coin/
https://coinmarketcap.com/currencies/sushiswap/
https://coinmarketcap.com/currencies/yearn-finance/
https://coinmarketcap.com/currencies/dydx/
https://coinmarketcap.com/currencies/bitcoin-gold/
https://coinmarketcap.com/currencies/huobi-token/
https://coinmarketcap.com/currencies/curve-dao-token/
https://coinmarketcap.com/currencies/flow/
https://coinmarketcap.com/currencies/mina/
https://coinmarketcap.com/currencies/mdex/
https://coinmarketcap.com/currencies/zilliqa/
https://coinmarketcap.com/currencies/synthetix-network-token/
https://coinmarketcap.com/currencies/ravencoin/
https://coinmarketcap.com/currencies/perpetual-protocol/
https://coinmarketcap.com/currencies/basic-attention-token/
https://coinmarketcap.com/currencies/ren/
https://coinmarketcap.com/currencies/serum/
https://coinmarketcap.com/currencies/renbtc/
https://coinmarketcap.com/currencies/okb/
https://coinmarketcap.com/currencies/iostoken/
https://coinmarketcap.com/currencies/telcoin/
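
The class name sc-16r8icm-0 escjiH used above appears to be auto-generated, so it may change whenever the site is rebuilt. As a rough sketch of a less brittle approach, the snippet below pulls the coin slugs out of HTML by matching on the /currencies/ href prefix instead, using only the standard library. The names CoinLinkParser and coin_slugs and the sample markup are hypothetical, made up for illustration; the real page structure may differ.

```python
from html.parser import HTMLParser


class CoinLinkParser(HTMLParser):
    """Collect coin slugs from <a href="/currencies/<slug>/..."> links."""

    PREFIX = "/currencies/"

    def __init__(self):
        super().__init__()
        self.slugs = []  # first-seen order, duplicates skipped

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; value may be None
        href = dict(attrs).get("href") or ""
        if not href.startswith(self.PREFIX):
            return
        # "/currencies/bitcoin/markets/" -> "bitcoin"
        slug = href[len(self.PREFIX):].split("/", 1)[0]
        if slug and slug not in self.slugs:
            self.slugs.append(slug)


def coin_slugs(html):
    """Return the unique coin slugs found in an HTML string."""
    parser = CoinLinkParser()
    parser.feed(html)
    return parser.slugs


# Hypothetical sample markup, not copied from the real site
sample = """
<table><tbody>
<tr><td><div class="x"><a href="/currencies/bitcoin/">Bitcoin</a></div></td>
<td><a href="/currencies/bitcoin/markets/">Buy</a></td></tr>
<tr><td><a href="/currencies/ethereum/">Ethereum</a></td></tr>
<tr><td><a href="/rankings/exchanges/">Exchanges</a></td></tr>
</tbody></table>
"""
print(coin_slugs(sample))  # ['bitcoin', 'ethereum']
```

In a Selenium session this could be called as coin_slugs(driver.page_source), but only after the scrolling loop has finished. It does not replace the scrolling, since rows that have not been rendered yet are not in the page source.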