After researching how to select an HTML element of a given class, I am still unable to select a table
element with a class named "treasuries-table"
from https://www.buybitcoinworldwide.com/treasuries/. I've noticed that there is
<table >
on the page but it is not my target, only
<table >
is. I've tried
//table[contains(concat(" ", normalize-space(@class), " "), " treasuries-table ")]
as well as the less verbose, but not necessarily correct,
//table[@]
, all to no avail.
Where am I going wrong? I usually first try to find elements with this online tester; might it be the tester's fault?
P.S. Apologies if this seems like a duplicate, but the solutions mentioned in similar questions do not seem to work for me.
CodePudding user response:
Actually, '//table[contains(@class,"treasuries-table")]'
selects 5 tables, but '//table[@)]' selects only the first table, which is equivalent to (//table[contains(@class,"treasuries-table")])[1]
Try:
'//table[contains(@class,"treasuries-table")]'
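To make the difference between the two XPath predicates discussed above concrete, here is a minimal pure-Python sketch of their string logic. It does not run XPath; the helper names and the sample class values are hypothetical and only mimic what contains(@class, ...) and the concat/normalize-space form do to the attribute string.

```python
def xpath_contains(class_attr: str, name: str) -> bool:
    """Mimics contains(@class, name): a plain substring test."""
    return name in class_attr


def xpath_token_match(class_attr: str, name: str) -> bool:
    """Mimics contains(concat(" ", normalize-space(@class), " "), " name ")."""
    # normalize-space trims the ends and collapses whitespace runs to one space.
    normalized = " ".join(class_attr.split())
    return f" {name} " in f" {normalized} "


# Hypothetical class attribute values, not taken from the real page.
samples = [
    "treasuries-table",
    "  table   treasuries-table sortable ",
    "treasuries-table-wrapper",
]

for value in samples:
    print(
        repr(value),
        xpath_contains(value, "treasuries-table"),
        xpath_token_match(value, "treasuries-table"),
    )
```

The substring form also accepts a class such as "treasuries-table-wrapper", while the token form only accepts the exact class name, so the shorter expression can match more elements than intended.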
CodePudding user response:
Hover over the element in the inspect window, then right-click > Copy > Copy XPath.
I am not sure which scraper you are using, but it returns data for me using Beautiful Soup:
sp = soup.find_all("table", attrs={"class":"treasuries-table"})[1]
this returns
Entity Country Symbol:Exchange Filings & Sources # of BTC Value Today % of 21m
MicroStrategy MSTR:NADQ Filing | News 129,218 129218 0.615%
Tesla, Inc TSLA:NADQ Filing | News 42,902 42902 0.204%
Galaxy Digital Holdings BRPHF:OTCMKTS Filing | News 16,400 16400 0.078%
Voyager Digital LTD VOYG:TSX Filing | News 12,260 12260 0.058%
Marathon Digital Holdings Inc MARA:NADQ Filing | News 9,373 9373 0.045%
Square Inc. SQ:NYSE Filing | News 8,027 8027 0.038%
Hut 8 Mining Corp HUT:NASDAQ Filing | News 6,460 6460 0.031%
Riot Blockchain, Inc. RIOT:NADQ Filing | News 6,320 6320 0.03%
Bitfarms Limited BITF:NASDAQ Filing | News 5,646 5646 0.027%
Core Scientific CORZ:NASDAQ Filing | News 5,296 5296 0.025%
Coinbase Global, Inc. COIN:NADQ Filing | News 4,482 4482 0.021%
Bitcoin Group SE BTGGF:TCMKTS Filing | News 3,947 3947 0.019%
Hive Blockchain HIVE:NASDAQ Filing | News 2,832 2832 0.013%
Argo Blockchain PLC ARBKF:OTCMKTS Filing | News 2,685 2685 0.013%
NEXON Co. Ltd NEXOF:OTCMKTS Filing | News 1,717 1717 0.008%
Exodus Movement Inc :OTCMKTS Filing | News 1,300 1300 0.006%
Brooker Group's BROOK (BKK) BROOK:BKK Filing | News 1,150 1150 0.005%
Meitu HKD:HKG Filing | News 941 941 0.004%
Bit Digital, Inc. BTBT:NASDAQ Filing | News 832 832 0.004%
Digihost Technology Inc. HSSHF:OTCMKTS Filing | News 797 797 0.004%
BIGG Digital Assets Inc. BBKCF:OTCMKTS Filing | News 575 575 0.003%
DMG Blockchain Solutions Inc. DMGGF:OTCMKTS Filing | News 432 432 0.002%
CleanSpark Inc CLSK:NASDAQ Filing | News 420 420 0.002%
Cypherpunk Holdings Inc. HODL:OTCMKTS Filing | News 386 386 0.002%
Advanced Bitcoin Technologies AG ABT:DUS Filing | News 254 254 0.001%
DigitalX DGGXF:OTCMKTS Filing | News 216 216 0.001%
Neptune Digital Assets NPPTF:OTCMKTS Filing | News 194 194 0.001%
Cathedra Bitcoin Inc (Fortress Blockchain) CBIT:CVE Filing | News 169 169 0.001%
MercadoLibre, Inc. MELI:NADQ Filing | News 150 150 0.001%
LQwD FinTech Corp OTC:INLAF Filing | News 139 139 0.001%
Banxa Holdings Inc BNXAF:OTCMKTS Filing | News 136 136 0.001%
Phunware, Inc. PHUN:NASDAQ Filing | News 127 127 0.001%
BTCS Inc. BTCS:OTCMKTS Filing | News 90 90 0.0%
FRMO Corp. FRMO:OTCMKTS Filing | News 63 63 0.0%
Canada Computational Unlimited Corp. SATO:TSXV Filing | News 37 37 0.0%
Metromile MILE:NASDAQ Filing | News 25 25 0.0%
MOGO Financing MOGO:Nasdaq Filing | News 18 18 0.0%
Net Holding Anonim Sirketi NTHOL TI:IST Filing | News 3 3 0.0%
Totals: 266019 266019 1.267%
Anyway, just add [1] to get the second element of the list, as there are two tables with the same class name: the first at index 0 and the second at index 1. For Selenium, use
driver.find_elements(By.CLASS_NAME, value="treasuries-table")[1]
Hope it helps.
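The indexing idea above can be illustrated without any third-party library. This is only a sketch using the standard library's html.parser on a made-up HTML snippet (the sample markup, the "summary" and "sortable" class names, and the TableClassCollector helper are assumptions, not the real page): it records the class list of every table in document order, then picks the second one carrying the target class.

```python
from html.parser import HTMLParser


class TableClassCollector(HTMLParser):
    """Records the class list of every <table> start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.table_classes = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # A missing class attribute is treated as an empty class list.
            class_attr = dict(attrs).get("class") or ""
            self.table_classes.append(class_attr.split())


# Made-up markup standing in for a page with several tables.
SAMPLE_HTML = """
<table class="summary"><tr><td>not the target</td></tr></table>
<table class="treasuries-table"><tr><td>first match</td></tr></table>
<table class="treasuries-table sortable"><tr><td>second match</td></tr></table>
"""

parser = TableClassCollector()
parser.feed(SAMPLE_HTML)

# Positions (among all tables) of the tables whose class list has the token.
matches = [
    index
    for index, classes in enumerate(parser.table_classes)
    if "treasuries-table" in classes
]
print(matches)     # every matching table
print(matches[1])  # the second match, analogous to [1] in the answer
```

When several elements share a class, a class-based lookup returns all of them, so choosing the wanted table by position (or by a more specific attribute, if one exists) is still needed.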
CodePudding user response:
Do you need to use XPath here? Or even Selenium, for that matter? I'd also consider using pandas
to parse the <table>
tags. This returns the 4 tables with the treasuries-table
class. I'm just printing out the first one here.
import pandas as pd
import requests
response = requests.get('https://www.buybitcoinworldwide.com/treasuries/')
df = pd.read_html(response.text, attrs={'class':'treasuries-table'})[0]
Output:
print(df)
Entity Country ... Value Today % of 21m
0 NaN NaN ... 266019 1.267%
1 MicroStrategy NaN ... 129218 0.615%
2 Tesla, Inc NaN ... 42902 0.204%
3 Galaxy Digital Holdings NaN ... 16400 0.078%
4 Voyager Digital LTD NaN ... 12260 0.058%
5 Marathon Digital Holdings Inc NaN ... 9373 0.045%
6 Square Inc. NaN ... 8027 0.038%
7 Hut 8 Mining Corp NaN ... 6460 0.031%
8 Riot Blockchain, Inc. NaN ... 6320 0.03%
9 Bitfarms Limited NaN ... 5646 0.027%
10 Core Scientific NaN ... 5296 0.025%
11 Coinbase Global, Inc. NaN ... 4482 0.021%
12 Bitcoin Group SE NaN ... 3947 0.019%
13 Hive Blockchain NaN ... 2832 0.013%
14 Argo Blockchain PLC NaN ... 2685 0.013%
15 NEXON Co. Ltd NaN ... 1717 0.008%
16 Exodus Movement Inc NaN ... 1300 0.006%
17 Brooker Group's BROOK (BKK) NaN ... 1150 0.005%
18 Meitu NaN ... 941 0.004%
19 Bit Digital, Inc. NaN ... 832 0.004%
20 Digihost Technology Inc. NaN ... 797 0.004%
21 BIGG Digital Assets Inc. NaN ... 575 0.003%
22 DMG Blockchain Solutions Inc. NaN ... 432 0.002%
23 CleanSpark Inc NaN ... 420 0.002%
24 Cypherpunk Holdings Inc. NaN ... 386 0.002%
25 Advanced Bitcoin Technologies AG NaN ... 254 0.001%
26 DigitalX NaN ... 216 0.001%
27 Neptune Digital Assets NaN ... 194 0.001%
28 Cathedra Bitcoin Inc (Fortress Blockchain) NaN ... 169 0.001%
29 MercadoLibre, Inc. NaN ... 150 0.001%
30 LQwD FinTech Corp NaN ... 139 0.001%
31 Banxa Holdings Inc NaN ... 136 0.001%
32 Phunware, Inc. NaN ... 127 0.001%
33 BTCS Inc. NaN ... 90 0.0%
34 FRMO Corp. NaN ... 63 0.0%
35 Canada Computational Unlimited Corp. NaN ... 37 0.0%
36 Metromile NaN ... 25 0.0%
37 MOGO Financing NaN ... 18 0.0%
38 Net Holding Anonim Sirketi NaN ... 3 0.0%
[39 rows x 7 columns]
CodePudding user response:
Have you tried this?
table[class*="treasuries-table"]
or
table[class*=" treasuries-table "]