How can I extract the HTML from TraderWagon? There are a lot of similar questions on SO and I've tried numerous ways to solve this, but none have worked.
Attempt #1 - requests + BeautifulSoup
There should be <table> tags in the doc object, but there are none.
from bs4 import BeautifulSoup
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
url = "https://www.traderwagon.com/en/portfolio/156"
result = requests.get(url, headers=headers)
doc = BeautifulSoup(result.text, "html.parser")
tableRows = doc.findAll(text="USDT")
print(tableRows)
Attempt #2 - Selenium
import time
from bs4 import BeautifulSoup
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
url = "https://www.traderwagon.com/en/portfolio/156"
#options = Options()
#options.add_argument('--headless')
#options.add_argument('--disable-gpu')
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get(url)
time.sleep(3)
page = driver.page_source
driver.quit()
soup = BeautifulSoup(page, 'lxml')
container = soup.find_all('USDT')
print(container)
CodePudding user response:
The page is rendered entirely by JavaScript, so the HTML that requests downloads does not contain the table. Either use the site's API to extract the data, or use Selenium to grab the rendered content. You are getting an empty ResultSet because the element selection is incorrect: soup.find_all('USDT') searches for tags named USDT, not for text. The following example, Selenium with bs4, works and produces the expected output.
import time
from bs4 import BeautifulSoup
from selenium import webdriver
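from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
url = "https://www.traderwagon.com/en/portfolio/156"
# Selenium 4 takes the driver path through a Service object rather than a positional argument
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))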
driver.get(url)
driver.maximize_window()
time.sleep(3)
page = driver.page_source
soup = BeautifulSoup(page, 'lxml')
for trs in soup.table.find_all('tr'):
    tr = list(trs.stripped_strings)
    print(tr)
#driver.quit()
Output:
['Symbol', 'Leverage | Direction', 'Size', 'Entry Price', 'Mark Price', 'Margin', 'PNL (ROE %)']
[]
['WAVESUSDT', '10X |', 'Short', '-79,639.86 USDT', '24.89 USDT', '4.00 USDT', '1,279.36 (Cross)', '66,846.27 USDT (5224.98%)']
['HNTUSDT', '20X |', 'Short', '-8,003.19 USDT', '11.43 USDT', '4.60 USDT', '798.66 (Cross)', '4,780.05 USDT (598.51%)']
['ETCUSDT', '20X |', 'Short', '-4,703.56 USDT', '47.04 USDT', '28.72 USDT', '143.62 (Cross)', '1,831.07 USDT (1274.90%)']
['TRXUSDT', '20X |', 'Short', '-31,684.99 USDT', '0.09 USDT', '0.06 USDT', '1,631.48 (Cross)', '9,405.30 USDT (576.49%)']
['STORJUSDT', '20X |', 'Short', '-2,760.00 USDT', '0.92 USDT', '0.47 USDT', '204.05 (Cross)', '1,343.98 USDT (658.65%)']
['SNXUSDT', '20X |', 'Short', '-11,961.99 USDT', '3.62 USDT', '2.49 USDT', '410.23 (Cross)', '3,757.41 USDT (915.93%)']
['KAVAUSDT', '20X |', 'Short', '-3,552.00 USDT', '2.96 USDT', '1.55 USDT', '93.10 (Cross)', '1,689.99 USDT (1815.24%)']
['RUNEUSDT', '5X |', 'Short', '-30,462.80 USDT', '8.23 USDT', '1.66 USDT', '1,228.50 (Cross)', '24,320.29 USDT (1979.67%)']
['APEUSDT', '20X |', 'Short', '-4,900.00 USDT', '24.50 USDT', '5.57 USDT', '55.72 (Cross)', '3,785.60 USDT (6793.97%)']
['BNXUSDT', '20X |', 'Short', '-67,123.04 USDT', '147.52 USDT', '158.59 USDT', '4,336.42 (Cross)', '-5,035.41 USDT (-116.12%)']
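To see why the two selections in the question come back empty even when the table is present, here is a minimal sketch that runs BeautifulSoup on a small static snippet. The markup is hypothetical, loosely modelled on the rows printed above, and the real page's structure may differ. It uses the built-in html.parser so that lxml is not required.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on the rows printed above;
# the real page's structure may differ.
html = """
<table>
  <tr><th>Symbol</th><th>Size</th></tr>
  <tr><td>WAVESUSDT</td><td>-79,639.86 USDT</td></tr>
  <tr><td>HNTUSDT</td><td>-8,003.19 USDT</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Attempt #2's call: the first argument is a tag name,
# so this looks for <USDT> elements and finds none.
by_tag = soup.find_all("USDT")

# Attempt #1's call: a plain string only matches text nodes that are
# exactly "USDT" (`text=` is the older spelling of `string=`).
exact = soup.find_all(string="USDT")

# A compiled regex matches any text node that contains "USDT".
containing = soup.find_all(string=re.compile("USDT"))

# Walking the rows, as the answer does, keeps each row's cells together.
rows = [list(tr.stripped_strings) for tr in soup.find_all("tr")]

print(by_tag)
print(exact)
print(len(containing))
print(rows)
```

Neither of the question's calls would match the cells in this snippet, while the regex search and the row walk both would. None of this helps with the requests attempt, though, since the table is not in the downloaded HTML in the first place.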