Hi, how can I get the name of a dataset on Kaggle using Beautiful Soup, Selenium, or Scrapy? I tested this code, but it returns nothing:
from bs4 import BeautifulSoup
import requests
url = 'https://www.kaggle.com/heptapod/titanic'
res = requests.get(url)
html_page = res.content
soup = BeautifulSoup(html_page, 'html.parser')
datasetName = soup.find('h5',{'class':'sc-dIvrsQ sc-hHEiqL sc-kaPsuu kSVYRu ccTnQh ffXPrd'})
print(datasetName)
See the picture: inspect element from Kaggle.
CodePudding user response:
Using Selenium
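import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
# Run Chrome headless so no browser window opens
opt = Options()
opt.add_argument('--headless')
# Note: Selenium 4 deprecates executable_path in favour of a Service object
driver = webdriver.Chrome(executable_path = 'yourdriverpath', options=opt)
driver.get("https://www.kaggle.com/heptapod/titanic")
# Give the page's JavaScript time to render the content
time.sleep(5)
datasetname = driver.find_element(By.XPATH, "//div[@role='button']//div//div").text
print(datasetname)
driver.quit()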
Output:
train_and_test2.csv
Process finished with exit code 0
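If you would rather stay with requests, one thing that may be worth trying is reading the page title from the head of the HTML instead of a generated class name like sc-dIvrsQ, since those class names can change and the body is filled in by JavaScript. The following is only a minimal sketch using the standard library's html.parser. It assumes the response contains a <title> tag or an og:title meta tag, which I have not verified for Kaggle. SAMPLE_HTML, TitleParser and extract_page_title are hypothetical names made up for illustration, and the sample markup is a stand-in, not Kaggle's real page.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the <title> text and the og:title meta content, if present."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.og_title = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and attrs.get("property") == "og:title":
            self.og_title = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only keep text that appears between <title> and </title>
        if self.in_title:
            self.title += data


def extract_page_title(html):
    """Return og:title if available, otherwise the <title> text, otherwise None."""
    parser = TitleParser()
    parser.feed(html)
    return parser.og_title or parser.title.strip() or None


# Hypothetical stand-in for res.text; this is NOT Kaggle's real markup.
SAMPLE_HTML = """
<html><head>
<title>Titanic | Kaggle</title>
<meta property="og:title" content="Titanic">
</head><body><div id="site-content"></div></body></html>
"""

print(extract_page_title(SAMPLE_HTML))
```

In your script you would pass res.text to extract_page_title instead of SAMPLE_HTML. If it returns None, the title is probably only added by JavaScript and a browser-based approach such as the Selenium one above is needed.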