Scraping text in meta tag with Selenium

I'm trying to get the book description from the following webpage: https://bookshop.org/books/lucky-9798200961177/9781668002452

This is what I've got so far:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")

driver = webdriver.Chrome('path_to_my_driver_on_local', options=options)
driver.get('https://bookshop.org/a/16709/9781668002452')
description = driver.find_elements_by_xpath('//meta[@content]')[0].text
description

Basically, I'm trying to get the text inside the content attribute of this HTML:


<meta name="description" content="REESE'S BOOK CLUB PICK NEW YORK TIMES BESTSELLER A thrilling roller-coaster ride about a heist gone terribly wrong, with a plucky protagonist who will win readers' hearts. What if you had the winning ticket ....">

but I can't locate the text in the content attribute. Can anyone advise how to read it?

CodePudding user response:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup

# Run Chrome without opening a window
options = Options()
options.add_argument("--headless")

driver = webdriver.Chrome('path_to_my_driver_on_local', options=options)

driver.get('https://bookshop.org/books/lucky-9798200961177/9781668002452')
html = driver.page_source
driver.quit()

# Parse the rendered page and read the visible description block
soup = BeautifulSoup(html, 'lxml')
description = soup.find_all('div', class_="title-description")
print(description[0].text)
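
Note that the div lookup above returns the visible description rather than the meta tag from the question. If you'd rather pull the meta tag's content out of the same page source, here is a minimal sketch. It swaps BeautifulSoup for the standard library's html.parser so it runs without extra packages. MetaDescriptionParser and extract_meta_description are hypothetical names, and the sample string is a shortened stand-in for driver.page_source.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Remembers the content attribute of the first <meta name="description">."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute names arrive lowercased
        if tag != "meta" or self.description is not None:
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "description":
            self.description = attr_map.get("content")


def extract_meta_description(html):
    """Return the meta description text, or None if the page has no such tag."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    return parser.description


# Shortened stand-in for driver.page_source
sample = (
    '<html><head><meta name="description" '
    'content="A thrilling roller-coaster ride about a heist gone terribly wrong.">'
    '</head><body></body></html>'
)
print(extract_meta_description(sample))
```

With Selenium you would pass the real page instead of the sample: extract_meta_description(driver.page_source).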

CodePudding user response:

elem = driver.find_element(By.XPATH, "//meta[@name='description']")
print(elem.get_attribute("content"))

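Use a more specific XPath that matches the meta tag by its name, then read its content attribute with get_attribute. Your original XPath, //meta[@content], matches every meta tag that has a content attribute, so [0] is not necessarily the description. Also, .text only returns an element's visible text, and a meta tag has none, so it comes back empty.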

Imports:

from selenium.webdriver.common.by import By
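
If the tag might be missing on some pages, find_element raises NoSuchElementException. A minimal sketch of a helper that returns None instead is below. get_meta_content is a hypothetical name, and it assumes driver has already loaded the page. It passes the locator strategy as the plain string "xpath", which is the value of By.XPATH, so the helper itself needs no import.

```python
def get_meta_content(driver, name="description"):
    """Return the content attribute of <meta name=...>, or None if the tag is absent.

    driver is any Selenium WebDriver that has already loaded the page.
    Assumes name contains no single quote, since it is pasted into the XPath.
    """
    # "xpath" is the string value of Selenium's By.XPATH constant.
    # find_elements returns an empty list instead of raising when nothing matches.
    matches = driver.find_elements("xpath", f"//meta[@name='{name}']")
    if not matches:
        return None
    return matches[0].get_attribute("content")
```

After driver.get(...), calling get_meta_content(driver) returns the description text, or None when the page has no such tag.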