Error trying to get page title in Google Sheets

Can you help me with the correct XPath to select the text of the title "Quotes to Scrape" contained in the <a> tag inside an <h1> tag on the following webpage?

I need to use this XPath in the IMPORTXML function in Google Sheets, but I'm not sure whether the XPath is correct.

=IMPORTXML("https://quotes.toscrape.com/","//div[@class='col-md-8']/h1/a")

I get an error in Google Sheets. I expect to get the text inside /h1/a.

CodePudding user response：

Got it

=IMPORTXML("http://quotes.toscrape.com/";"//div[@class='col-md-8']/h1/a")

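The fix was to use a semicolon instead of a comma as the argument separator. Which separator Google Sheets expects depends on the spreadsheet's locale.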

CodePudding user response：

I'm writing this answer as a community wiki since the solution was provided by @margusl in the comments section, for visibility to the community.

The issue was a typo in the formula. The formula that caused the error was:

=IMPORTHTML("https://quotes.toscrape.com/","//div[@class='col-md-8']/h1/a")

However, IMPORTHTML doesn't take an XPath. Its second argument is a query, which must be either "list" or "table".

So to fix the issue, you need to fix the typo and use:

=IMPORTXML("http://quotes.toscrape.com/","//div[@class='col-md-8']/h1/a")

Alternatively, you can use the absolute path:

=IMPORTXML("http://quotes.toscrape.com/","/html/body/div/div[1]/div[1]/h1/a")
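If you want to sanity-check the two XPath expressions outside of Sheets, here is a rough Python sketch. It does not fetch the site: SAMPLE is a simplified, hypothetical stand-in for the page's markup, and the variable names are mine. The standard-library ElementTree only supports a subset of XPath and evaluates paths relative to the root element, so both expressions are written in relative form here.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for the page markup (an assumption,
# not the real page source; nothing is downloaded here).
SAMPLE = """
<html>
  <body>
    <div class="container">
      <div class="row header-box">
        <div class="col-md-8">
          <h1><a href="/">Quotes to Scrape</a></h1>
        </div>
        <div class="col-md-4"><p>Login</p></div>
      </div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE)  # root is the <html> element

# Relative form of //div[@class='col-md-8']/h1/a
by_class = root.findall(".//div[@class='col-md-8']/h1/a")

# Relative form of /html/body/div/div[1]/div[1]/h1/a
# (the leading /html is dropped because root already is <html>)
by_position = root.findall("./body/div/div[1]/div[1]/h1/a")

titles_by_class = [a.text for a in by_class]
titles_by_position = [a.text for a in by_position]
print(titles_by_class, titles_by_position)
```

On this sample both lookups land on the same <a> element. The class-based path is usually the less brittle of the two, since the positional path stops matching as soon as a wrapper div is added or reordered.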