Can't scrape text with cheerio

Time:07-02

I'm trying to scrape this page with cheerio.

Here's my code:

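const express = require("express");
const axios = require("axios");
const cheerio = require("cheerio");

const app = express();

app.get("/conjugation", (req, res) => {
  // Fetch the raw HTML; no in-page JavaScript runs at this point.
  axios(
    "https://en.dict.naver.com/#/search?query=추워요&range=all"
  )
    .then((response) => {
      const htmlData = response.data;
      // Parse the static markup and look up the entry title.
      const $ = cheerio.load(htmlData);
      const element = $(
        "#searchPage_entry > h3 > span.title_text.myScrollNavQuick.my_searchPage"
      );
      console.log(element.text());
    })
    .catch((err) => console.log(err));
});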

CodePudding user response:

The server at that URL doesn't return any body DOM structure in the HTML response. The body DOM is rendered by linked JavaScript after the response is received. Cheerio doesn't execute the JavaScript in the HTML response, so it won't be possible to scrape that page using Cheerio alone. Instead, you'll need to use another method that can execute the in-page JavaScript (e.g. Puppeteer).
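
As a rough illustration of the Puppeteer route, the sketch below loads the page in a headless browser, waits for a selector to appear, and reads its text. The function name `scrapeRenderedText`, the 15-second timeout, and the injectable `browserLib` parameter are assumptions made for this example, and whether the selector from the question still matches the live page has not been verified.

```javascript
// Sketch only: assumes `npm install puppeteer`. The function name, the
// timeout, and the injectable `browserLib` parameter are illustrative.
async function scrapeRenderedText(url, selector, browserLib = require("puppeteer")) {
  const browser = await browserLib.launch();
  try {
    const page = await browser.newPage();
    // Load the page; its scripts run as they would in a normal browser.
    await page.goto(url, { waitUntil: "domcontentloaded" });
    // Wait until client-side rendering has inserted the target element.
    await page.waitForSelector(selector, { timeout: 15000 });
    // Read the element's text inside the page context.
    return await page.$eval(selector, (el) => el.textContent.trim());
  } finally {
    // Always release the browser, even if the selector never appears.
    await browser.close();
  }
}
```

In the route handler, a call such as `scrapeRenderedText(url, selector).then((text) => console.log(text))` would take the place of the axios and cheerio calls. Launching a browser per request is slow, so a real service would probably reuse one browser instance.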

CodePudding user response:

This is a common issue in web scraping: the page loads dynamically. When you fetch the initial GET response from that website, all you get is script tags; print htmlData and you'll see what I mean. There are no rendered HTML elements in your response. What you'll have to do is use something like Selenium to wait until the elements you need have been rendered.
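
One quick way to confirm this before reaching for a browser driver is to inspect the raw response. The helper below is a crude, string-based sketch (the name `inspectResponse` and the sample `shell` markup are made up for illustration): it reports whether an element id appears in the HTML at all and how many script tags the response contains.

```javascript
// Crude diagnostic: plain string matching, not a real HTML parser.
// `inspectResponse` and the `shell` sample are hypothetical.
function inspectResponse(html, targetId) {
  return {
    // True only if the id is present in the static markup.
    hasTarget: html.includes(`id="${targetId}"`),
    // Number of opening <script> tags in the response.
    scriptTags: (html.match(/<script\b/gi) || []).length,
  };
}

// A made-up app shell of the kind a client-rendered site returns.
const shell =
  '<html><body><div id="root"></div><script src="/app.js"></script></body></html>';
console.log(inspectResponse(shell, "searchPage_entry"));
// For this sample, hasTarget is false and scriptTags is 1.
```

Calling `inspectResponse(htmlData, "searchPage_entry")` inside the `.then` callback from the question would show whether the element is missing from the fetched HTML rather than merely mis-selected.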
