Don't use Parser span

Time:04-20

I want to get the "discount" data from this URL: https://www.lazada.sg/puma-singapore/?q=All-Products&from=wangpu&langFlag=en&pageTypeId=2

But my script does not return it.

function myFunction() {
  const url = 'https://www.lazada.sg/puma-singapore/?q=All-Products&from=wangpu&langFlag=en&pageTypeId=2';

  // Parse the data and strip the "%" and "-" characters.
  const discount = getData(url).replace("%", "").replace(/-/g, "");
  Logger.log(discount);
}

// Fetch the page and extract the text between the span tags with the Parser library.
function getData(url) {
  const fromText = '<span  data-spm-anchor-id="a2o42.seller.list.i41.62ff63deVng91O">';
  const toText = '</span>';
  const content = UrlFetchApp.fetch(url).getContentText();
  const scraped = Parser
    .data(content)
    .setLog()
    .from(fromText)
    .to(toText)
    .build();
  return scraped;
}

Answer:

Looking at the HTML of that URL, the values appear to be inserted by JavaScript, which would explain why the span is not found in the fetched content. Fortunately, the values are also included in the HTML as JSON data. So, in this answer, I propose retrieving the value by parsing that JSON. The sample script is as follows.

Sample script:

Please set the name of the item whose discount you want to retrieve.

function myFunction() {
  const itemName = "PUMA Unisex Deck Backpack II"; // Please set the item name.

  const url = 'https://www.lazada.sg/puma-singapore/?q=All-Products&from=wangpu&langFlag=en&pageTypeId=2'
  const content = UrlFetchApp.fetch(url).getContentText();
  const str = content.match(/window.pageData =([\w\s\S]+?});/);
  if (!str || str.length < 1) return;
  const obj = JSON.parse(str[1]);
  const items = obj.mods.listItems.filter(({ name }) => name == itemName);
  if (items.length == 0) return;
  const res = items.map(({ discount }) => discount);
  console.log(res)
}

Testing:

  • When this script is run, [ '-34%', '-34%' ] is obtained. There are 2 items named PUMA Unisex Deck Backpack II, so the result has 2 values.

Note:

  • At the time of writing, I have confirmed that this script works. However, if the structure of the HTML changes in the future, the script might stop working. Please be careful about this.
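
Since the page structure may change, the extraction step can be made to fail quietly instead of throwing. The following is a minimal sketch of that idea, not part of the original answer: extractPageData is a hypothetical helper name, and sampleHtml is a made-up stand-in for the fetched page (the real payload is much larger). It uses only plain JavaScript, so it should also run in the Apps Script V8 runtime.

```javascript
// Hypothetical helper: pull the JSON assigned to window.pageData out of an HTML string.
// Returns the parsed object, or null if the marker is missing or the JSON is invalid.
function extractPageData(html) {
  // Lazy match from "window.pageData =" up to the first "};".
  const match = html.match(/window\.pageData\s*=\s*([\s\S]+?\});/);
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch (e) {
    // The captured text was not valid JSON (for example, the layout changed).
    return null;
  }
}

// Made-up sample standing in for UrlFetchApp.fetch(url).getContentText().
const sampleHtml =
  '<script>window.pageData = {"mods":{"listItems":[' +
  '{"name":"Item A","discount":"-34%"},{"name":"Item B"}]}};</script>';

const pageData = extractPageData(sampleHtml);
// Fall back to an empty list if any level of the expected structure is missing.
const listItems = (pageData && pageData.mods && pageData.mods.listItems) || [];
console.log(listItems.map(({ name, discount }) => [name, discount]));
```

In the scripts in this answer, the same helper could be called on content in place of the match and JSON.parse lines, with a null result treated as "structure changed".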

Added:

About the additional request in your comment, "Thanks for the support. However, I want to get all the data of all products. Is there any way?", I asked whether my understanding was correct that you want the discount values of all items. You replied:

"yes, i got all data of discount, but i got value"undefined", i want get value form "46" or "56" and setValues on MySheet."

In this case, the sample script is as follows.

Sample script:

function myFunction() {
  const url = 'https://www.lazada.sg/puma-singapore/?q=All-Products&from=wangpu&langFlag=en&pageTypeId=2'
  const content = UrlFetchApp.fetch(url).getContentText();
  const str = content.match(/window.pageData =([\w\s\S]+?});/);
  if (!str || str.length < 1) return;
  const obj = JSON.parse(str[1]);
  const items = obj.mods.listItems;
  if (items.length == 0) return;
  const res = items.map(({ discount }) => [-parseInt(discount, 10) || 0]);

  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1"); // Please set the sheet name.
  sheet.getRange(1, 1, res.length, res[0].length).setValues(res);
}

  • It seems that when the value of discount is undefined, the product is not discounted. So in this case, the script uses 0.

  • To retrieve a value such as -46% as 46, I used -parseInt(discount, 10) || 0.

  • When this script is run, the retrieved values are put into column "A" of "Sheet1". Unfortunately, from "i want get value form "46" or "56" and setValues on MySheet", I couldn't work out your actual goal. So, please modify this script to suit it.

  • If you want to retrieve both the product name and the discount value, please modify the script as follows.

    • From

        const res = items.map(({ discount }) => [-parseInt(discount, 10) || 0]);
      
    • To

        const res = items.map(({ name, discount }) => [name, -parseInt(discount, 10) || 0]);
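
To illustrate the conversion described above on its own, here is a small sketch that runs without fetching the page. It is a variant of the expression in the script, not the same code: discountToNumber is a hypothetical helper, it uses Math.abs rather than negation, and sampleItems is made-up data standing in for obj.mods.listItems.

```javascript
// Hypothetical helper: turn a discount label such as "-46%" into the number 46.
// Missing or non-numeric labels become 0 (assumed to mean "not discounted").
function discountToNumber(label) {
  const n = parseInt(label, 10); // "-46%" -> -46, undefined -> NaN
  return Number.isNaN(n) ? 0 : Math.abs(n);
}

// Made-up items standing in for obj.mods.listItems.
const sampleItems = [
  { name: "Item A", discount: "-46%" },
  { name: "Item B" }, // no discount property
  { name: "Item C", discount: "-5%" },
];

// One row per item, in the 2D array shape that setValues expects.
const rows = sampleItems.map(({ name, discount }) => [name, discountToNumber(discount)]);
console.log(rows);
```

The resulting rows array has one [name, discount] pair per item, so it can be passed to sheet.getRange(1, 1, rows.length, rows[0].length).setValues(rows) in the same way as res in the script.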
      