Improving loops and tasks to scrape multiple paths whose values I need to match

Time:06-21

The page elements are in this pattern:

<ul >
 <li >
   <span >
       <a href="/players/maximilian-mittelstadt/359295/">
         M. Mittelstädt
       </a>

       

         <span >
           3'
         </span>

       

         <span >
           (assist by
           <a href="/players/salomon-kalou/2540/">S. Kalou</a>)
         </span>

   </span>

   <span >1 - 0</span>

   <span >
  </span>
   <div ></div>
 </li>
 <li >
   <span >
   </span>

   <span >1 - 1</span>

   <span >

         <span >
           7'
         </span>


       <a href="/players/serge--gnabry/213651/">
         S. Gnabry
       </a>

       

  </span>
   <div ></div>
 </li>
 <li >
   <span >
   </span>

   <span >1 - 2</span>

   <span >

         <span >
           49'
         </span>


       <a href="/players/serge--gnabry/213651/">
         S. Gnabry
       </a>

       

         <span >
           (assist by
           <a href="/players/james-david-rodriguez/72408/">J. Rodríguez</a>)
         </span>
  </span>
   <div ></div>
 </li>
 <li >
   <span >
       <a href="/players/davie-selke/213931/">
         D. Selke
       </a>

       

         <span >
           67'
         </span>

       


   </span>

   <span >2 - 2</span>

   <span >
  </span>
   <div ></div>
 </li>
 <li >
   <span >
   </span>

   <span >2 - 3</span>

   <span >

         <span >
           98'
         </span>


       <a href="/players/kingsley-coman/265385/">
         K. Coman
       </a>

       

         <span >
           (assist by
           <a href="/players/robert-lewandowski/41310/">R. Lewandowski</a>)
         </span>
  </span>
   <div ></div>
 </li>
</ul>

The values collected from the paths I need are:

span.minute span.score
3' 1 - 0
7' 1 - 1
49' 1 - 2
67' 2 - 2
98' 2 - 3

Each span.minute has a corresponding span.score, and I want to find the last span.score whose span.minute is less than or equal to 90. In this example, the last one is 67' with 2 - 2.
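
The rule can be sketched in plain JavaScript, independent of Cheerio. This is only an illustration built on the five minute/score pairs listed above; `lastScoreUpTo` and the `goals` array are hypothetical names, and the sketch assumes the goals appear in chronological order, as they do on the page.

```javascript
// Hypothetical data: the minute/score pairs from the example above.
const goals = [
  ["3'", '1 - 0'],
  ["7'", '1 - 1'],
  ["49'", '1 - 2'],
  ["67'", '2 - 2'],
  ["98'", '2 - 3'],
];

// Return the score of the last goal whose minute is <= limit,
// or the fallback when no goal qualifies.
// Assumes the goals are in chronological order.
function lastScoreUpTo(goals, limit, fallback) {
  let result = fallback;
  for (const [minuteText, scoreText] of goals) {
    const minute = parseInt(minuteText, 10); // "67'" -> 67
    if (minute <= limit) {
      result = scoreText.replace(/ /g, ''); // "2 - 2" -> "2-2"
    }
  }
  return result;
}

console.log(lastScoreUpTo(goals, 90, '0-0')); // 2-2
```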

My Code:

function score() {
  var ss = SpreadsheetApp.getActive().getSheetByName('copy');
  var response = UrlFetchApp.fetch('URL URL URL URL', {muteHttpExceptions: true});
  if (response.getResponseCode() == 404) {
  } else {
    var contentText = response.getContentText();
    var $ = Cheerio.load(contentText);
    
    var list_minutes = [];
    var list_score = [];

    var minute_goal = $('ul.scorer-info > li > span.scorer > span.minute');
    var score_goal = $('ul.scorer-info > li > span.score');

    minute_goal.each((index, element) => {list_minutes.push([($(element).text().trim()).substring(0, ($(element).text().trim()).indexOf("'"))]);});
    score_goal.each((index, element) => {list_score.push([($(element).text().trim()).replace(/ /g,'')]);});

    var before_90 = '0-0';

    var i=0;
    var max = list_minutes.length
    for(i; i<max; i++){
      if (list_minutes[i][0] <= 90) {
        before_90 = list_score[i][0];
      }
    }
    Logger.log(before_90)
  }
}

Output Logger.log:

2-2

As you can see, this approach requires building several lists before a final loop can find the last score at or before the 90th minute.

Is there a better way to reduce the number of steps, ideally with a single loop over the paths, so that the code is shorter and runs faster?

CodePudding user response:

In order to retrieve your expected value, how about the following modification?

From:

var list_minutes = [];
var list_score = [];

var minute_goal = $('ul.scorer-info > li > span.scorer > span.minute');
var score_goal = $('ul.scorer-info > li > span.score');

minute_goal.each((index, element) => {list_minutes.push([($(element).text().trim()).substring(0, ($(element).text().trim()).indexOf("'"))]);});
score_goal.each((index, element) => {list_score.push([($(element).text().trim()).replace(/ /g,'')]);});

var before_90 = '0-0';

var i=0;
var max = list_minutes.length
for(i; i<max; i++){
  if (list_minutes[i][0] <= 90) {
    before_90 = list_score[i][0];
  }
}
Logger.log(before_90)

To:

var minute_goal = $('ul.scorer-info > li > span.scorer > span.minute').toArray();
var score_goal = $('ul.scorer-info > li > span.score').toArray();
var res = minute_goal.reduce((ar, e, i) => {
  var n = parseInt($(e).text(), 10);
  if (n <= 90) ar.push([n, $(score_goal[i]).text().trim().replace(/ /g, '')]);
  return ar;
}, []).pop();
console.log(res) // [ 67, '2-2' ]
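  • When this modified script is run against the HTML shown in your question, the log shows [ 67, '2-2' ]. If no goal was scored by the 90th minute, res is undefined, so add a fallback such as '0-0' if you need one.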
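
Since only the last goal at or before the 90th minute matters, another option is to scan from the end and stop at the first match, which avoids building an intermediate array. The sketch below is plain JavaScript working on arrays of text; `findScoreBefore` is a hypothetical helper, not part of Cheerio or Apps Script, and it assumes the two arrays are index-aligned and in chronological order. In the script above, the inputs could be produced by mapping `$(e).text()` over `minute_goal` and `score_goal`.

```javascript
// Hypothetical helper: scan from the last goal backwards and return
// [minute, score] for the first goal found at or before `limit`.
// Assumes minuteTexts[i] and scoreTexts[i] describe the same goal.
function findScoreBefore(minuteTexts, scoreTexts, limit) {
  for (let i = minuteTexts.length - 1; i >= 0; i--) {
    const minute = parseInt(minuteTexts[i], 10); // leading whitespace is skipped
    if (minute <= limit) {
      return [minute, scoreTexts[i].trim().replace(/ /g, '')];
    }
  }
  return null; // no goal at or before the limit
}

// Text roughly as it might come out of the example HTML (whitespace included).
const minuteTexts = ["\n  3'\n", "\n  7'\n", "\n  49'\n", "\n  67'\n", "\n  98'\n"];
const scoreTexts = ['1 - 0', '1 - 1', '1 - 2', '2 - 2', '2 - 3'];

const found = findScoreBefore(minuteTexts, scoreTexts, 90);
console.log(found); // [ 67, '2-2' ]
```

Returning null instead of undefined makes the "no goal in regular time" case explicit, so the caller can substitute '0-0' where needed.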