How to iterate through a table using Node and cheerio

Time:11-19

All day today I've been working on a web scraper that pulls the date and the sunset time from sunrise-sunset.org.

I've managed to get it to scrape and display the first day and time in the table, but I can't figure out how to iterate through the table using Node, axios, and cheerio, since I'm pretty new to all three.

My JSON only has one object, but I'm trying to find out how to iterate through the table so that every table row becomes its own object. For example:


[
  {
   "day": "Tue, Nov 1",
   "time": "5:59:57 pm"
  },
  {
   "day": "Tue, Nov 2",
   "time": "5:55:37 pm"
  },
  {
   "day": "Tue, Nov 2",
   "time": "5:42:47 pm"
  }
]

This is my code so far that has resulted in pulling the first day and time from the website.


app.get('/results', function (req, res) {
        axios(url)
            .then(response => {
                const html = response.data
                const $ = cheerio.load(html)
                const days = []
    
                $('#month', html).each(function () {
                    const day = $('.number', '.day').prop('innerText')
                    const time = $('.sunset', '.day').prop('innerText')
                    days.push({
                        day,
                        time
                    })
                })
                res.json(days)
            })
            .catch(err => console.log(err))
    })


I then display the results in an empty div with this code


const feedDisplay = document.querySelector('#feed')

fetch('http://localhost:8000/results')
    .then(response => { return response.json() })
    // .then(data => console.log(data))
    .then(data => {
        data.forEach(daysaray => {
            const dayItem = `<div><h3>` + daysaray.day + `</h3><p>` + daysaray.time + `</p></div>`
            feedDisplay.insertAdjacentHTML("beforeend", dayItem)
        })
    })
    .catch(err => console.log(err))

EDIT
This is the HTML of the table I'm trying to scrape

<table id="month">
<col>
<col>
<col>
<col>
<col>
<col>
<col>
<col>
<col>
<col>
<col>
<col>
<tbody><tr >
<th rowspan="2">Day</th>
<th rowspan="2">Twilight start</th>
<th rowspan="2">Sunrise</th>
<th rowspan="2">Sunset</th>
<th rowspan="2">Twilight end</th>
<th rowspan="2">Day length</th>
<th rowspan="2">Solar noon</th>
<th colspan="2">Nautical twilight</th>
<th colspan="2">Astronomical twilight</th>
</tr>
<tr >
<th>Start</th>
<th>End</th>
<th>Start</th>
<th>End</th>
</tr>
<tr  rel="2022-11-01">
<th><span >Tue, Nov 1</span></th>
<td>7:19:52 am</td>
<td><span  title="Sunrise time 2022-11-01 in Portland, Multnomah County, Oregon, USA">7:49:10 am</span></td>
<td><span title="Sunset time 2022-11-01 in Portland, Multnomah County, Oregon, USA" >5:59:22 pm</span></td>
<td>6:28:40 pm</td>
<td>10:10:12</td>
<td>12:54:16 pm</td>
<td>6:44 am</td>
<td>7:03 pm</td>
<td>6:10 am</td>
<td>7:38 pm</td>
</tr>
<tr  rel="2022-11-02">
<th><span >Wed, Nov 2</span></th>
<td>7:21:11 am</td>
<td><span  title="Sunrise time 2022-11-02 in Portland, Multnomah County, Oregon, USA">7:50:33 am</span></td>
<td><span title="Sunset time 2022-11-02 in Portland, Multnomah County, Oregon, USA" >5:57:57 pm</span></td>
<td>6:27:19 pm</td>
<td>10:07:24</td>
<td>12:54:15 pm</td>
<td>6:46 am</td>
<td>7:02 pm</td>
<td>6:11 am</td>
<td>7:36 pm</td>
</tr>
<tr  rel="2022-11-03">
<th><span >Thu, Nov 3</span></th>
<td>7:22:30 am</td>
<td><span  title="Sunrise time 2022-11-03 in Portland, Multnomah County, Oregon, USA">7:51:57 am</span></td>
<td><span title="Sunset time 2022-11-03 in Portland, Multnomah County, Oregon, USA" >5:56:33 pm</span></td>
<td>6:26:00 pm</td>
<td>10:04:36</td>
<td>12:54:15 pm</td>
<td>6:47 am</td>
<td>7:01 pm</td>
<td>6:12 am</td>
<td>7:35 pm</td>
</tr>
<tr  rel="2022-11-04">
<th><span >Fri, Nov 4</span></th>
<td>7:23:49 am</td>
<td><span  title="Sunrise time 2022-11-04 in Portland, Multnomah County, Oregon, USA">7:53:21 am</span></td>
<td><span title="Sunset time 2022-11-04 in Portland, Multnomah County, Oregon, USA" >5:55:11 pm</span></td>
<td>6:24:42 pm</td>
<td>10:01:50</td>
<td>12:54:16 pm</td>
<td>6:48 am</td>
<td>6:59 pm</td>
<td>6:14 am</td>
<td>7:34 pm</td>
</tr>
<tr  rel="2022-11-05">
<th><span >Sat, Nov 5</span></th>
<td>7:25:08 am</td>
<td><span  title="Sunrise time 2022-11-05 in Portland, Multnomah County, Oregon, USA">7:54:45 am</span></td>
<td><span title="Sunset time 2022-11-05 in Portland, Multnomah County, Oregon, USA" >5:53:50 pm</span></td>
<td>6:23:27 pm</td>
<td>09:59:05</td>
<td>12:54:17 pm</td>
<td>6:49 am</td>
<td>6:58 pm</td>
<td>6:15 am</td>
<td>7:33 pm</td>
</tr>
<tr  rel="2022-11-06">
<th><span >Sun, Nov 6</span></th>
<td>6:26:27 am</td>
<td><span  title="Sunrise time 2022-11-06 in Portland, Multnomah County, Oregon, USA">6:56:09 am</span></td>
<td><span title="Sunset time 2022-11-06 in Portland, Multnomah County, Oregon, USA" >4:52:31 pm</span></td>
<td>5:22:12 pm</td>
<td>09:56:22</td>
<td>11:54:20 am</td>
<td>5:51 am</td>
<td>5:57 pm</td>
<td>5:16 am</td>
<td>6:32 pm</td>
</tr>
<tr  rel="2022-11-07">
<th><span >Mon, Nov 7</span></th>
<td>6:27:46 am</td>
<td><span  title="Sunrise time 2022-11-07 in Portland, Multnomah County, Oregon, USA">6:57:32 am</span></td>
<td><span title="Sunset time 2022-11-07 in Portland, Multnomah County, Oregon, USA" >4:51:14 pm</span></td>
<td>5:21:00 pm</td>
<td>09:53:42</td>
<td>11:54:23 am</td>
<td>5:52 am</td>
<td>5:56 pm</td>
<td>5:17 am</td>
<td>6:31 pm</td>
</tr>
<tr  rel="2022-11-08">
<th><span >Tue, Nov 8</span></th>
<td>6:29:05 am</td>
<td><span  title="Sunrise time 2022-11-08 in Portland, Multnomah County, Oregon, USA">6:58:56 am</span></td>
<td><span title="Sunset time 2022-11-08 in Portland, Multnomah County, Oregon, USA" >4:49:59 pm</span></td>
<td>5:19:49 pm</td>
<td>09:51:03</td>
<td>11:54:27 am</td>
<td>5:53 am</td>
<td>5:55 pm</td>
<td>5:18 am</td>
<td>6:29 pm</td>
</tr>
<tr  rel="2022-11-09">
<th><span >Wed, Nov 9</span></th>
<td>6:30:24 am</td>
<td><span  title="Sunrise time 2022-11-09 in Portland, Multnomah County, Oregon, USA">7:00:20 am</span></td>
<td><span title="Sunset time 2022-11-09 in Portland, Multnomah County, Oregon, USA" >4:48:45 pm</span></td>
<td>5:18:41 pm</td>
<td>09:48:25</td>
<td>11:54:32 am</td>
<td>5:54 am</td>
<td>5:54 pm</td>
<td>5:20 am</td>
<td>6:28 pm</td>
</tr>
<tr  rel="2022-11-10">
<th><span >Thu, Nov 10</span></th>
<td>6:31:42 am</td>
<td><span  title="Sunrise time 2022-11-10 in Portland, Multnomah County, Oregon, USA">7:01:43 am</span></td>
<td><span title="Sunset time 2022-11-10 in Portland, Multnomah County, Oregon, USA" >4:47:33 pm</span></td>
<td>5:17:34 pm</td>
<td>09:45:50</td>
<td>11:54:38 am</td>
<td>5:56 am</td>
<td>5:53 pm</td>
<td>5:21 am</td>
<td>6:27 pm</td>
</tr>
<tr  rel="2022-11-11">
<th><span >Fri, Nov 11</span></th>
<td>6:33:01 am</td>
<td><span  title="Sunrise time 2022-11-11 in Portland, Multnomah County, Oregon, USA">7:03:07 am</span></td>
<td><span title="Sunset time 2022-11-11 in Portland, Multnomah County, Oregon, USA" >4:46:23 pm</span></td>
<td>5:16:29 pm</td>
<td>09:43:16</td>
<td>11:54:45 am</td>
<td>5:57 am</td>
<td>5:52 pm</td>
<td>5:22 am</td>
<td>6:26 pm</td>
</tr>
<tr  rel="2022-11-12">
<th><span >Sat, Nov 12</span></th>
<td>6:34:19 am</td>
<td><span  title="Sunrise time 2022-11-12 in Portland, Multnomah County, Oregon, USA">7:04:30 am</span></td>
<td><span title="Sunset time 2022-11-12 in Portland, Multnomah County, Oregon, USA" >4:45:15 pm</span></td>
<td>5:15:26 pm</td>
<td>09:40:45</td>
<td>11:54:52 am</td>
<td>5:58 am</td>
<td>5:51 pm</td>
<td>5:23 am</td>
<td>6:25 pm</td>
</tr>
<tr  rel="2022-11-13">
<th><span >Sun, Nov 13</span></th>
<td>6:35:37 am</td>
<td><span  title="Sunrise time 2022-11-13 in Portland, Multnomah County, Oregon, USA">7:05:53 am</span></td>
<td><span title="Sunset time 2022-11-13 in Portland, Multnomah County, Oregon, USA" >4:44:09 pm</span></td>
<td>5:14:25 pm</td>
<td>09:38:16</td>
<td>11:55:01 am</td>
<td>5:59 am</td>
<td>5:50 pm</td>
<td>5:24 am</td>
<td>6:25 pm</td>
</tr>
<tr  rel="2022-11-14">
<th><span >Mon, Nov 14</span></th>
<td>6:36:54 am</td>
<td><span  title="Sunrise time 2022-11-14 in Portland, Multnomah County, Oregon, USA">7:07:15 am</span></td>
<td><span title="Sunset time 2022-11-14 in Portland, Multnomah County, Oregon, USA" >4:43:05 pm</span></td>
<td>5:13:26 pm</td>
<td>09:35:50</td>
<td>11:55:10 am</td>
<td>6:01 am</td>
<td>5:49 pm</td>
<td>5:26 am</td>
<td>6:24 pm</td>
</tr>
<tr  rel="2022-11-15">
<th><span >Tue, Nov 15</span></th>
<td>6:38:12 am</td>
<td><span  title="Sunrise time 2022-11-15 in Portland, Multnomah County, Oregon, USA">7:08:37 am</span></td>
<td><span title="Sunset time 2022-11-15 in Portland, Multnomah County, Oregon, USA" >4:42:03 pm</span></td>
<td>5:12:28 pm</td>
<td>09:33:26</td>
<td>11:55:20 am</td>
<td>6:02 am</td>
<td>5:48 pm</td>
<td>5:27 am</td>
<td>6:23 pm</td>
</tr>
<tr  rel="2022-11-16">
<th><span >Wed, Nov 16</span></th>
<td>6:39:28 am</td>
<td><span  title="Sunrise time 2022-11-16 in Portland, Multnomah County, Oregon, USA">7:09:59 am</span></td>
<td><span title="Sunset time 2022-11-16 in Portland, Multnomah County, Oregon, USA" >4:41:03 pm</span></td>
<td>5:11:34 pm</td>
<td>09:31:04</td>
<td>11:55:31 am</td>
<td>6:03 am</td>
<td>5:47 pm</td>
<td>5:28 am</td>
<td>6:22 pm</td>
</tr>
<tr  rel="2022-11-17">
<th><span >Thu, Nov 17</span></th>
<td>6:40:45 am</td>
<td><span  title="Sunrise time 2022-11-17 in Portland, Multnomah County, Oregon, USA">7:11:21 am</span></td>
<td><span title="Sunset time 2022-11-17 in Portland, Multnomah County, Oregon, USA" >4:40:05 pm</span></td>
<td>5:10:41 pm</td>
<td>09:28:44</td>
<td>11:55:43 am</td>
<td>6:04 am</td>
<td>5:46 pm</td>
<td>5:29 am</td>
<td>6:21 pm</td>
</tr>
<tr  rel="2022-11-18">
<th><span >Fri, Nov 18</span></th>
<td>6:42:01 am</td>
<td><span  title="Sunrise time 2022-11-18 in Portland, Multnomah County, Oregon, USA">7:12:42 am</span></td>
<td><span title="Sunset time 2022-11-18 in Portland, Multnomah County, Oregon, USA" >4:39:09 pm</span></td>
<td>5:09:50 pm</td>
<td>09:26:27</td>
<td>11:55:55 am</td>
<td>6:05 am</td>
<td>5:45 pm</td>
<td>5:30 am</td>
<td>6:21 pm</td>
</tr>
<tr  rel="2022-11-19">
<th><span >Sat, Nov 19</span></th>
<td>6:43:16 am</td>
<td><span  title="Sunrise time 2022-11-19 in Portland, Multnomah County, Oregon, USA">7:14:02 am</span></td>
<td><span title="Sunset time 2022-11-19 in Portland, Multnomah County, Oregon, USA" >4:38:15 pm</span></td>
<td>5:09:01 pm</td>
<td>09:24:13</td>
<td>11:56:09 am</td>
<td>6:07 am</td>
<td>5:45 pm</td>
<td>5:31 am</td>
<td>6:20 pm</td>
</tr>
<tr  rel="2022-11-20">
<th><span >Sun, Nov 20</span></th>
<td>6:44:31 am</td>
<td><span  title="Sunrise time 2022-11-20 in Portland, Multnomah County, Oregon, USA">7:15:22 am</span></td>
<td><span title="Sunset time 2022-11-20 in Portland, Multnomah County, Oregon, USA" >4:37:24 pm</span></td>
<td>5:08:15 pm</td>
<td>09:22:02</td>
<td>11:56:23 am</td>
<td>6:08 am</td>
<td>5:44 pm</td>
<td>5:33 am</td>
<td>6:19 pm</td>
</tr>
<tr  rel="2022-11-21">
<th><span >Mon, Nov 21</span></th>
<td>6:45:45 am</td>
<td><span  title="Sunrise time 2022-11-21 in Portland, Multnomah County, Oregon, USA">7:16:41 am</span></td>
<td><span title="Sunset time 2022-11-21 in Portland, Multnomah County, Oregon, USA" >4:36:35 pm</span></td>
<td>5:07:31 pm</td>
<td>09:19:54</td>
<td>11:56:38 am</td>
<td>6:09 am</td>
<td>5:43 pm</td>
<td>5:34 am</td>
<td>6:19 pm</td>
</tr>
<tr  rel="2022-11-22">
<th><span >Tue, Nov 22</span></th>
<td>6:46:58 am</td>
<td><span  title="Sunrise time 2022-11-22 in Portland, Multnomah County, Oregon, USA">7:17:59 am</span></td>
<td><span title="Sunset time 2022-11-22 in Portland, Multnomah County, Oregon, USA" >4:35:48 pm</span></td>
<td>5:06:49 pm</td>
<td>09:17:49</td>
<td>11:56:54 am</td>
<td>6:10 am</td>
<td>5:43 pm</td>
<td>5:35 am</td>
<td>6:18 pm</td>
</tr>
<tr  rel="2022-11-23">
<th><span >Wed, Nov 23</span></th>
<td>6:48:11 am</td>
<td><span  title="Sunrise time 2022-11-23 in Portland, Multnomah County, Oregon, USA">7:19:17 am</span></td>
<td><span title="Sunset time 2022-11-23 in Portland, Multnomah County, Oregon, USA" >4:35:04 pm</span></td>
<td>5:06:10 pm</td>
<td>09:15:47</td>
<td>11:57:10 am</td>
<td>6:11 am</td>
<td>5:42 pm</td>
<td>5:36 am</td>
<td>6:18 pm</td>
</tr>
<tr  rel="2022-11-24">
<th><span >Thu, Nov 24</span></th>
<td>6:49:23 am</td>
<td><span  title="Sunrise time 2022-11-24 in Portland, Multnomah County, Oregon, USA">7:20:34 am</span></td>
<td><span title="Sunset time 2022-11-24 in Portland, Multnomah County, Oregon, USA" >4:34:22 pm</span></td>
<td>5:05:32 pm</td>
<td>09:13:48</td>
<td>11:57:28 am</td>
<td>6:12 am</td>
<td>5:42 pm</td>
<td>5:37 am</td>
<td>6:17 pm</td>
</tr>
<tr  rel="2022-11-25">
<th><span >Fri, Nov 25</span></th>
<td>6:50:34 am</td>
<td><span  title="Sunrise time 2022-11-25 in Portland, Multnomah County, Oregon, USA">7:21:50 am</span></td>
<td><span title="Sunset time 2022-11-25 in Portland, Multnomah County, Oregon, USA" >4:33:42 pm</span></td>
<td>5:04:57 pm</td>
<td>09:11:52</td>
<td>11:57:46 am</td>
<td>6:13 am</td>
<td>5:41 pm</td>
<td>5:38 am</td>
<td>6:17 pm</td>
</tr>
<tr  rel="2022-11-26">
<th><span >Sat, Nov 26</span></th>
<td>6:51:45 am</td>
<td><span  title="Sunrise time 2022-11-26 in Portland, Multnomah County, Oregon, USA">7:23:05 am</span></td>
<td><span title="Sunset time 2022-11-26 in Portland, Multnomah County, Oregon, USA" >4:33:05 pm</span></td>
<td>5:04:25 pm</td>
<td>09:10:00</td>
<td>11:58:05 am</td>
<td>6:15 am</td>
<td>5:41 pm</td>
<td>5:39 am</td>
<td>6:16 pm</td>
</tr>
<tr  rel="2022-11-27">
<th><span >Sun, Nov 27</span></th>
<td>6:52:54 am</td>
<td><span  title="Sunrise time 2022-11-27 in Portland, Multnomah County, Oregon, USA">7:24:19 am</span></td>
<td><span title="Sunset time 2022-11-27 in Portland, Multnomah County, Oregon, USA" >4:32:30 pm</span></td>
<td>5:03:55 pm</td>
<td>09:08:11</td>
<td>11:58:24 am</td>
<td>6:16 am</td>
<td>5:40 pm</td>
<td>5:40 am</td>
<td>6:16 pm</td>
</tr>
<tr  rel="2022-11-28">
<th><span >Mon, Nov 28</span></th>
<td>6:54:03 am</td>
<td><span  title="Sunrise time 2022-11-28 in Portland, Multnomah County, Oregon, USA">7:25:31 am</span></td>
<td><span title="Sunset time 2022-11-28 in Portland, Multnomah County, Oregon, USA" >4:31:58 pm</span></td>
<td>5:03:27 pm</td>
<td>09:06:27</td>
<td>11:58:45 am</td>
<td>6:17 am</td>
<td>5:40 pm</td>
<td>5:41 am</td>
<td>6:15 pm</td>
</tr>
<tr  rel="2022-11-29">
<th><span >Tue, Nov 29</span></th>
<td>6:55:10 am</td>
<td><span  title="Sunrise time 2022-11-29 in Portland, Multnomah County, Oregon, USA">7:26:43 am</span></td>
<td><span title="Sunset time 2022-11-29 in Portland, Multnomah County, Oregon, USA" >4:31:28 pm</span></td>
<td>5:03:02 pm</td>
<td>09:04:45</td>
<td>11:59:06 am</td>
<td>6:18 am</td>
<td>5:39 pm</td>
<td>5:42 am</td>
<td>6:15 pm</td>
</tr>
<tr  rel="2022-11-30">
<th><span >Wed, Nov 30</span></th>
<td>6:56:16 am</td>
<td><span  title="Sunrise time 2022-11-30 in Portland, Multnomah County, Oregon, USA">7:27:54 am</span></td>
<td><span title="Sunset time 2022-11-30 in Portland, Multnomah County, Oregon, USA" >4:31:01 pm</span></td>
<td>5:02:39 pm</td>
<td>09:03:07</td>
<td>11:59:27 am</td>
<td>6:19 am</td>
<td>5:39 pm</td>
<td>5:43 am</td>
<td>6:15 pm</td>
</tr>
</tbody></table>

Answer:

The pattern $("some id").each doesn't make much sense. Identifiers are supposed to be unique in a valid HTML document, so this says "select something guaranteed to be at most a single element, then iterate that single element". Sure, some pages abuse identifiers as classes so this is conceivably useful in rare invalid HTML cases, but that doesn't apply here.

Instead, try selecting <tr> rows as described in "Scraping all rows from table using cheerio" and "Parse table from HTML with cheerio in node", then loop (or map) over those:

const axios = require("axios");
const cheerio = require("cheerio"); // 1.0.0-rc.12

const url = "https://sunrise-sunset.org/search?location=portland+oregon&year=2022&month=11#calendar";

axios.get(url).then(({data: html}) => {
  const $ = cheerio.load(html);
  const rows = [...$("#month tr.day")].map(e => ({
    day: $(e).find(".number").text().trim(),
    time: $(e).find(".sunset").text().trim(),
  }));
  console.log(rows);
});