Scraping website, can't nest data with similar name

Time:12-20

I'm looping through some data, which I'm scraping from some websites. Currently I'm scraping the head.

This is an example of the data structure

const head = {
    rel_data: [
      {
        rel: "rel",
        items: [
          {
            type: "type",
            sizes: "sizes",
            href: "href"
          }
        ]
      }
    ]
};

Whenever the rel matches, I want to insert the data into items

$('head link').each(function(index) {
  if (head?.rel_data[index]?.rel == rel) {
    head?.rel_data[index]?.items.push({
      type: (type !== undefined) ? type : null,
      sizes: (sizes !== undefined) ? sizes : null,
      href: (href !== undefined) ? href : null
    });
  } else {
    head.rel_data.push({
      rel: (rel !== undefined) ? rel : null,
      items: [
        {
          type: (type !== undefined) ? type : null,
          sizes: (sizes !== undefined) ? sizes : null,
          href: (href !== undefined) ? href : null
        }
      ]
    });
  }
})

Like this

rel_data: [
  {
    rel: "icon",
    items: [
      {
        type: "type",
        sizes: "sizes",
        href: "href"
      },
      {
        type: "type",
        sizes: "sizes",
        href: "href"
      }
    ]
  },
  {
    rel: "other-rel-type",
    items: [...]
  }
]

But what I get is this.

rel_data: [
  {
    rel: "icon",
    items: [
      {
        type: "type",
        sizes: "sizes",
        href: "href"
      }
    ]
  },
  {
    rel: "icon",
    items: [
      {
        type: "type",
        sizes: "sizes",
        href: "href"
      }
    ]
  }
]

If I hard-code 0 instead of index, it works for the first rel type (icon, for example) but not for the rest. Why?

CodePudding user response:

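The index passed to .each() is the position of the link element in the document, not the position of its rel group in rel_data, so head.rel_data[index] generally does not point at the group with the matching rel. A simple solution is to store the data in a temporary object rather than an array and use the rel values as keys.

Then, when you are done, use Object.values(tempObject) to get the final array.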

This object would look something like:

const obj = {
  "icon": {
    rel: "icon",
    items: [
      {
        type: "type",
        sizes: "sizes",
        href: "href"
      }
    ]
  },
  "other-rel-type": {
    rel: "other-rel-type",
    items: []
  }
}

Then a simplified version of your loop would be something like:

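$('head link').each(function() {
    // .attr() returns undefined when the attribute is missing, so fall back to null
    const rel = $(this).attr('rel') ?? null
    // Create the group the first time this rel is seen
    obj[rel] = obj[rel] || { rel, items: [] }

    obj[rel].items.push({
        type: $(this).attr('type') ?? null,
        sizes: $(this).attr('sizes') ?? null,
        href: $(this).attr('href') ?? null
    })
});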

Then finally:

head.rel_data = Object.values(obj)
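
For reference, here is a minimal, self-contained sketch of the same grouping idea that runs without jQuery or cheerio. The links array and the groupByRel helper name are assumptions made up for illustration; in the real scraper the values would come from the link elements' attributes.

```javascript
// Hypothetical sample input: one plain object per <link> element.
// In the scraper these values would be read from each element's attributes.
const links = [
  { rel: "icon", type: "image/png", sizes: "32x32", href: "/favicon-32.png" },
  { rel: "stylesheet", href: "/main.css" },
  { rel: "icon", type: "image/png", sizes: "16x16", href: "/favicon-16.png" }
];

// groupByRel is an illustrative helper name, not part of any library.
function groupByRel(links) {
  // A prototype-less object avoids clashes with inherited keys such as "constructor".
  const byRel = Object.create(null);
  for (const { rel, type, sizes, href } of links) {
    const key = rel ?? null;
    // Create the group the first time this rel is seen, then append to it.
    byRel[key] = byRel[key] || { rel: key, items: [] };
    byRel[key].items.push({
      type: type ?? null,
      sizes: sizes ?? null,
      href: href ?? null
    });
  }
  // Convert the keyed object back into the array shape used by head.rel_data.
  return Object.values(byRel);
}

const head = { rel_data: groupByRel(links) };
console.log(JSON.stringify(head, null, 2));
```

Because the lookup is by rel value rather than by loop index, both icon links end up in the same items array no matter where they appear in the document.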