I am trying to 1) retrieve hundreds of separate JSON files from https://bioguide.congress.gov/, a website with data on U.S. legislators, 2) process them, and 3) combine them into one large JSON file that contains all the individual records.
Each legislator has a separate URL that serves their data as JSON. Some of the files I am working with can be found at these URLs:
https://bioguide.congress.gov/search/bio/F000061.json
https://bioguide.congress.gov/search/bio/F000062.json
https://bioguide.congress.gov/search/bio/F000063.json
https://bioguide.congress.gov/search/bio/F000064.json
https://bioguide.congress.gov/search/bio/F000091.json
https://bioguide.congress.gov/search/bio/F000092.json
My approach is to create a for loop to loop over the different ids and combine all the records in an array of objects. Unfortunately, I am stuck trying to access the data.
So far, I have tried the following methods but I am getting a CORS error.
Using fetch:
const url = "https://bioguide.congress.gov/search/bio/F000061.json";
fetch(url)
  .then((res) => res.text())
  .then((text) => {
    console.log(text);
  })
  .catch((err) => console.log(err));
Using the no-cors mode in fetch and getting an empty response:
const url = "https://bioguide.congress.gov/search/bio/F000061.json";
const data = await fetch(url, { mode: "no-cors" });
Using d3:
const url = "https://bioguide.congress.gov/search/bio/F000061.json";
// d3.json returns a promise in d3 v5 and later
const data = await d3.json(url);
With all of them I am getting the same CORS-related error: blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
I would appreciate any suggestions and advice to work around this issue. Thanks.
CodePudding user response:
Well, you're getting a CORS (Cross-Origin Resource Sharing) error because the website you're sending an AJAX request to (bioguide.congress.gov) has not explicitly enabled CORS, which means the browser will not let client-side code read responses from that website, for security reasons.
If you want to get data from that site, you must send the request from the server side (for example with PHP, Node, or Python).
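As a rough illustration of the server-side route, here is a minimal sketch in Node. It assumes Node 18 or later, where fetch is available globally. The names fetchBio and fetchImpl are made up for this example; fetchImpl only exists so the function can be tried with a stand-in instead of the real network.

```javascript
// Sketch only: fetch a single record from a Node script, where the
// browser's CORS policy does not apply.
// Assumption: Node 18+ (global fetch). `fetchBio` and `fetchImpl`
// are hypothetical names, not part of any library.
async function fetchBio(id, fetchImpl = fetch) {
  const url = `https://bioguide.congress.gov/search/bio/${id}.json`;
  const response = await fetchImpl(url);
  // Fail loudly on HTTP errors instead of trying to parse an error page
  if (!response.ok) {
    throw new Error(`Request for ${id} failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling fetchBio('F000061') from a Node script should then resolve to the parsed record, provided the site accepts the request.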
CodePudding user response:
Following on from what @code says in their answer, here's a contrived (but tested) NodeJS example that requests each record in the range F000060 to F000069 from the server at one-second intervals, and compiles the results into one JSON file.
import express from 'express';
import fetch from 'node-fetch';
import { writeFile } from 'fs/promises';
const app = express();
const port = process.env.PORT || 4000;
let dataset;
let dataLoadComplete;
app.listen(port, () => {
  console.log(`Server running on port ${port}`);
});
function getData() {
  return new Promise((res, rej) => {
    // Initialise the data array
    let arr = [];
    dataLoadComplete = false;
    // Initialise the page number
    async function loop(page = 0) {
      try {
        // Use the current page number in the url
        const uri = `https://bioguide.congress.gov/search/bio/F00006${page}.json`;
        // Get the data, parse it, and add it to the
        // array we set up to capture all of the data
        const response = await fetch(uri);
        const data = await response.json();
        arr = [...arr, data];
        console.log(`Loading page: ${page}`);
        // Call the function again with the next page number
        // if we've not reached the end of the range (0-9),
        // or return the finalised data in the promise response
        if (page < 9) {
          setTimeout(loop, 1000, page + 1);
        } else {
          console.log('API calls complete');
          res(arr);
        }
      } catch (err) {
        rej(err);
      }
    }
    loop();
  });
}
// Call the looping function and, once complete,
// write the JSON to a file
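async function main() {
  const completed = await getData();
  dataset = completed;
  dataLoadComplete = true;
  // Wait for the write to finish so that any error is caught below
  await writeFile('data.json', JSON.stringify(dataset, null, 2), 'utf8');
}
main().catch((err) => console.error(err));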
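If the ids you need are not one contiguous run of ten, the same idea can be written as a plain async loop over a list of ids. This is only a sketch under a few assumptions: Node 18+ for the global fetch, and ids shaped like the ones in the question (one letter followed by six digits). buildIds, sleep, collectRecords, delayMs and fetchImpl are names invented for this example.

```javascript
// Hypothetical helper: build ids such as F000061 from a letter and a
// numeric range (inclusive). Assumes the letter + six digits format
// seen in the URLs from the question.
function buildIds(letter, start, end) {
  const ids = [];
  for (let n = start; n <= end; n++) {
    ids.push(`${letter}${String(n).padStart(6, '0')}`);
  }
  return ids;
}

// Resolve after the given number of milliseconds
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Fetch each id in turn, pausing between requests, and collect the
// parsed records into one array. Ids that return an HTTP error are
// skipped with a warning rather than aborting the whole run.
async function collectRecords(ids, { delayMs = 1000, fetchImpl = fetch } = {}) {
  const records = [];
  for (const id of ids) {
    const url = `https://bioguide.congress.gov/search/bio/${id}.json`;
    const response = await fetchImpl(url);
    if (response.ok) {
      records.push(await response.json());
    } else {
      console.warn(`Skipping ${id}: HTTP ${response.status}`);
    }
    await sleep(delayMs);
  }
  return records;
}
```

Something like collectRecords(buildIds('F', 61, 64)) would then return the combined array, which can be passed to JSON.stringify and writeFile as in main() above. Skipping failed ids is a design choice for gaps in the numbering; rethrow instead if a missing record should stop the run.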