Unable to read accents of "latin1" encoded file with d3.dsv

Time:03-29

I have a CSV file (separated by semicolons, so I use d3.dsv to read it):

d3.dsv(";", "Project.csv", function(d) {
    console.log(d)
});

But I get "�" where the accented characters should be.

I know the file is Latin-1 encoded, so I tried this:

dsvReader = d3.dsv(";", "iso-8859-1");
dsvReader("Project.csv", function(d) {
    console.log(d)
});

But I get the following errors:

> net::ERR_FILE_NOT_FOUND
> Uncaught (in promise) TypeError: Failed to fetch

CodePudding user response:

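Your first snippet: the d3-fetch function d3.dsv calls response.text() on the result of fetch, and response.text() always decodes the body as UTF-8. The Latin-1 bytes for accented characters are not valid UTF-8, so they are replaced with "�".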

Your second snippet: the second argument of d3.dsv is the URL, so this tries to fetch a file named iso-8859-1, which presumably does not exist. d3.dsv has no argument that accepts an encoding.

One solution is to make sure the server has a UTF-8 file, and not a Latin-1 file. This is the simplest solution: in 2022, there are very few legitimate reasons to not use Unicode.

Another is to use fetch yourself, or use d3.buffer or d3.blob, then decode the Latin-1 response into a string yourself, as described in Fetching non-utf8 data with XMLHttpRequest, using TextDecoder (on the buffer) or FileReader's readAsText (on the blob), and finally use d3.dsvFormat(";").parse on the correctly decoded text. Note the ";" delimiter, since your file is semicolon-separated.
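
As a rough illustration of the d3.buffer plus TextDecoder route, here is a minimal sketch. The helper names decodeLatin1 and loadLatin1Dsv are made up for this example, and it assumes d3 (v5 or later, with d3-fetch and d3-dsv) is already loaded as a global. Per the WHATWG Encoding Standard, the "iso-8859-1" label is treated as windows-1252, which agrees with Latin-1 for accented letters.

```javascript
// Hypothetical helper: decode raw bytes as Latin-1 instead of UTF-8.
// TextDecoder maps the "iso-8859-1" label to windows-1252.
function decodeLatin1(buffer) {
  return new TextDecoder("iso-8859-1").decode(buffer);
}

// Hypothetical helper: fetch the file as an ArrayBuffer, decode it,
// then parse it with the given delimiter.
// Assumes a global `d3` providing d3.buffer and d3.dsvFormat.
async function loadLatin1Dsv(url, delimiter = ";") {
  const buffer = await d3.buffer(url);
  const text = decodeLatin1(buffer);
  return d3.dsvFormat(delimiter).parse(text);
}

// Usage in the browser (file name taken from the question):
// loadLatin1Dsv("Project.csv").then(rows => console.log(rows));
```

The decoding step is kept separate from the fetch so it can be checked on its own, and so the same function works on a buffer obtained from plain fetch(...).then(r => r.arrayBuffer()).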
