Cannot unzip nor get blob from HTTP Response

Time:03-16

I am trying to unzip a file that is returned in the response to an HTTP request. After receiving the response, I can neither unzip it nor turn it into a blob to parse afterward. The zip always contains an XML file, and the idea is to convert that XML to JSON once it is unzipped.

Here is the code I tried:

import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

val client = HttpClient.newBuilder().build()
val request = HttpRequest.newBuilder()
    .uri(URI.create("https://donnees.roulez-eco.fr/opendata/instantane"))
    .build()

val response = client.send(request, HttpResponse.BodyHandlers.ofString())

With this, response.body() is unreadable, and I did not find a proper way to turn it into a blob.

The other code I tried, to unzip the response directly, is this:

val url = URL("https://donnees.roulez-eco.fr/opendata/instantane")
val con = url.openConnection() as HttpURLConnection
con.setRequestProperty("Accept-Encoding", "gzip")
println("Length : " + con.contentLength)

val reader: Reader = InputStreamReader(GZIPInputStream(con.inputStream))

while (true) {
    val ch: Int = reader.read()
    if (ch == -1) {
        break
    }
    print(ch.toChar())
}

But in this case, the stream is not accepted as gzip.

Any idea?

Answer:

It looks like you're confusing zip (an archive format that supports compression) with gzip (a simple compressed format).

Downloading https://donnees.roulez-eco.fr/opendata/instantane (e.g. with curl) and checking the result shows that it's a zip archive (containing a single file, PrixCarburants_instantane.xml).

But you're trying to decode it as a gzip stream (with GZIPInputStream), which it's not — hence your issue.
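If curl isn't handy, the same check can be done in code by looking at the first few bytes of the body: zip archives normally start with the signature PK 0x03 0x04, while gzip streams start with 0x1F 0x8B. A minimal sketch, where sniffFormat is a hypothetical helper name rather than anything from a library:

```kotlin
// Hypothetical helper: guesses the container format from the leading "magic" bytes.
fun sniffFormat(head: ByteArray): String = when {
    // Zip local file header signature: 'P' 'K' 0x03 0x04
    head.size >= 4 &&
        head[0] == 0x50.toByte() && head[1] == 0x4B.toByte() &&
        head[2] == 0x03.toByte() && head[3] == 0x04.toByte() -> "zip"
    // Gzip member header: 0x1F 0x8B
    head.size >= 2 &&
        head[0] == 0x1F.toByte() && head[1] == 0x8B.toByte() -> "gzip"
    else -> "unknown"
}
```

You could call it as sniffFormat(con.inputStream.readNBytes(4)) on Java 11 or later. Reading those bytes consumes them from the stream, so this is meant as a one-off diagnostic rather than something to put in front of the real decoding.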

Reading a zip file is slightly more involved than reading a gzip file, because it can hold multiple compressed files. But ZipInputStream makes it fairly easy: you can read the first zip entry (which has metadata including its uncompressed size), and then go on to read the actual data in that entry.

A further complication is that this particular compressed file seems to use ISO 8859-1 encoding, not the usual UTF-8. So you need to take that into account when converting the byte stream into text.
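To see why the charset matters, here is a small sketch that decodes the same bytes both ways. The sample word is my own, not taken from the actual file:

```kotlin
// "café" as ISO 8859-1 bytes: the "é" is the single byte 0xE9.
val latin1Bytes = byteArrayOf(0x63, 0x61, 0x66, 0xE9.toByte())

fun main() {
    // Decoded with the right charset, the accented letter survives.
    println(latin1Bytes.toString(Charsets.ISO_8859_1))
    // Decoded as UTF-8, the lone 0xE9 is not a valid sequence, so the accent is lost.
    println(latin1Bytes.toString(Charsets.UTF_8))
}
```

The same thing would happen to every accented character in the XML if the stream were read with the default UTF-8 decoder.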

Here's some example code:

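import java.io.InputStreamReader
import java.util.zip.ZipInputStream

// con is the HttpURLConnection opened in the question
val zipStream = ZipInputStream(con.inputStream)
val entry = zipStream.nextEntry ?: error("The zip archive has no entries")

// Decode as ISO 8859-1, where each byte is one character, so entry.size
// (a byte count, or -1 if the archive does not record it up front) counts characters too
val reader = InputStreamReader(zipStream, Charsets.ISO_8859_1)
for (i in 1..entry.size)
    print(reader.read().toChar())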

Obviously, reading and printing the entire 11MB file one character at a time is not very efficient! And if there's any possibility that the zip archive could have multiple entries, you'd have to read through them all, stopping when you get to the one with the right name. But I hope this is a good illustration.
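For something closer to real use, here is a rough sketch that walks the entries until it finds one by name and reads it in one go. readZipEntryText is a hypothetical helper of my own, and passing the charset as a parameter is just one possible design:

```kotlin
import java.io.InputStream
import java.nio.charset.Charset
import java.util.zip.ZipInputStream

// Hypothetical helper: returns the text of the zip entry called `name`, or null if absent.
fun readZipEntryText(input: InputStream, name: String, charset: Charset): String? {
    ZipInputStream(input).use { zip ->
        var entry = zip.nextEntry
        while (entry != null) {
            if (entry.name == name) {
                // ZipInputStream reports end-of-stream at the end of the current entry,
                // so readBytes() returns exactly this entry's decompressed data.
                return zip.readBytes().toString(charset)
            }
            entry = zip.nextEntry
        }
    }
    return null
}

// Usage with the HttpClient from the question (a network call, so left commented out):
// val response = client.send(request, HttpResponse.BodyHandlers.ofInputStream())
// val xml = readZipEntryText(response.body(), "PrixCarburants_instantane.xml", Charsets.ISO_8859_1)
```

This also covers the first attempt in the question: BodyHandlers.ofString() decodes the binary zip data as text, which is why the body looked unreadable. Asking for the body as an InputStream (or a byte array) keeps the bytes intact. If the XML goes straight into a parser, it may be simpler to hand the parser the byte stream and let it pick up the encoding from the XML declaration.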
