How do I read a gzipped XLSX file in Julia?

Time:09-01

I have a .gz file that I downloaded over HTTP (the URL is in the code below). Now I want to read the XLSX file it contains and convert it to a DataFrame. I tried this:

julia> using HTTP, XLSX, DataFrames, GZip

julia> file = HTTP.get("http://www.tsetmc.com/tsev2/excel/IntraDayPrice.aspx?i=35425587644337450&m=30")

julia> write("c:/users/shayan/desktop/file.xlsx.gz", file.body);

julia> df = GZip.open("c:/users/shayan/desktop/file.xlsx.gz", "r") do io
           XLSX.readxlsx(io)
       end

But this throws a MethodError:

ERROR: MethodError: no method matching readxlsx(::GZipStream)
Closest candidates are:
  readxlsx(::AbstractString) at C:\Users\Shayan\.julia\packages\XLSX\FFzH0\src\read.jl:37
Stacktrace:
 [1] (::var"#23#24")(io::GZipStream)
   @ Main c:\Users\Shayan\Documents\Python Scripts\test.jl:15
 [2] gzopen(::var"#23#24", ::String, ::String)
   @ GZip C:\Users\Shayan\.julia\packages\GZip\JNmGn\src\GZip.jl:269
 [3] open(::Function, ::Vararg{Any})
   @ GZip C:\Users\Shayan\.julia\packages\GZip\JNmGn\src\GZip.jl:265
 [4] top-level scope
   @ c:\Users\Shayan\Documents\Python Scripts\test.jl:14

Answer:

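As the error shows, the only available method is readxlsx(::AbstractString), so this version of XLSX.jl reads from a file path and not from a stream. You need to decompress the file to a temporary location first and then read it from there.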

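using GZip, XLSX

# Decompress to a temporary file, because readxlsx expects a file path.
tname = tempname() * ".xlsx"
GZip.open("c://temp//journals.xlsx.gz", "r") do io
    open(tname, "w") do out
        write(out, read(io))
    end
end

# readxlsx returns an XLSXFile (the workbook), not a DataFrame.
xf = XLSX.readxlsx(tname)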
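
The question asks for a DataFrame, which takes one more step after decompression. The following is a minimal sketch, not a definitive implementation. gunzip_to_temp is a hypothetical helper name, and the usage comment assumes the data is on the first sheet of the workbook. It also assumes a recent XLSX.jl in which readtable returns a Tables.jl-compatible object; older releases return a tuple and need DataFrame(XLSX.readtable(tname, sheet)...) instead.

```julia
using GZip

# Hypothetical helper: decompress a gzip file into a temporary file and
# return the temporary file's path. The ".xlsx" suffix mirrors the answer above.
function gunzip_to_temp(gzpath::AbstractString; suffix::AbstractString = ".xlsx")
    tname = tempname() * suffix
    GZip.open(gzpath, "r") do io
        # Read the whole decompressed stream and write it out as raw bytes.
        write(tname, read(io))
    end
    return tname
end

# Usage (not run here; assumes XLSX and DataFrames are installed and that
# the table is on the first sheet):
#   using XLSX, DataFrames
#   tname = gunzip_to_temp("c:/users/shayan/desktop/file.xlsx.gz")
#   sheet = first(XLSX.sheetnames(XLSX.readxlsx(tname)))
#   df = DataFrame(XLSX.readtable(tname, sheet))
```

If the workbook has several sheets, XLSX.sheetnames lists them so you can pick the right one by name.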