I have a gz file which I downloaded from here, using HTTP. Now I want to read the xlsx file contained in the gz file and convert it to a DataFrame. I tried this:
julia> using HTTP, XLSX, DataFrames, GZip
julia> file = HTTP.get("http://www.tsetmc.com/tsev2/excel/IntraDayPrice.aspx?i=35425587644337450&m=30")
julia> write("c:/users/shayan/desktop/file.xlsx.gz", file.body);
julia> df = GZip.open("c:/users/shayan/desktop/file.xlsx.gz", "r") do io
           XLSX.readxlsx(io)
       end
But this throws a MethodError:
ERROR: MethodError: no method matching readxlsx(::GZipStream)
Closest candidates are:
readxlsx(::AbstractString) at C:\Users\Shayan\.julia\packages\XLSX\FFzH0\src\read.jl:37
Stacktrace:
[1] (::var"#23#24")(io::GZipStream)
@ Main c:\Users\Shayan\Documents\Python Scripts\test.jl:15
[2] gzopen(::var"#23#24", ::String, ::String)
@ GZip C:\Users\Shayan\.julia\packages\GZip\JNmGn\src\GZip.jl:269
[3] open(::Function, ::Vararg{Any})
@ GZip C:\Users\Shayan\.julia\packages\GZip\JNmGn\src\GZip.jl:265
[4] top-level scope
@ c:\Users\Shayan\Documents\Python Scripts\test.jl:14
Answer:
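As the error shows, the only readxlsx method available in your XLSX.jl version takes an AbstractString (a file path), so it cannot read from a GZipStream. You need to decompress the file to a temporary location first and then read it from there.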
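using GZip, XLSX

# Decompress the archive into a temporary .xlsx file, since readxlsx expects a path.
tname = tempname() * ".xlsx"
GZip.open("c:/users/shayan/desktop/file.xlsx.gz", "r") do io
    open(tname, "w") do out
        write(out, read(io))
    end
end
# Note: readxlsx returns an XLSXFile (the whole workbook), not a DataFrame.
xf = XLSX.readxlsx(tname)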
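Since the question asks for a DataFrame and readxlsx only returns the workbook, one more step is needed. Below is a minimal sketch of the whole round trip, under a few assumptions that are not from the original post: the data sits on the first sheet, the first row of that sheet holds the column names, and the installed XLSX.jl is recent enough (v0.8 or later, as far as I know) that the result of XLSX.readtable can be passed straight to DataFrame. The variable names gzpath, xlsxpath and sheet are only illustrative.

```julia
using GZip, XLSX, DataFrames

# Path taken from the question; adjust to wherever the .gz file was saved.
gzpath = "c:/users/shayan/desktop/file.xlsx.gz"

# Decompress to a temporary .xlsx file, because readxlsx/readtable take a path here.
xlsxpath = tempname() * ".xlsx"
GZip.open(gzpath, "r") do io
    write(xlsxpath, read(io))
end

# Assumption: the table of interest is on the first sheet of the workbook.
xf = XLSX.readxlsx(xlsxpath)
sheet = first(XLSX.sheetnames(xf))

# Assumption: XLSX.jl v0.8+, where readtable returns a Tables.jl-compatible object.
# On older versions, DataFrame(XLSX.readtable(xlsxpath, sheet)...) may be needed instead.
df = DataFrame(XLSX.readtable(xlsxpath, sheet))
```

If the workbook has several sheets, XLSX.sheetnames(xf) lists them so you can pick the right one by name instead of taking the first.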