I am trying to export a dataframe to xlsx
df |> XLSX.writexlsx("df.xlsx")
and am getting this error:
ERROR: Unsupported datatype String31 for writing data to Excel file. Supported data types are Union{Missing, Bool, Float64, Int64, Dates.Date, Dates.DateTime, Dates.Time, String} or XLSX.CellValue.
I cannot find anyting about String31
utilizing Google. Is there a "transform" that I need to perform on the df beforehand?
CodePudding user response:
It seems that currently this is a limitation of XSLX.jl. I have opened an issue proposing to fix it.
I assume your df
is not huge, in which case the following solution should work for you:
to_string(x::AbstractString) = String(x)
to_string(x::Any) = x
XLSX.writetable("df.xlsx", to_string.(df))
CodePudding user response:
I guess you're reading the data in from a CSV file? CSV.jl uses fixed-length strings from the InlineStrings.jl package when it reads from a file, which can be much better for managing memory allocation, and hence for performance. For eg.:
julia> df = CSV.read(IOBuffer("S, I, F
hello Julian world, 2, 3.14"), DataFrame)
1×3 DataFrame
Row │ S I F
│ String31 Int64 Float64
─────┼────────────────────────────────────
1 │ hello Julian world 2 3.14
Here, the S
column was assigned a String31
type since that's the smallest type from InlineStrings
that can hold our string.
However, it looks like XLSX.jl doesn't know how to handle InlineString
types, unfortunately.
You can specify when reading the file that you want all strings to be read as just String
type and not the fixed-length InlineString
variety.
julia> df = CSV.read(IOBuffer("S, I, F
hello Julian world, 2, 3.14"), DataFrame, stringtype = String)
1×3 DataFrame
Row │ S I F
│ String Int64 Float64
─────┼────────────────────────────────────
1 │ hello Julian world 2 3.14
Or, you can transform the column just before writing to the file.
julia> transform!(df, :S => ByRow(String) => :S)
1×3 DataFrame
Row │ S I F
│ String Int64 Float64
─────┼────────────────────────────────────
1 │ hello Julian world 2 3.14
If you have multiple InlineString
type columns and don't want to specify them individually, you can also do:
julia> to_string(c::AbstractVector) = c;
julia> to_string(c::AbstractVector{T}) where T <: AbstractString = String.(c);
julia> mapcols!(to_string, df)
1×3 DataFrame
Row │ S I F
│ String Int64 Float64
─────┼────────────────────────────────────
1 │ hello Julian world 2 3.14
(Edited to use multiple dispatch here (based on Bogumil Kaminsky's answer) rather than typecheck in the function.)