How can I initialize a DataFrame column with missing values and later fill some of its elements with Float64 values?
julia> df = DataFrame(:a => rand(4), :b => rand(4))
4×2 DataFrame
 Row │ a         b
     │ Float64   Float64
─────┼────────────────────
   1 │ 0.840074  0.673613
   2 │ 0.98867   0.33807
   3 │ 0.433315  0.150228
   4 │ 0.495254  0.833268
julia> insertcols!(df, :c => missing)
4×3 DataFrame
 Row │ a         b         c
     │ Float64   Float64   Missing
─────┼─────────────────────────────
   1 │ 0.840074  0.673613  missing
   2 │ 0.98867   0.33807   missing
   3 │ 0.433315  0.150228  missing
   4 │ 0.495254  0.833268  missing
julia> for row in eachrow(df)
           if rand() > 0.5 # based on processing of the row
               row[:c] = 1.0
           end
       end
ERROR: MethodError: convert(::Type{Union{}}, ::Float64) is ambiguous.
CodePudding user response:
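The column created by insertcols!(df, :c => missing) has element type Missing, so it cannot store a Float64. Create the column with element type Union{Float64,Missing} instead: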
df.c = Vector{Union{Float64,Missing}}(missing, size(df, 1))
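As a minimal sketch of how the loop from the question behaves once the column has this element type (the fixed values in `a` and the `row.a > 0.5` condition are stand-ins chosen here for illustration, not part of the original question):

```julia
using DataFrames

# Fixed values instead of rand(4) so the result is reproducible (assumption)
df = DataFrame(:a => [0.1, 0.7, 0.4, 0.9], :b => [1.0, 2.0, 3.0, 4.0])

# Column that can hold either a Float64 or missing
df.c = Vector{Union{Float64,Missing}}(missing, nrow(df))

for row in eachrow(df)
    if row.a > 0.5   # stand-in for "based on processing of the row"
        row[:c] = 1.0
    end
end
```

Rows that do not satisfy the condition keep `missing` in column `c`, and the assignment no longer throws a MethodError.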
CodePudding user response:
This is the way I normally do it:
julia> using DataFrames
julia> df = DataFrame(:a => rand(4), :b => rand(4))
4×2 DataFrame
 Row │ a         b
     │ Float64   Float64
─────┼────────────────────
   1 │ 0.388546  0.522189
   2 │ 0.232263  0.102722
   3 │ 0.519866  0.578753
   4 │ 0.493797  0.146636
julia> df.c = missings(Float64, nrow(df))
4-element Vector{Union{Missing, Float64}}:
 missing
 missing
 missing
 missing
julia> df
4×3 DataFrame
 Row │ a         b         c
     │ Float64   Float64   Float64?
─────┼──────────────────────────────
   1 │ 0.388546  0.522189  missing
   2 │ 0.232263  0.102722  missing
   3 │ 0.519866  0.578753  missing
   4 │ 0.493797  0.146636  missing
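Once the column has type Float64?, individual cells can be assigned directly. A short sketch, assuming made-up values for `a` and for the cells being filled (the names `filled` and `total` are only illustrative):

```julia
using DataFrames

df = DataFrame(:a => [0.1, 0.7, 0.4, 0.9])
df.c = missings(Float64, nrow(df))   # Vector{Union{Missing, Float64}}

# Fill selected cells by row index and column name
df[2, :c] = 1.5
df[4, :c] = 2.5

filled = count(!ismissing, df.c)     # how many entries have been filled
total = sum(skipmissing(df.c))       # sum over the non-missing entries
```

Rows 1 and 3 stay missing, and skipmissing lets reductions such as sum ignore them.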
See also https://bkamins.github.io/julialang/2021/09/03/missing.html for more examples of working with missing values.