Hi everyone, I am a Julia newbie. I want to write a function that deletes any column containing only zeros.
function delEmptyCol(df)
    emptyColList = []
    for col in eachcol(df)
        if sum(col) == 0
            append!(emptyColList,col)
        end
    end
    newdf = select!(df,Not(emptyColList))
    return newdf
end
I made up a trial DataFrame df2 to test my function. It looks like the following.
 Row │ KFC    Mc     Piz
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     0      1      4
   2 │     0      2      5
   3 │     0      3      6
What I hope to get is the following.
 Row │ Mc     Piz
     │ Int64  Int64
─────┼──────────────
   1 │     1      4
   2 │     2      5
   3 │     3      6
However, when I call delEmptyCol(df2), I get an error and I have no idea what is wrong.
BoundsError: attempt to access data frame with 3 columns at index [0, 0, 0]
Stacktrace:
[1] getindex
@ C:\Users\cxh\.julia\packages\DataFrames\zqFGs\src\other\index.jl:199 [inlined]
[2] getindex
@ C:\Users\cxh\.julia\packages\DataFrames\zqFGs\src\other\index.jl:257 [inlined]
[3] getindex
@ C:\Users\cxh\.julia\packages\DataFrames\zqFGs\src\other\index.jl:224 [inlined]
[4] manipulate(df::DataFrame, c::InvertedIndex{Vector{Any}}; copycols::Bool, keeprows::Bool, renamecols::Bool)
@ DataFrames C:\Users\cxh\.julia\packages\DataFrames\zqFGs\src\abstractdataframe\selection.jl:1680
[5] #select#492
@ C:\Users\cxh\.julia\packages\DataFrames\zqFGs\src\abstractdataframe\selection.jl:1171 [inlined]
[6] #select!#487
@ C:\Users\cxh\.julia\packages\DataFrames\zqFGs\src\abstractdataframe\selection.jl:873 [inlined]
[7] select!
@ C:\Users\cxh\.julia\packages\DataFrames\zqFGs\src\abstractdataframe\selection.jl:873 [inlined]
[8] delEmptyCol(df::DataFrame)
@ Main .\In[46]:8
[9] top-level scope
@ In[51]:1
[10] eval
@ .\boot.jl:373 [inlined]
[11] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base .\loading.jl:1196
Please help me! I'd appreciate it.
CodePudding user response:
There might be an easier way than this, but you could use mapcols to get a one-row data frame of booleans, where a value is false if all elements of that column are 0. Then use this to subset the columns.
julia> df = DataFrame(KFC = [0, 0, 0], Mc = [1,2,3], Piz = [4,5,6])
3×3 DataFrame
 Row │ KFC    Mc     Piz
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     0      1      4
   2 │     0      2      5
   3 │     0      3      6
julia> df[:, Array(mapcols(col -> any(col .!= 0), df)[1, :])]
3×2 DataFrame
 Row │ Mc     Piz
     │ Int64  Int64
─────┼──────────────
   1 │     1      4
   2 │     2      5
   3 │     3      6
I used any and .!= to look for columns that have something other than 0 in them. mapcols gives you a data frame with a single row, which is then extracted and converted to an array.
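Broken into separate steps, the one-liner above looks roughly like the sketch below. The names mask_df, mask and result are only introduced here for illustration; everything else is taken from the code above.

```julia
using DataFrames

df = DataFrame(KFC = [0, 0, 0], Mc = [1, 2, 3], Piz = [4, 5, 6])

# Step 1: one Bool per column, true if the column has any non-zero element.
mask_df = mapcols(col -> any(col .!= 0), df)   # 1×3 DataFrame

# Step 2: take the single row and turn it into a plain vector.
mask = Array(mask_df[1, :])                    # [false, true, true]

# Step 3: index the columns with the Bool vector.
result = df[:, mask]                           # keeps Mc and Piz
```

Because the indexing uses df[:, mask], result is a new data frame and df itself is left unchanged.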
CodePudding user response:
An alternative to what @niczky12 proposed is:
julia> select(df, all.(!=(0), eachcol(df)))
3×2 DataFrame
 Row │ Mc     Piz
     │ Int64  Int64
─────┼──────────────
   1 │     1      4
   2 │     2      5
   3 │     3      6
(The condition is a bit different from what @niczky12 proposed, as I understand you want to drop a column if all its elements are 0, but this is a detail. I understand you are asking about the general approach.)
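As for the error in the question: append! adds the elements of the column to the list, so for KFC the list becomes [0, 0, 0], and Not([0, 0, 0]) then asks for the columns at index 0. That is why the message says "at index [0, 0, 0]". Below is a minimal sketch of a fixed loop that stores column names instead. The name del_empty_cols, the extra column Sub, and the switch from select! to the non-mutating select are choices made for this example, not something from the original post.

```julia
using DataFrames

# Hypothetical rewrite of delEmptyCol from the question.
function del_empty_cols(df::AbstractDataFrame)
    empty_cols = String[]
    for name in names(df)
        # `all(==(0), col)` is true only when every element is 0;
        # `sum(col) == 0` would also match a column such as [-1, 1].
        if all(==(0), df[!, name])
            push!(empty_cols, name)   # store the column name, not its values
        end
    end
    return select(df, Not(empty_cols))
end

# Sub is a made-up column that mixes zeros and non-zeros.
df3 = DataFrame(KFC = [0, 0, 0], Mc = [1, 2, 3], Sub = [0, 7, 0])

names(del_empty_cols(df3))                     # ["Mc", "Sub"]
names(select(df3, all.(!=(0), eachcol(df3))))  # ["Mc"]
```

The last two lines show where the two conditions differ: a column that contains some zeros is kept when the test is "all elements are 0", but dropped when the test is "all elements are non-zero".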