Say I have a Jupyter notebook with the following cell:
%%julia
using Pkg
Pkg.add("DecisionTree")
using DecisionTree
X = Vector([1.1,2.2,3.3])
Y = Vector([1.1,2.2,3.3])
X = reshape(X, size(X))
X = Float32.(X)
Y = Float32.(Y)
print(typeof(X))
print(typeof(Y))
model = DecisionTree.build_forest(Y, X')
As far as I know, DecisionTree.jl uses multithreading, which PyCall does not support, and this results in the following error:
RuntimeError: <PyCall.jlwrap (in a Julia function called from Python)
JULIA: TaskFailedException
Stacktrace:
[1] wait
@ .\task.jl:334 [inlined]
[2] threading_run(func::Function)
@ Base.Threads .\threadingconstructs.jl:38
[3] macro expansion
@ .\threadingconstructs.jl:97 [inlined]
[4] build_forest(labels::Vector{Float32}, features::LinearAlgebra.Adjoint{Float32, Vector{Float32}}, n_subfeatures::Int64, n_trees::Int64, partial_sampling::Float64, max_depth::Int64,
My question is: is there any way to make it work after all?
CodePudding user response:
nested task error: BoundsError: attempt to access 1×3 adjoint(::Vector{Float32}) with eltype Float32 at index [[2, 1], 1:3]
Stacktrace:
[1] throw_boundserror(A::LinearAlgebra.Adjoint{Float32, Vector{Float32}}, I::Tuple{Vector{Int64}, Base.Slice{Base.OneTo{Int64}}})
@ Base ./abstractarray.jl:691
This is the error you get if you run the same code directly in Julia.
CodePudding user response:
The problem has nothing to do with calling it from Python. It comes from the fact that you are trying to build a model where the features are a single record with 3 dimensions (X' is a 1×3 adjoint) while the labels are a vector of 3 records. DecisionTree.jl expects the labels to be a vector of length nRecords and the features to be an nRecords by nDimensions matrix.
For example:
julia> X = [1.1,2.2,3.3]
3-element Vector{Float64}:
1.1
2.2
3.3
julia> Y = [1.1,2.2,3.3]
3-element Vector{Float64}:
1.1
2.2
3.3
julia> X = reshape(X,3,1) # reshape to a single column **matrix**
3×1 Matrix{Float64}:
1.1
2.2
3.3
julia> model = DecisionTree.build_forest(Y, X)
Ensemble of Decision Trees
Trees: 10
Avg Leaves: 1.0
Avg Depth: 0.0
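The shape mismatch described above can be checked without DecisionTree.jl at all. The following is a minimal sketch using only Base Julia; the helper shapes_match is a hypothetical name introduced here for illustration and is not part of DecisionTree.jl.

```julia
# Base Julia only; DecisionTree.jl is not needed to inspect the shapes.
X = Float32[1.1, 2.2, 3.3]   # 3 records, 1 feature each
Y = Float32[1.1, 2.2, 3.3]   # 3 labels

Xt = X'                      # what the question passes: 1 record x 3 features
Xm = reshape(X, :, 1)        # what is needed: 3 records x 1 feature

println(size(Xt))            # (1, 3)
println(size(Xm))            # (3, 1)

# Hypothetical helper (not a DecisionTree.jl function):
# the number of feature rows must equal the number of labels.
shapes_match(labels, features) = size(features, 1) == length(labels)

println(shapes_match(Y, Xt)) # false
println(shapes_match(Y, Xm)) # true
```

Using reshape(X, :, 1) instead of reshape(X, 3, 1) lets Julia infer the number of rows, so the same line works for any number of records.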
Also, to make a vector you don't need to specify "Vector"; the literal [1.1,2.2,3.3] is already a vector. I suggest having a look at my tutorial on Julia or at my course on Scientific Programming and Machine Learning with Julia (I completed it just a couple of days ago and still need to "clean" it before announcing it).
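On the point about vector literals, a short sketch in Base Julia shows that the explicit Vector call adds nothing, and that a typed literal can replace the separate Float32 conversion used in the question. The variable names are illustrative only.

```julia
a = Vector([1.1, 2.2, 3.3])      # explicit constructor, as in the question
b = [1.1, 2.2, 3.3]              # plain literal, already a Vector{Float64}
println(typeof(a) == typeof(b))  # true

c = Float32[1.1, 2.2, 3.3]       # typed literal builds a Vector{Float32} directly
println(c == Float32.(b))        # true
```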