Julia DataFrame, ArgumentError: broadcasting over `DataFrameRow`s is reserved

Time:06-14

If I want to add a Vector to each column of a DataFrame, I can broadcast it. If I instead loop over the rows of the same DataFrame and try to broadcast across the columns of each row, I get this error:

ArgumentError: broadcasting over `DataFrameRow`s is reserved

This would be a minimal (non-) working example:

using DataFrames

d = DataFrame()

#create the list with column names
columnlist = [:a, :b, :c, :d]

#each column is filled with [0, 0, ...] of length 10
for col in columnlist
    d[!, col] = zeros(10)
end

#then I want to add a "1" to each entry
#this does work
d[!, columnlist] .+= ones(10)

#this does produce the error mentioned above
for row in 1:nrow(d)
    dfrow = d[row, :]
    dfrow[columnlist] .+= ones(4) #ArgumentError: broadcasting over `DataFrameRow`s is reserved
end

#this also does not work, with the same error
for row in 1:nrow(d)
    d[row, columnlist] .+= ones(4) #ArgumentError: broadcasting over `DataFrameRow`s is reserved
end

Is that the expected behaviour, or a bug? Or am I overlooking something obvious? And is there a way to broadcast to all columns of a DataFrameRow? (I am aware that Julia is column-major, but I need to do some heavy interpolation, including I/O, once per row, so iterating by row is faster in my case.)

I am using Julia 1.7.2 and DataFrames v1.3.4 on Ubuntu 20.04.4.

CodePudding user response:

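This is expected. The reason is that Base Julia has not yet decided how to handle broadcasting over a NamedTuple, and a `DataFrameRow` is the data frame analogue of a NamedTuple, so broadcasting over it is reserved as well: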

julia> nt = (a=1, b=2, c=3)
(a = 1, b = 2, c = 3)

julia> nt .+ 1
ERROR: ArgumentError: broadcasting over dictionaries and `NamedTuple`s is reserved
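
As a small illustration of that Base behaviour, the sketch below reproduces the error on a plain NamedTuple and shows one element-wise alternative that avoids broadcasting. Using `map` here is an assumption added for illustration; it is not something the error message itself prescribes.

```julia
nt = (a = 1, b = 2, c = 3)

# Broadcasting over a NamedTuple is reserved, so this throws an ArgumentError.
err = try
    nt .+ 1
catch e
    e
end
println(typeof(err))

# `map` applies the function field by field and returns a new NamedTuple.
shifted = map(x -> x + 1, nt)
println(shifted)
```

The field names are preserved by `map`, which is why it can stand in for broadcasting when only a NamedTuple is involved.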

If you want to use broadcasting, the way to do it (until Base Julia decides what to do with NamedTuples) is to select a one-row range instead of a single row:

for row in 1:nrow(d)
    d[row:row, columnlist] .+= ones(1, 4)
end

Alternatively, skip broadcasting and use another method to get the same result:

for row in 1:nrow(d)
    foreach(col -> d[row, col] += 1, columnlist)
end
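
To check the two workarounds end to end, here is a minimal self-contained sketch that rebuilds the 10-row frame from the question and runs both loops in sequence. The variable `all_two` is a hypothetical name introduced only for the final check, and the script assumes DataFrames.jl is installed.

```julia
using DataFrames

columnlist = [:a, :b, :c, :d]

# Build the 10-row frame of zeros from the question.
d = DataFrame()
for col in columnlist
    d[!, col] = zeros(10)
end

# Workaround 1: index with a one-row range so the target is a
# one-row data frame rather than a DataFrameRow.
for row in 1:nrow(d)
    d[row:row, columnlist] .+= ones(1, 4)
end

# Workaround 2: update each cell with plain scalar indexing.
for row in 1:nrow(d)
    foreach(col -> d[row, col] += 1, columnlist)
end

# Each entry was incremented once per loop, starting from zero.
all_two = all(Matrix(d) .== 2.0)
println(all_two)
```

Since each loop adds 1 to every cell, every entry should end at 2.0 if both workarounds behave as described.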

Note: this answers the question as asked. I have not focused on making your code as efficient as possible. If you have a performance issue, please ask another question specifying exactly what the problem is, and I will try to help.
