I need to compare elements by rows of c1 and c2 columns in the DataFrame and return higher value in new column. Column "Result" should return [6,5,4,4,5]
df = DataFrame(c1=[1,2,3,4,5], c2=[6,5,4,3,2])
println(df)
if broadcast(.>, df.c1, df.c2)
df[:, "Result"] .= df.c1
else
df[:, "Result"] .= df.c2
end
println(df5)
ERROR: TypeError: non-boolean (BitVector) used in boolean context
CodePudding user response:
An alternative is:
julia> df.Result = max.(df.c1, df.c2)
5-element Vector{Int64}:
6
5
4
4
5
(as some users prefer such code than higher order functions presented excellently by @Shanyan)
CodePudding user response:
Using eachrow
julia> maximum.(eachrow(df))
5-element Vector{Int64}:
6
5
4
4
5
or as DataFrame
julia> DataFrame(new = maximum.(eachrow(df)))
5×1 DataFrame
Row │ new
│ Int64
─────┼───────
1 │ 6
2 │ 5
3 │ 4
4 │ 4
5 │ 5
or as a new column the DataFrame
julia> df.Result = maximum.(eachrow(df))
julia> df
5×3 DataFrame
Row │ c1 c2 Result
│ Int64 Int64 Int64
─────┼──────────────────────
1 │ 1 6 6
2 │ 2 5 5
3 │ 3 4 4
4 │ 4 3 4
5 │ 5 2 5
CodePudding user response:
You can use select
:
julia> select(df, All() => ByRow(max) => :Result)
5×1 DataFrame
Row │ Result
│ Int64
─────┼────────
1 │ 6
2 │ 5
3 │ 4
4 │ 4
5 │ 5
Another alternative is using DataFramesMeta.jl
:
julia> @select(df, :Result = max.(:c1, :c2))
5×1 DataFrame
Row │ Result
│ Int64
─────┼────────
1 │ 6
2 │ 5
3 │ 4
4 │ 4
5 │ 5
# Alternatively, you can use the following line to avoid mentioning the column names manually:
@select(df, :Result = $(max.(propertynames(df)...)))
# Gives the same result.
If you want to make the change in place, then use select!
or @select!
.
If you prefer the returned dataframe to contain c1
and c2
columns as well, then you can go for transform
(its alternative in-place operator is transform!
):
julia> transform(df, All() => ByRow(max) => :Result)
5×3 DataFrame
Row │ c1 c2 Result
│ Int64 Int64 Int64
─────┼──────────────────────
1 │ 1 6 6
2 │ 2 5 5
3 │ 3 4 4
4 │ 4 3 4
5 │ 5 2 5
# And the same thing using DataFramesMeta.jl:
julia> @transform(df, :Result = $(max.(propertynames(df)...)))
5×3 DataFrame
Row │ c1 c2 Result
│ Int64 Int64 Int64
─────┼──────────────────────
1 │ 1 6 6
2 │ 2 5 5
3 │ 3 4 4
4 │ 4 3 3
5 │ 5 2 2