Compare elements in Julia DataFrame and return value in new column

Time:01-10

I need to compare the elements of columns c1 and c2 row by row in a DataFrame and return the higher value in a new column.

The "Result" column should contain [6,5,4,4,5].

using DataFrames

df = DataFrame(c1=[1,2,3,4,5], c2=[6,5,4,3,2])
println(df)

if broadcast(.>, df.c1, df.c2)
    df[:, "Result"] .= df.c1
else
    df[:, "Result"] .= df.c2
end

println(df)

ERROR: TypeError: non-boolean (BitVector) used in boolean context

CodePudding user response:

An alternative is:

julia> df.Result = max.(df.c1, df.c2)
5-element Vector{Int64}:
 6
 5
 4
 4
 5

(Some users prefer this style to the higher-order functions presented excellently by @Shanyan.)
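For context, the TypeError in the question arises because `if` expects a single Bool, whereas the broadcast comparison yields a BitVector with one Bool per row. A minimal sketch of an elementwise version of that branch, using plain vectors in place of the DataFrame columns (the names `mask` and `result` are illustrative only):

```julia
c1 = [1, 2, 3, 4, 5]
c2 = [6, 5, 4, 3, 2]

# One Bool per row; passing this to `if` raises the TypeError from the question
mask = c1 .> c2

# Broadcast ifelse: take c1 where mask is true, c2 otherwise
result = ifelse.(mask, c1, c2)
```

With the DataFrame from the question, the same idea would read `df.Result = ifelse.(df.c1 .> df.c2, df.c1, df.c2)`.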

CodePudding user response:

Using eachrow

julia> maximum.(eachrow(df))
5-element Vector{Int64}:
 6
 5
 4
 4
 5

or as a DataFrame

julia> DataFrame(new = maximum.(eachrow(df)))
5×1 DataFrame
 Row │ new   
     │ Int64 
─────┼───────
   1 │     6
   2 │     5
   3 │     4
   4 │     4
   5 │     5

or as a new column of the DataFrame

julia> df.Result = maximum.(eachrow(df))

julia> df
5×3 DataFrame
 Row │ c1     c2     Result 
     │ Int64  Int64  Int64  
─────┼──────────────────────
   1 │     1      6       6
   2 │     2      5       5
   3 │     3      4       4
   4 │     4      3       4
   5 │     5      2       5
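Note that `maximum.(eachrow(df))` takes the maximum over every column in each row. A hedged sketch of restricting it to the two columns being compared, assuming a hypothetical frame that also carries a non-numeric `name` column (not part of the question):

```julia
using DataFrames

# Hypothetical frame with an extra String column, for illustration only
df = DataFrame(name=["a", "b", "c", "d", "e"],
               c1=[1, 2, 3, 4, 5],
               c2=[6, 5, 4, 3, 2])

# Select only c1 and c2 before iterating over rows, so `name` is ignored
df.Result = maximum.(eachrow(df[!, [:c1, :c2]]))
```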

CodePudding user response:

You can use select:

julia> select(df, All() => ByRow(max) => :Result)
5×1 DataFrame
 Row │ Result 
     │ Int64  
─────┼────────
   1 │      6
   2 │      5
   3 │      4
   4 │      4
   5 │      5

Another alternative is using DataFramesMeta.jl:

julia> @select(df, :Result = max.(:c1, :c2))
5×1 DataFrame
 Row │ Result 
     │ Int64  
─────┼────────
   1 │      6
   2 │      5
   3 │      4
   4 │      4
   5 │      5

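# Caution: the following form, which tries to avoid typing the column names, does not compute a row-wise maximum:
@select(df, :Result = $(max.(propertynames(df)...)))
# max.(:c1, :c2) compares the column names themselves as Symbols and returns :c2, so Result is just a copy of c2.
# To avoid naming the columns, use the All() => ByRow(max) form above.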

If you want to make the change in place, use select! or @select!.
If you prefer the returned DataFrame to keep the c1 and c2 columns as well, use transform (its in-place counterpart is transform!):

julia> transform(df, All() => ByRow(max) => :Result)
5×3 DataFrame
 Row │ c1     c2     Result 
     │ Int64  Int64  Int64  
─────┼──────────────────────
   1 │     1      6       6
   2 │     2      5       5
   3 │     3      4       4
   4 │     4      3       4
   5 │     5      2       5

# And the same thing using DataFramesMeta.jl:
julia> @transform(df, :Result = max.(:c1, :c2))
5×3 DataFrame
 Row │ c1     c2     Result 
     │ Int64  Int64  Int64  
─────┼──────────────────────
   1 │     1      6       6
   2 │     2      5       5
   3 │     3      4       4
   4 │     4      3       4
   5 │     5      2       5
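The difference between the in-place variants mentioned above can be sketched as follows. This is only an illustration on the question's data; the variable `slim` is a made-up name.

```julia
using DataFrames

df = DataFrame(c1=[1, 2, 3, 4, 5], c2=[6, 5, 4, 3, 2])

# transform! mutates df and keeps c1 and c2; naming the source columns
# explicitly means any other columns in df are ignored
transform!(df, [:c1, :c2] => ByRow(max) => :Result)

# select! also mutates its argument, but keeps only the requested columns
slim = copy(df)
select!(slim, :Result)
```

After this runs, `df` holds c1, c2 and Result, while `slim` holds only Result.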