Pivoting DataFrame

Time:11-01

I have the following dataframe

using DataFrames
df = DataFrame([:a=>["1","1","2"], :b=>["a","b","a"], :c=>[123,321,100]])

Is it possible to pivot it so that the distinct values of a become the rows and the distinct values of b become the columns? I.e., pivot it to the following

    "a" "b"
"1" 123 321
"2" 100 missing

CodePudding user response:

This example appears almost verbatim in the help text for the unstack function. The only caveat is that the column names a, b, c in the question overlap with the values in df.b, which is confusing. Renaming the columns to col_a, col_b, col_c to avoid that confusion, we have:

julia> df = DataFrame([:col_a=>["1","1","2"], :col_b=>["a","b","a"], :col_c=>[123,321,100]])
3×3 DataFrame
 Row │ col_a   col_b   col_c 
     │ String  String  Int64 
─────┼───────────────────────
   1 │ 1       a         123
   2 │ 1       b         321
   3 │ 2       a         100

julia> unstack(df, :col_b, :col_c, fill=missing)
2×3 DataFrame
 Row │ col_a   a       b       
     │ String  Int64?  Int64?  
─────┼─────────────────────────
   1 │ 1          123      321
   2 │ 2          100  missing 

stack and unstack are powerful functions, and after a while one gets a sense of when to use them to solve seemingly impossible table-juggling tasks.
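
To make the relationship between the two functions concrete, here is a minimal sketch of a round trip, assuming DataFrames.jl 1.x. It passes the row key explicitly to unstack (the call above lets it default to the remaining columns), then uses stack to go back to the long layout. The variable names wide and long are only illustrative.

```julia
using DataFrames

df = DataFrame(col_a = ["1", "1", "2"],
               col_b = ["a", "b", "a"],
               col_c = [123, 321, 100])

# Long -> wide: col_a identifies the rows, the values of col_b become
# the new column names, and col_c supplies the cell values.
wide = unstack(df, :col_a, :col_b, :col_c)

# Wide -> long: fold columns a and b back into a name column and a value column.
long = stack(wide, [:a, :b]; variable_name = :col_b, value_name = :col_c)

# The ("2", "b") combination never existed in df, so it comes back as missing;
# dropping it recovers the original three rows (possibly in a different order).
long = dropmissing(long, :col_c)
sort!(long, [:col_a, :col_b])
```

The explicit three-argument form of unstack is worth preferring once a table has more columns, since it states which column keys the rows instead of relying on the default.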
