I have the following dataframe:
using DataFrames
df = DataFrame([:a=>["1","1","2"], :b=>["a","b","a"], :c=>[123,321,100]])
Is it possible to pivot it such that each row/column is a distinct value? I.e., pivot it to the following
     "a"  "b"
"1"  123  321
"2"  100  missing
CodePudding user response:
This example is given almost literally in the help text for the unstack function. The only caveat is the confusion caused by the column names a, b, c in the example, which overlap with the values in df.b. Renaming the columns to col_a, col_b, col_c to avoid confusion, we have:
julia> df = DataFrame([:col_a=>["1","1","2"], :col_b=>["a","b","a"], :col_c=>[123,321,100]])
3×3 DataFrame
 Row │ col_a   col_b   col_c
     │ String  String  Int64
─────┼───────────────────────
   1 │ 1       a         123
   2 │ 1       b         321
   3 │ 2       a         100
julia> unstack(df, :col_b, :col_c, fill=missing)
2×3 DataFrame
 Row │ col_a   a       b
     │ String  Int64?  Int64?
─────┼─────────────────────────
   1 │ 1          123      321
   2 │ 2          100  missing
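If you would rather keep the original column names a, b, c, one option is to give the new columns a prefix. This is only a sketch: it assumes a DataFrames.jl version whose unstack accepts the renamecols keyword, and the val_ prefix is an arbitrary choice made for illustration.

```julia
using DataFrames

df = DataFrame([:a=>["1","1","2"], :b=>["a","b","a"], :c=>[123,321,100]])

# Row key :a, column key :b, values taken from :c.
# renamecols maps each distinct value of :b to the name of a new column,
# so the new columns cannot clash with the existing column :a.
wide = unstack(df, :a, :b, :c; renamecols = x -> "val_" * x)
```

Here the row key is passed explicitly as the second argument, which makes it clearer which column ends up identifying the rows.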
stack and unstack are magical functions, and after a while one gets a sense of when they should be used to solve seemingly impossible table-juggling tasks.
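To get a feel for how the two functions relate, here is a small sketch that goes wide with unstack and then back to long with stack. The variable names wide, long and roundtrip are just for illustration, and the final sort and dropmissing are only there to make the result comparable with the starting table.

```julia
using DataFrames

df = DataFrame([:col_a=>["1","1","2"], :col_b=>["a","b","a"], :col_c=>[123,321,100]])

# Long to wide: one row per col_a, one column per distinct value of col_b.
wide = unstack(df, :col_a, :col_b, :col_c)

# Wide to long: melt columns a and b back into (col_b, col_c) pairs,
# keeping col_a as the id column.
long = stack(wide, [:a, :b], :col_a; variable_name=:col_b, value_name=:col_c)

# The ("2", "b") cell was never in the original data, so drop that missing row
# and restore the original row order.
roundtrip = sort(dropmissing(long, :col_c), [:col_a, :col_b])
```

After the round trip, roundtrip should hold the same three rows as df, while long has four rows because it still contains the missing cell.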