I'm using Spark 2.3 with Scala 2.11.8.
I have a DataFrame as below, where x1 and x2 are the types, and their individual counts are in the columns x1cnt and x2cnt.
The expected DataFrame, shown below, needs a column 'type' containing x1 and x2 for each record, and a column 'count' with the respective count.
The example has only two types, but there will be more.
Input DataFrame:
+----+----+-----+-----+
|col1|col2|x1cnt|x2cnt|
+----+----+-----+-----+
|   1|  17|    2|    4|
|   1|  21|    0|    6|
|   1| 917|    0|    8|
|   1|   1|   35|   55|
|   1| 901|    0|    0|
|   1| 902|    0|   74|
+----+----+-----+-----+
Expected DataFrame:
+----+----+----+-----+
|col1|col2|type|count|
+----+----+----+-----+
|   1|  17|  x1|    2|
|   1|  17|  x2|    4|
|   1|  21|  x1|    0|
|   1|  21|  x2|    6|
|   1| 917|  x1|    0|
|   1| 917|  x2|    8|
|   1|   1|  x1|   35|
|   1|   1|  x2|   55|
|   1| 901|  x1|    0|
|   1| 901|  x2|    0|
|   1| 902|  x1|    0|
|   1| 902|  x2|   74|
+----+----+----+-----+
Any help is appreciated.
CodePudding user response:
The STACK function acts like a reverse PIVOT. Note that the stack expression must be aliased so the generated columns are named type and count:

SELECT
  col1
  , col2
  , stack(2, 'x1', x1cnt, 'x2', x2cnt) AS (type, count)
FROM
  table;
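
If you prefer the DataFrame API over registering a SQL table, the same stack expression works through selectExpr. A minimal sketch, assuming the input lives in a DataFrame named df (the sample rows below are taken from the question):

```scala
import org.apache.spark.sql.SparkSession

object UnpivotExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("unpivot-stack")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Input DataFrame matching the question's shape
    val df = Seq(
      (1, 17, 2, 4),
      (1, 21, 0, 6),
      (1, 917, 0, 8)
    ).toDF("col1", "col2", "x1cnt", "x2cnt")

    // stack(n, label1, value1, ..., labelN, valueN) emits one output row
    // per (label, value) pair, so each input row becomes n rows.
    val result = df.selectExpr(
      "col1",
      "col2",
      "stack(2, 'x1', x1cnt, 'x2', x2cnt) as (type, count)"
    )

    result.show()
    spark.stop()
  }
}
```

Adding more types is just a matter of raising the first argument of stack and appending further ('label', column) pairs.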