Dynamically Pivot/Transpose Rows to Columns in Hive/Spark

Time:02-17

I have quarterly data, and the data keeps growing as new quarters are added:

qtr       dimvalue  percentage
FY2019-Q1   XYZ         15
FY2019-Q1   ABC         80
FY2019-Q1   PPP         5
FY2019-Q2   XYZ         10
FY2019-Q2   ABC         70
FY2019-Q2   PPP         20

While the number of quarters is small, I have been manually editing the query each time to transpose the rows, like this:

SELECT dimvalue,
       SUM(Quater_1) AS Quater_1,
       SUM(Quater_2) AS Quater_2
FROM (
  SELECT dimvalue,
         CASE WHEN qtr = 'FY2019-Q1' THEN percentage END AS Quater_1,
         CASE WHEN qtr = 'FY2019-Q2' THEN percentage END AS Quater_2
  FROM schema.table
  WHERE qtr IN ('FY2019-Q1', 'FY2019-Q2')
) t1
GROUP BY dimvalue;


dimvalue   Quater_1   Quater_2
XYZ          15         10
ABC          80         70 
PPP          5          20
 

But my question is: how can I achieve this in a dynamic and more robust way, transposing rows into columns while handling the growing number of quarters and keeping proper per-quarter column names?

In short, how can I perform this with a more dynamic query, in Hive or Spark SQL, or is there another approach you would suggest?

Thanks for the help.

CodePudding user response:

You could easily do such a pivot with the Dataset API, if that is an option for you:

spark.table("schema.table").groupBy("dimvalue").pivot("qtr").sum("percentage").show

 -------- --------- --------- 
|dimvalue|FY2019-Q1|FY2019-Q2|
 -------- --------- --------- 
|     PPP|        5|       20|
|     XYZ|       15|       10|
|     ABC|       80|       70|
 -------- --------- --------- 

With plain SQL, the only way is to build the query dynamically.
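That means first fetching the distinct quarters (e.g. via `SELECT DISTINCT qtr FROM schema.table`) and then generating one conditional-aggregation column per quarter. A minimal sketch in Python; the helper name `build_pivot_sql` is hypothetical, and the quarter list is hard-coded here where it would really come from the distinct query:

```python
def build_pivot_sql(table, quarters):
    # One SUM(CASE ...) column per quarter; backticks are needed because
    # names like FY2019-Q1 contain a dash.
    cols = ",\n       ".join(
        "SUM(CASE WHEN qtr = '{q}' THEN percentage END) AS `{q}`".format(q=q)
        for q in quarters
    )
    return (
        "SELECT dimvalue,\n"
        "       {cols}\n"
        "FROM {table}\n"
        "GROUP BY dimvalue"
    ).format(cols=cols, table=table)

# In practice this list would be the result of SELECT DISTINCT qtr
quarters = ["FY2019-Q1", "FY2019-Q2", "FY2019-Q3"]
print(build_pivot_sql("schema.table", quarters))
```

The generated string can then be executed with `spark.sql(...)`, or passed to `hive -e` from a shell script.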
