Home > database >  How to convert 1 row 4 columns dataframe to 4 rows 2 columns dataframe in pyspark or sql
How to convert 1 row 4 columns dataframe to 4 rows 2 columns dataframe in pyspark or sql

Time:11-23

I have a dataframe which returns the output as

enter image description here

I would like to transpose this into

enter image description here

Can someone help to understand how to prepare the pyspark code to achieve this result dynamically. I have tried Unpivot in sql but no luck.

CodePudding user response:

df =spark.createDataFrame([
(78,20,19,90),
],
('Machines',  'Books',  'Vehicles', 'Plants'))

Create a new array of struct column that combines column names and value names. Use the magic inline to explode the struct field. Code below

df.withColumn('tab', F.array(*[F.struct(lit(x).alias('Fields'), col(x).alias('Count')).alias(x) for x in df.columns])).selectExpr('inline(tab)').show()


 -------- ----- 
|  Fields|Count|
 -------- ----- 
|Machines|   78|
|   Books|   20|
|Vehicles|   19|
|  Plants|   90|
 -------- ----- 
  • Related