Pivoting streaming dataframes without aggregation in pyspark

Time:02-05

ID  type   value
A   car    camry
A   price  20000
B   car    tesla
B   price  40000

Example dataframe that is being streamed.

I need the output to look like this. Does anyone have suggestions?

ID  car    price
A   camry  20000
B   tesla  40000

What's a good way to transform this? I have been researching pivoting, but it requires an aggregation, which I don't need.

Answer:

You could filter the frame (df) twice, once per type, and join the two results on ID:

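(
    # rows with type "car": rename value -> car
    df.filter(df.type == "car").withColumnRenamed("value", "car")
    .join(
        # rows with type "price": rename value -> price
        df.filter(df.type == "price").withColumnRenamed("value", "price"),
        on="ID",
    )
    # keep only the pivoted columns (drops the leftover "type" columns)
    .select("ID", "car", "price")
)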
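
To sanity-check what the filter-and-join produces on the sample rows from the question, here is a minimal plain-Python sketch. It does not use Spark at all; `rows` and the `filter_and_join` helper are hypothetical names made up for illustration, and the values are kept as strings as an assumption about the `value` column. It only mimics the two filters and the inner join on ID:

```python
# Sample rows from the question: (ID, type, value)
rows = [
    ("A", "car", "camry"),
    ("A", "price", "20000"),
    ("B", "car", "tesla"),
    ("B", "price", "40000"),
]


def filter_and_join(rows):
    """Mimic filtering by type twice, then an inner join on ID."""
    # "Filter" once per type, keeping (ID, value) pairs
    cars = [(id_, value) for id_, type_, value in rows if type_ == "car"]
    prices = [(id_, value) for id_, type_, value in rows if type_ == "price"]
    # Inner join: one output row per matching (car, price) pair with the same ID
    return [
        (car_id, car, price)
        for car_id, car in cars
        for price_id, price in prices
        if car_id == price_id
    ]


result = filter_and_join(rows)
for row in result:
    print(row)
```

Because this is an inner join, an ID that has only a car row or only a price row is dropped from the result, and an ID with several rows of the same type yields one output row per combination.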