I get the following error message:
pyspark.sql.utils.AnalysisException: It is not allowed to use a window function inside an aggregate function. Please use the inner window function in a sub-query.
How can I transform the code below so that it works?
Master_Table_All_2 = Master_Table_All.withColumn(
    "cumulative_paid_in_target_currency_period",
    F.sum(
        Master_Table_All.damage_amount_target_currency_in_period.over(
            Window.partitionBy("key").orderBy("date_end").rowsBetween(-sys.maxsize, 0)
        )
    ),
)
CodePudding user response:
The window must be applied to the result of the aggregate, i.e. F.sum(col).over(window), not to the column inside the aggregate, which is what triggers the AnalysisException. It is also preferable to express an unbounded frame with Window.unboundedPreceding rather than -sys.maxsize. Try the following:
from pyspark.sql import functions as F
from pyspark.sql.window import Window

Master_Table_All_2 = Master_Table_All.withColumn(
    "cumulative_paid_in_target_currency_period",
    # apply .over() to the aggregate expression, not inside F.sum()
    F.sum("damage_amount_target_currency_in_period").over(
        Window.partitionBy("key")
        .orderBy("date_end")
        .rowsBetween(Window.unboundedPreceding, Window.currentRow)
    ),
)
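For reference, here is a minimal self-contained sketch of the same pattern on toy data. The DataFrame name and sample values are hypothetical, chosen only to mirror the schema implied by the question (key, date_end, and the per-period amount column):

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample rows matching the question's column names
df = spark.createDataFrame(
    [
        ("A", "2021-01-31", 100.0),
        ("A", "2021-02-28", 50.0),
        ("B", "2021-01-31", 200.0),
    ],
    ["key", "date_end", "damage_amount_target_currency_in_period"],
)

# Running total per key, ordered by period end date
w = (
    Window.partitionBy("key")
    .orderBy("date_end")
    .rowsBetween(Window.unboundedPreceding, Window.currentRow)
)

df.withColumn(
    "cumulative_paid_in_target_currency_period",
    F.sum("damage_amount_target_currency_in_period").over(w),
).show()
# key A accumulates 100.0 -> 150.0; key B stays at 200.0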