Spark Dataframe timestamp column manipulation failing without any error message

Time:12-08

aggregate = aggregate.withColumn('DaysSinceFirstUsage', when(months_between(current_date(), col('FirstUsage')) > 120, - (sys.maxsize - 1)).otherwise(days_between(current_date(), col('FirstUsage')))

aggregate = aggregate.withColumn('DaysSinceLastUsage', when(months_between(current_date(), col('LastUsage')) > 120, - (sys.maxsize - 1)).otherwise(days_between(current_date(), col('LastUsage')))

CodePudding user response:

Silly mistake :) The closing parenthesis at the end of each statement was missing, and datediff was wrongly written as days_between. The query runs fine after the correction.

import sys
from pyspark.sql.functions import col, current_date, datediff, months_between, when
aggregate = aggregate.withColumn('DaysSinceFirstUsage', when(months_between(current_date(), col('FirstUsage')) > 120, - (sys.maxsize - 1)).otherwise(datediff(current_date(), col('FirstUsage'))))
aggregate = aggregate.withColumn('DaysSinceLastUsage', when(months_between(current_date(), col('LastUsage')) > 120, - (sys.maxsize - 1)).otherwise(datediff(current_date(), col('LastUsage'))))
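
To sanity-check the values these columns should hold without starting a Spark session, the same rule can be sketched in plain Python. This is only an illustration, not Spark code: the helper names approx_months_between and days_since are made up here, the dates in the usage note are hypothetical, and the month calculation is an approximation of Spark's months_between for date-only inputs (whole months when the day of month matches or both dates are month ends, otherwise a fraction based on a 31-day month).

```python
import calendar
import sys
from datetime import date

# Same sentinel as in the snippet above; on a 64-bit build it fits in a Spark long.
SENTINEL = -(sys.maxsize - 1)


def approx_months_between(end: date, start: date) -> float:
    """Approximate Spark's months_between for plain dates (hypothetical helper)."""
    whole = (end.year - start.year) * 12 + (end.month - start.month)
    end_is_month_end = end.day == calendar.monthrange(end.year, end.month)[1]
    start_is_month_end = start.day == calendar.monthrange(start.year, start.month)[1]
    # Same day of month, or both at month end: a whole number of months.
    if end.day == start.day or (end_is_month_end and start_is_month_end):
        return float(whole)
    # Otherwise add the day difference as a fraction of a 31-day month.
    return whole + (end.day - start.day) / 31.0


def days_since(usage: date, today: date) -> int:
    """Return the sentinel when usage is more than 120 months old, else the day count."""
    if approx_months_between(today, usage) > 120:
        return SENTINEL
    return (today - usage).days
```

For example, with a hypothetical "today" of date(2022, 12, 8), days_since(date(2022, 12, 1), date(2022, 12, 8)) gives 7, while a first usage of date(2010, 1, 1) is more than 120 months old and gives the sentinel instead.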