In notebook1:
from pyspark.sql.functions import current_timestamp
def add_ingest_date(input_df):
    output_df = input_df.withColumnRenamed("somecolumn", current_timestamp())
    return output_df
In notebook2:
%run "../common_functions/add_ingested_date"
final_df = add_ingest_date(input_df)
When I run this, I get a "Column is not iterable" error.
CodePudding user response:
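You are trying to rename the column 'somecolumn' to current_timestamp(). That cannot work: withColumnRenamed expects two strings (the existing column name and the new name), but current_timestamp() returns a Column object, which is what triggers the error. If you want to change a column's name, pass a string as the new name. For example: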
output_df = input_df.withColumnRenamed("somecolumn", "newColumnName")
If you want to add a new column that holds the current timestamp, use withColumn instead. It takes the column name and a Column expression (note that if a column with that name already exists, it is replaced):
output_df = input_df.withColumn("somecolumn", current_timestamp())