"Column is not iterable" error while calling a function from another imported notebook-CodePudding

In notebook1:

from pyspark.sql.functions import current_timestamp

def add_ingest_date(input_df):
    output_df = input_df.withColumnRenamed("somecolumn", current_timestamp())
    return output_df

In notebook2:

%run "../common_functions/add_ingested_date"
final_df = add_ingest_date(input_df)

I get a "Column is not iterable" error.

CodePudding user response：

You are trying to rename the column 'somecolumn' to current_timestamp(). That does not work: withColumnRenamed expects the new name as a string, but current_timestamp() returns a Column, which is what triggers the error. If you want to rename a column, pass a string as the new name. For example:

output_df = input_df.withColumnRenamed("somecolumn", "newColumnName")

If you instead want to add a new column holding the current timestamp, use withColumn, which takes a column name and a Column expression (it replaces the column if one with that name already exists):

output_df = input_df.withColumn("somecolumn", current_timestamp())