How can I add a real-time timestamp column to a Spark DataFrame while writing from Kafka to a database using Spark Structured Streaming? I want the exact timestamp at which each row was written to the DB.
CodePudding user response:
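Something like this should work. A streaming DataFrame has to be written with writeStream rather than write, and current_timestamp() is evaluated once per query, so every row in a micro-batch gets the same value:
from pyspark.sql import functions as F

def write_batch(batch_df, batch_id):
    # Stamp the micro-batch just before handing it to the database writer
    batch_df.withColumn("written_at", F.current_timestamp()) \
        .write.format("your database").mode("append").save()

df.writeStream.foreachBatch(write_batch).start()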
CodePudding user response:
Add a timestamp column with a DEFAULT value of NOW() to the database table schema. Spark cannot produce the exact time at which a row was actually written to the database, only the time at which the executor received a batch for the database writer.
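A minimal sketch of the database-side default, using an in-memory SQLite database as a stand-in for the real target. The table name events and the columns payload and written_at are made up for illustration, and SQLite spells the default as CURRENT_TIMESTAMP rather than NOW():

```python
import sqlite3

# In-memory SQLite stands in for the real target database (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table: written_at is filled in by the database at insert time.
conn.execute(
    """
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,
        written_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
    """
)

# The writer only sends the payload; it never supplies written_at.
conn.executemany("INSERT INTO events (payload) VALUES (?)", [("a",), ("b",)])
conn.commit()

rows = conn.execute(
    "SELECT payload, written_at FROM events ORDER BY id"
).fetchall()
for payload, written_at in rows:
    print(payload, written_at)
```

On the Spark side the idea would be to leave the timestamp column out of the DataFrame being written, so that the database default applies. The exact default syntax and the timestamp precision depend on the database, so check them for your target.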