Spark: Use Persistent Table as Streaming Source for Spark Structured Streaming

Time:10-03

I stored data in a table:

spark.table("default.student").show()

(1) Spark Jobs
+---+----+---+
| id|name|age|
+---+----+---+
|  1| bob| 34|
+---+----+---+

I would like to make a read stream using that table as source. I tried

newDF = spark.read.table("default.student")
newDF.isStreaming

which returns False.

Is there a way to use a table as Streaming Source?

CodePudding user response:

You need to use a Delta table as the source. For example, in a Databricks notebook:

data = spark.range(0, 5)
data.write.format("delta").mode("overwrite").saveAsTable("T1")

stream = (spark.readStream.format("delta").table("T1")
          .writeStream.format("console")
          .start())

# In another cell, append more rows so they show up in the stream:
data = spark.range(6, 10)
data.write.format("delta").mode("append").saveAsTable("T1")

You can then see both sets of data in the console output (driver logs).
