I am trying to read the snowflake stream data using aws lambda (snowflake connector library) and writing the data into RDS SQL server. After the lambda run, my stream data is not getting deleted.
I don't want to read the data from stream and insert it into temporary snowflake table and again read to insert the data in the SQL server. Is there any better way to do this?
Lambda code:
for table in table_list:
sql5 = f"""SELECT "header__stream_position","header__timestamp" FROM STREAM_{table} where "header__operation" in ('UPDATE' ,'INSERT' ,'DELETE') ;"""
result =cs.execute(sql5).fetchall()
rds_columns = [(c[0],c[1],table[:-4]) for c in result]
if rds_columns:
cursor.fast_executemany = True
sql6 = f"INSERT INTO {RDS_TABLE}(LSNNUMBER,TRANSACTIONTIME,TABLENAME) VALUES (?, ?, ?);"
data = (rds_columns)
cursor.executemany(sql6,data)
table_write.append(table)
conn.commit()
ctx.commit()
CodePudding user response:
Snowflake Streams requires a successful committed DML operation to advance the Stream so you can't avoid an intermediate Snowflake table (transient or otherwise) with Streams.
You could use Changes to get the same change information if you can manage the time/query offset within your application code.
CodePudding user response:
The offset on a Stream will only advance if it is consumed by a DML statement. (INSERT,UPDATE,MERGE). There is a read-only version of streams called CHANGES. However, you must keep track of the offsets yourself.
https://docs.snowflake.com/en/sql-reference/constructs/changes.html