Home > Net >  Running pipelines on a dynamically growing table until all rows are processed
Running pipelines on a dynamically growing table until all rows are processed

Time:08-26

I have a table that I am processing its rows through multiple pipelines. Each of those pipelines will mark a processed column to true for some of the rows that it processes in its current run BUT it can also add MORE rows to the table so each run of pipeline could mark for example three rows as "processed" but also add some more rows to the table. I want these pipelines keep running until all rows in the table are marked as "processed"

How can I design that part? With a UNTIL activity that queries the table to see if there are any on processed rows left?

CodePudding user response:

You can compare the total row count of the table and the count of rows where processed is true. Until they are equal, you can run the necessary pipeline activities inside the Until activity. The following is the demonstration of the same.

  • I have a process pipeline which processes the rows (converts gname column to Upper case), updates processed rows to true, inserts new rows and puts processed value as false (All in one pipeline for demonstration).
#initial table data
id,gname,processed
1,'Ana','false'
2,'Ceb','false'
3,'bzm','false'

#data that will be inserted (processed will be added while inserting into table)
4,'jerax'
5,'Sumail'
  • Now after executing pipeline inside Until activity, perform 2 Script activities where, one script activity will be used to get table row count and the other for count of rows where processed is true
#script for getRowCount
select count(*) as row_count from demo

#script for getTrueCount
select count(*) as true_count from demo where processed='true'

enter image description here

  • Now, for until condition, specify the following dynamic content:
@equals(activity('getRowCount').output.resultSets[0].rows[0].row_count,activity('getTrueCount').output.resultSets[0].rows[0].true_count)
  • When you run the pipeline, it runs until all the processed columns become true (row_count == true_count).

  • Table data after 1st iteration (until condition will be false, @equals(5,3)):

id,gname,processed
1,'ANA','true'
2,'CEB','true'
3,'BZM','true'
4,'jerax','false'
5,'Sumail','false'
  • Table data after 2nd iteration (until condition will become true, @equals(5,5) and run will be completed):
id,gname,processed
1,'ANA','true'
2,'CEB','true'
3,'BZM','true'
4,'JERAX','true'
5,'SUMAIL','true'

enter image description here

So, you can achieve your requirement by comparing the total row count and count of processed column where its value is true (which is achieved using 2 script activities as specified above).

  • Related