Home > Software design >  Merge output of multiple flink jobs and return single output
Merge output of multiple flink jobs and return single output

Time:09-28

I have multiple flink jobs which has the same source of input kafka topic and the output format is also same.

Source -> flink job 1 -> output
Source -> flink job 2 -> output
Source -> flink job 3 -> output
Source -> flink job 4 -> output
.
.
.
Source -> flink job n -> output

output format is like Object(pk: String, variable1: String, variable2: Boolean)

I want to consume all the output and make the combined output let's say json of output array

Final required output (pk: String, variable1: List[String], variable2: List[Boolean])

P.S. - Some flink jobs might not return output for input as per implemented flink jobs logic and I am using scala as a language

CodePudding user response:

I managed to solve this by creating one more flink job which act as master job. Input to this job is output of other N jobs. Since, those jobs were having filter(condition) I added one more datastream with filter(!condition) to make sure every job returns the output. Also, added one datastream in master job which maintains the total job count and connected it with master job datastream. Representation of the same is in the following diagram. Flow of solution

  • Related