I have multiple Flink jobs that all read from the same Kafka topic and produce output in the same format.
Source -> flink job 1 -> output
Source -> flink job 2 -> output
Source -> flink job 3 -> output
Source -> flink job 4 -> output
.
.
.
Source -> flink job n -> output
The output format is like Object(pk: String, variable1: String, variable2: Boolean)
I want to consume all of these outputs and combine them into one output per pk, for example a JSON object holding arrays of the individual values.
Final required output: (pk: String, variable1: List[String], variable2: List[Boolean])
P.S. Some Flink jobs might not return an output for a given input, depending on the logic they implement. I am using Scala.
CodePudding user response:
I managed to solve this by creating one more Flink job that acts as a master job.
The input to this job is the output of the other N jobs. Since those jobs each had a filter(condition),
I added one more datastream with filter(!condition) in those jobs
to make sure every job returns an output for every input.
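The two complementary filters can be sketched roughly as below. This is only an illustration, not the actual job code: plain Scala collections stand in for Flink datastreams, and `Input`, `JobOutput`, `condition` and the empty placeholder (both values set to `None`) are assumed names and shapes that do not appear in the original jobs.

```scala
object PlaceholderSketch {
  // Hypothetical types; only the (pk, variable1, variable2) shape comes from the question.
  final case class Input(pk: String, payload: String)
  final case class JobOutput(pk: String, variable1: Option[String], variable2: Option[Boolean])

  // Stand-in for one job's filter(condition); the real predicate is job-specific.
  def condition(in: Input): Boolean = in.payload.nonEmpty

  // Branch 1: filter(condition) followed by the job's real logic (made up here).
  def matched(inputs: Seq[Input]): Seq[JobOutput] =
    inputs.filter(condition).map(in => JobOutput(in.pk, Some(in.payload.toUpperCase), Some(true)))

  // Branch 2: filter(!condition) emits an empty placeholder for the same pk.
  def unmatched(inputs: Seq[Input]): Seq[JobOutput] =
    inputs.filter(in => !condition(in)).map(in => JobOutput(in.pk, None, None))

  // Union of both branches: exactly one output per input record.
  def run(inputs: Seq[Input]): Seq[JobOutput] = matched(inputs) ++ unmatched(inputs)
}
```

Because every input falls into exactly one of the two branches, the downstream job can rely on receiving one record per pk from each job.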
I also added a datastream in the master job that maintains the total job count and connected
it with the master job's datastream. The following diagram shows the flow.
[Diagram: Flow of solution]
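The combining step in the master job could look something like the sketch below. It is a simplified model under a few assumptions: `Combiner`, `JobOutput` and `Combined` are made-up names, the total job count is passed in as a constructor argument instead of arriving on a connected stream, and a local map stands in for what would be keyed state in a real Flink job (for example `ListState` inside a `KeyedProcessFunction`). Placeholder outputs are assumed to carry `None` for both values.

```scala
import scala.collection.mutable

object MasterJobSketch {
  // Hypothetical types mirroring the shapes described in the question.
  final case class JobOutput(pk: String, variable1: Option[String], variable2: Option[Boolean])
  final case class Combined(pk: String, variable1: List[String], variable2: List[Boolean])

  // Buffers outputs per pk and emits a combined record once all jobs have reported.
  final class Combiner(totalJobs: Int) {
    private val buffer = mutable.Map.empty[String, List[JobOutput]]

    def add(out: JobOutput): Option[Combined] = {
      val seen = out :: buffer.getOrElse(out.pk, Nil)
      if (seen.size < totalJobs) {
        // Still waiting for other jobs: keep buffering.
        buffer.update(out.pk, seen)
        None
      } else {
        // Every job has reported for this pk: clear the buffer and emit.
        buffer.remove(out.pk)
        val ordered = seen.reverse // restore arrival order
        // flatMap drops the empty placeholders and keeps only real values.
        Some(Combined(out.pk, ordered.flatMap(_.variable1), ordered.flatMap(_.variable2)))
      }
    }
  }
}
```

With three jobs, calling `add` returns `None` for the first two outputs of a given pk and the combined record on the third. The sketch assumes each job reports exactly once per pk; duplicates or jobs that never report would need extra handling such as a timer.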