This is a compound question about how changing thread pool sizes at run-time affects the Spring Batch run-time system.
To start, a terminology clarification: concurrency = number of running steps, and parallelism = number of threads per step.
To give a clear picture of how I am using Spring Batch for my processing: currently a large number of files (200) are being generated, and I am using Spring Batch to transfer them, where each step maps to one file. Everything about the job is dynamic: the number of steps varies, and each step's reader and writer are distinct to that step, so no step shares readers or writers. There is a thread pool dedicated to running the steps concurrently, and each step has its own thread pool so we can do parallelism per step. Combined with the commit interval, this gives great throughput and control.
So my questions are:
- How can I change the number of running steps after the Job has started?
- How can I change the commit interval after a step has started processing?
So let's consider an example of why I would like to do this and what exactly I mean by changing "running steps" and "commit interval".
Consider the case where you have a total of 300 steps to process with a step thread pool size of 5. I begin processing and realize that I have more resources to utilize, so I would like to change the thread count to, say, 8. When I actually do this at run-time, the thread pool does increase, but the number of running steps does not change. Why is that?
Following similar logic, say I have more memory to utilize; I would then like to increase my commit interval at run-time. Surprisingly, I have not found anything in the StepExecution class that would let me change the commit interval. Why not?
What is interesting is that for parallelism I am able to change the number of running threads by simply increasing that thread pool's size. Just by changing the number of parallel threads, I noticed a massive increase in throughput.
If you would like more information, I can provide code and a link to the repository.
Thank you very much.
CodePudding user response:
While it is possible to make the commit interval and thread pool size configurable and set them at startup time, it is not possible to change them at runtime (i.e. "in-flight") once the job execution has started.
Making the commit interval and thread pool size configurable (via application/system properties or passing them as job parameters) will allow you to empirically adapt the values to best utilize your resources without having to recompile/repackage your application.
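As a rough illustration of the "configurable at startup" part, here is a minimal plain-Java sketch that reads both values from JVM system properties with fallbacks. The class name and the property names batch.commitInterval and batch.stepPoolSize are hypothetical, not Spring Batch settings; in a real job you would hand the resolved values to your step and task executor configuration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BatchTuning {
    // Hypothetical property names, chosen only for this sketch.
    static final String COMMIT_INTERVAL_KEY = "batch.commitInterval";
    static final String STEP_POOL_SIZE_KEY = "batch.stepPoolSize";

    // Integer.getInteger reads a JVM system property and returns the
    // default when the property is missing or not a valid integer.
    static int commitInterval() {
        return Integer.getInteger(COMMIT_INTERVAL_KEY, 100);
    }

    static int stepPoolSize() {
        return Integer.getInteger(STEP_POOL_SIZE_KEY, 5);
    }

    public static void main(String[] args) {
        // The pool size is fixed here, at startup, from the resolved value.
        ExecutorService stepPool = Executors.newFixedThreadPool(stepPoolSize());
        System.out.println("commitInterval=" + commitInterval()
                + ", stepPoolSize=" + stepPoolSize());
        stepPool.shutdown();
    }
}
```

Started with, for example, -Dbatch.commitInterval=250 -Dbatch.stepPoolSize=8, the same build picks up the new values without being recompiled.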
The runtime dynamism you are looking for is not available by default, but you can always implement the Step interface yourself and use it as part of a Spring Batch job alongside the other step types the framework provides out of the box.
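To make the idea more concrete, below is a minimal plain-Java sketch of the kind of loop such a custom step could run internally: it re-reads an adjustable commit interval at every chunk boundary. This is not Spring Batch code. AdjustableChunkLoop and its methods are hypothetical names, the Iterator and Consumer stand in for a reader and writer, and transactions, restart metadata and error handling are all left out. A real implementation would live inside the execute(StepExecution) method of your own Step and would have to take care of those concerns itself.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical helper, not part of Spring Batch.
class AdjustableChunkLoop<T> {
    // AtomicInteger so another thread (e.g. a JMX or REST endpoint) can update it safely.
    private final AtomicInteger commitInterval;

    AdjustableChunkLoop(int initialCommitInterval) {
        this.commitInterval = new AtomicInteger(validate(initialCommitInterval));
    }

    // Takes effect from the next chunk; the chunk in progress keeps its size.
    void setCommitInterval(int newInterval) {
        commitInterval.set(validate(newInterval));
    }

    private static int validate(int value) {
        if (value < 1) {
            throw new IllegalArgumentException("commit interval must be >= 1");
        }
        return value;
    }

    // Reads items and hands them to the writer chunk by chunk.
    // Returns the number of chunks written.
    int run(Iterator<T> reader, Consumer<List<T>> writer) {
        int chunks = 0;
        while (reader.hasNext()) {
            int size = commitInterval.get(); // re-read at every chunk boundary
            List<T> chunk = new ArrayList<>(size);
            while (chunk.size() < size && reader.hasNext()) {
                chunk.add(reader.next());
            }
            writer.accept(chunk); // a real step would commit its transaction here
            chunks++;
        }
        return chunks;
    }
}
```

The design choice here is to apply a new interval only between chunks, so a chunk that has already started is never resized mid-flight.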