I am trying to process multiple CSV files using a MultiResourceItemReader that delegates each file to a FlatFileItemReader. When I run the step with a task executor, I get two issues:
- ReaderNotOpenException. This occurs because multiple files are being processed in a multithreaded environment, so one thread may try to read from a CSV file that another thread has already closed. To solve this, I am thinking of using SynchronizedItemStreamReader as below:
MultiResourceItemReader<activeForecast> resourceItemReader = new MultiResourceItemReader<>();
resourceItemReader.setResources(DataSyncUtils.getFileName(activeForecastFilePath));
resourceItemReader.setDelegate(activeForecastReader());
SynchronizedItemStreamReader<activeForecast> synchronizedItemStreamReader = new SynchronizedItemStreamReader<>();
synchronizedItemStreamReader.setDelegate(resourceItemReader);
return synchronizedItemStreamReader;
- FlatFileItemReader gives inconsistent results when used with a task executor. To overcome this, I am planning to use SynchronizedItemStreamReader, as suggested here: Can I use FlatfileItemReader with Taskexecutor?
My question: do I need to use SynchronizedItemStreamReader with both FlatFileItemReader and MultiResourceItemReader? If yes, why and how?
@Bean("activeForecastitemReader")
@StepScope
public MultiResourceItemReader<activeForecast> activeForecastItemReader() {
    MultiResourceItemReader<activeForecast> resourceItemReader = new MultiResourceItemReader<>();
    resourceItemReader.setResources(DataSyncUtils.getFileName(activeForecastFilePath));
    resourceItemReader.setDelegate(activeForecastReader());
    return resourceItemReader;
}
public FlatFileItemReader<activeForecast> activeForecastReader() {
    FlatFileItemReader<activeForecast> flatFileItemReader = new FlatFileItemReader<>();
    flatFileItemReader.setName("activeForecast-Reader");
    flatFileItemReader.setLineMapper(activeForecastLineMapper());
    flatFileItemReader.setLinesToSkip(1);
    flatFileItemReader.setSaveState(false);
    return flatFileItemReader;
}
CodePudding user response:
The MultiResourceItemReader reads files in sequence, so it does not make sense to me to use it in a multi-threaded step.
If you want to go parallel with multiple threads, I believe a partitioned step is a better option. In this case, each worker thread processes a distinct file.
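To illustrate the "one file per worker thread" idea, here is a minimal, library-free sketch in plain Java. It is not Spring Batch code: the class OneFilePerWorker and the methods countDataLines and processInParallel are hypothetical names made up for this example. In Spring Batch, the roughly corresponding pieces would be a MultiResourcePartitioner plus a step-scoped FlatFileItemReader for each partition, which are not shown here. The point is only that no reader is shared between threads, so nothing needs to be synchronized.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not part of Spring Batch.
public class OneFilePerWorker {

    // Reads a single CSV file on the calling thread, skips the header line
    // (similar in spirit to setLinesToSkip(1)), and returns the number of data lines.
    static int countDataLines(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file);
        return Math.max(0, lines.size() - 1);
    }

    // Submits one task per file. Each task opens, reads, and closes its own file,
    // so no thread can see a reader that another thread has closed.
    static Map<String, Integer> processInParallel(List<Path> files, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Integer>> pending = new LinkedHashMap<>();
            for (Path file : files) {
                Callable<Integer> task = () -> countDataLines(file);
                pending.put(file.getFileName().toString(), pool.submit(task));
            }
            // Collect results in submission order; get() blocks until each task is done.
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Integer>> entry : pending.entrySet()) {
                counts.put(entry.getKey(), entry.getValue().get());
            }
            return counts;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Two small sample files, created only for this demonstration.
        Path dir = Files.createTempDirectory("forecast");
        Path a = Files.write(dir.resolve("a.csv"), List.of("id,value", "1,10", "2,20"));
        Path b = Files.write(dir.resolve("b.csv"), List.of("id,value", "3,30"));
        System.out.println(processInParallel(List.of(a, b), 2));
    }
}
```

With this layout, parallelism comes from having several files in flight at once, not from several threads pulling lines out of the same file, which is the situation that led to the ReaderNotOpenException and the inconsistent results in the question.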