I want to read about 1,000,000 (1000k) records from one table and push them to some service.
Since the service accepts a limited number of records per call, I group 200 records into one event and use the executor framework with a pool of 10 threads, so 10 events (i.e. 10 × 200 records) are processed in parallel. The setup looks roughly like the sketch below.
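(A minimal sketch of the setup; `fetchRecords` and `pushBatch` are placeholders for my actual table read and service call:)

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BatchPusher {
    static final int BATCH_SIZE = 200; // the service accepts at most 200 records per call
    static final int THREADS = 10;

    public static void main(String[] args) throws InterruptedException {
        List<String> records = fetchRecords();   // placeholder for the actual table read
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);

        // Slice the full list into batches of 200 and submit each batch as one task.
        for (int i = 0; i < records.size(); i += BATCH_SIZE) {
            List<String> batch = records.subList(i, Math.min(i + BATCH_SIZE, records.size()));
            pool.submit(() -> pushBatch(batch)); // placeholder for the service call
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }

    static List<String> fetchRecords()        { return List.of(); } // stand-in
    static void pushBatch(List<String> batch) { }                   // stand-in
}
```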
Now I want to maintain the status of these events, like statistics on how many were processed successfully and how many failed.
So I was thinking of
Approach 1: Before starting the execution, write each event id / record id to a file with a status:
event1 record1 -> start
and on completion:
event1 record1 -> end
Later, check how many entries in the file have both start and end, and how many are missing end.
Approach 2: Write all record ids to one file with status pending, and write all successful records to another file.
Then find the ids missing from the successful file by using a pivot.
Is there a better way to maintain the status of the records?
CodePudding user response:
In my view, if you want to process items in parallel, it is better to create one log file per batch of records. Why? Because a single shared file is a bottleneck for multithreading: you have to lock it to prevent a race condition, and if you do, every thread has to wait for the log file to be released, and that waiting cancels out the benefit of processing in parallel.
So one batch should have one log file.
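A minimal sketch of that idea (the `batch-N.log` naming and the `id -> status` line format are my own choices, not anything the question specifies): each task appends start/end markers to its own file, so no two threads ever touch the same file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class BatchLogger {
    private final Path logFile;

    BatchLogger(int batchId) {
        // One file per batch: no sharing between threads, so no lock is needed.
        this.logFile = Path.of("batch-" + batchId + ".log");
    }

    void log(String recordId, String status) throws IOException {
        Files.writeString(logFile, recordId + " -> " + status + System.lineSeparator(),
                StandardCharsets.UTF_8, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Runs inside one worker task: only this thread ever writes to this file.
    static void processBatch(int batchId, List<String> recordIds) throws IOException {
        BatchLogger log = new BatchLogger(batchId);
        for (String id : recordIds) {
            log.log(id, "start");
            // pushRecord(id); // placeholder for the actual service call
            log.log(id, "end");
        }
    }
}
```

After all batches finish, a single pass over the `batch-*.log` files gives the processed/failed counts (any id with start but no end failed).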
CodePudding user response:
Create an array and start the threads with an id passed in, so each thread writes only to the array cell at its own index.
The main thread reads this array and prints it.
You can use a ReadWriteLock with the roles inverted: the worker threads hold the read (shared) lock while writing, which is safe because each thread touches a different cell, and the main thread holds the write (exclusive) lock while reading the entire array, so it sees a consistent snapshot.
You can store anything in this array; it can be very useful.
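A minimal sketch of that pattern (the 10-batch count matches the question; the status strings and the progress-printing loop are illustrative assumptions):

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class StatusBoard {
    public static void main(String[] args) throws InterruptedException {
        int batches = 10;
        String[] status = new String[batches];
        Arrays.fill(status, "pending");
        ReadWriteLock lock = new ReentrantReadWriteLock();

        ExecutorService pool = Executors.newFixedThreadPool(batches);
        for (int i = 0; i < batches; i++) {
            final int id = i;
            pool.submit(() -> {
                // Shared "read" lock for writers: each worker writes only its own
                // cell, so workers never conflict with each other, only with the reader.
                lock.readLock().lock();
                try {
                    status[id] = "done"; // placeholder: record success/failure of batch id
                } finally {
                    lock.readLock().unlock();
                }
            });
        }

        pool.shutdown();
        while (!pool.awaitTermination(1, TimeUnit.SECONDS)) {
            // Exclusive "write" lock while reading the whole array -> consistent snapshot.
            lock.writeLock().lock();
            try {
                System.out.println(Arrays.toString(status));
            } finally {
                lock.writeLock().unlock();
            }
        }
        System.out.println("final: " + Arrays.toString(status));
    }
}
```

The trade-off versus the log-file approach: all status lives in memory, so it is fast and contention-free for writers, but it is lost if the process crashes.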