CachedThreadPool vs FixedThreadPool for a huge number of tasks


I would like to know which one I should use in this particular scenario: there are many tasks to process, usually around 400k. Most of the tasks take less than 4 seconds to process, but some of them (300 to 500 tasks) take a long time, usually between 10 and 30 minutes.

Currently, we have a FixedThreadPool of size 200. I am wondering if we can do better with a CachedThreadPool. I also want to know what the impact on the server will be, as only one server is dedicated to this work.

All tasks perform just calculations; there are no I/O operations.
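A minimal sketch of the current setup, assuming the standard java.util.concurrent executors (the task source below is a placeholder, not our real code):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BatchRunner {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(200); // current fixed size

        // ~400,000 calculation-only tasks; a few hundred run for 10-30 minutes
        for (Runnable task : loadTasks()) {
            pool.submit(task);
        }

        pool.shutdown();                          // no more submissions
        pool.awaitTermination(1, TimeUnit.DAYS);  // wait for the backlog to drain
    }

    // Placeholder for however the real tasks are produced
    private static List<Runnable> loadTasks() {
        return List.of();
    }
}
```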

CodePudding user response:

The thread pool type, in your case, does not impact performance, because the cost of thread management is very small compared to the cost of each task (from 4 seconds to 30 minutes).

The number of threads running in parallel matters more. If each task performs no I/O, the correct number of parallel threads is probably the number of CPU cores on your hardware. If the tasks did involve network or disk I/O, it would be more difficult to determine the level of parallelism that maximizes performance.
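A rough sketch of that suggestion (the class and method names here are illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Pools {
    // For pure-CPU tasks, one worker thread per core is a sensible starting point
    public static ExecutorService cpuSizedPool() {
        int cores = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(cores);
    }
}
```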

CodePudding user response:

Starting point

Here's what stands out in your question:

  • There are ~400,000 tasks to process
  • Most tasks (~399,500 or ~99.875%) take 4 seconds or less to complete
  • Some tasks (~500 or ~0.125%) usually take 10-30 minutes to complete
  • Tasks perform "no I/O operations"
  • Current approach uses a FixedThreadPool with size 200

Overview

Given that the tasks perform "no I/O operations", this implies:

  • no disk I/O
  • no network I/O

The tasks are therefore bound (limited) by either CPU or memory. A first step would be to understand which of the two is the limiter.

Nothing in your problem statement suggests that the choice of thread pool is a factor.

Limited by CPU

If the work is CPU-bound, in general nothing will be improved by increasing the thread pool size beyond the number of available CPU cores. So if you have 32 CPU cores available, a thread pool with a larger number of active threads (for example, 100) would incur overhead (run slower) due to context switching. Things won't "go faster" with more threads if the underlying contended resource is CPU.

With a CPU-bound problem, I would first set the thread pool no higher than the total number of CPU cores on the machine (and probably lower). So if your machine has 32 cores, try a thread pool of 16 or maybe 20 to start. Then process real-world data, make observations about performance, and possibly make further changes based on those test runs. Besides your own program, there are always other things running on any computer system, so it isn't a given that 16 (for example) is "low enough" – it depends on what else is running. That's the importance of a test run: set the size to 16, look for signs of CPU contention, and reduce it below 16 if needed; or, if there is plenty of idle CPU at 16, it may be safe to increase it.
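One way to "look for signs of CPU contention" during a test run is to log the OS load average alongside the pool size being tried. A sketch using the standard OperatingSystemMXBean (the class name and 30-second interval are my own choices):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class LoadMonitor {
    // Logs the system load average during a test run so different pool sizes
    // (16, 20, ...) can be compared against real workloads.
    public static ScheduledExecutorService start() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        int cores = os.getAvailableProcessors();
        ScheduledExecutorService monitor = Executors.newSingleThreadScheduledExecutor();
        monitor.scheduleAtFixedRate(() -> {
            double load = os.getSystemLoadAverage(); // -1.0 if the platform can't report it
            if (load >= 0) {
                System.out.printf("load average %.2f on %d cores%n", load, cores);
            }
        }, 0, 30, TimeUnit.SECONDS);
        return monitor;
    }
}
```

A load average persistently above the core count with the pool at 16 would be a sign to reduce the size; a load well below it suggests room to grow.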

Limited by memory

If the work is memory-bound, the thread pool size isn't as directly tied to the contended resource (as it is with CPU cores). It might take additional investigation to decide whether and how to tune the system to avoid memory contention.

As with CPU-bound problems, you should be able to start with a fixed size (something smaller than 200) and make observations using real-world data sets. Some pattern should emerge, perhaps (for example) that the ~500 or so 10-30 minute tasks use far more memory than all the other tasks.
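A low-effort way to gather that kind of observation is to wrap each task and log heap usage around it. This is only a sketch (class and method names are mine), and with many threads running at once the per-task numbers are approximate, but tasks that allocate far more than the typical 4-second task still tend to stand out:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapLogger {
    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    // Wraps a task and logs heap usage before and after it runs.
    public static Runnable logged(String taskId, Runnable task) {
        return () -> {
            long before = MEMORY.getHeapMemoryUsage().getUsed();
            task.run();
            MemoryUsage after = MEMORY.getHeapMemoryUsage();
            System.out.printf("%s: heap %d -> %d bytes (max %d)%n",
                    taskId, before, after.getUsed(), after.getMax());
        };
    }
}
```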
