Preventing Thrashing with Multithreaded/Multicore Processes


I'm running chains of programs, many of which like to make their own decisions about how many cores or threads to use, and I have some control over how data is partitioned.

I was hoping this would be a fire-and-forget situation... as in, the operating system would just put thread and process spawning on hold until enough resources freed up... but alas, a lot of competition for resources ensued instead.

  1. Are there any operating systems or OS settings (Linux in particular) that are equipped to deal with an explosion in processes/threads and avoid thrashing?

  2. Are there any guidelines on how to parallelize a workflow that is embarrassingly parallel across many steps and many levels? Are there any tools that help devise a strategy based on a scheduling paradigm?

CodePudding user response:

Are there any operating systems or OS settings (Linux in particular) that are equipped to deal with an explosion in processes/threads and avoid thrashing?

Threads and processes are OS resources, and like nearly all OS resources, they are expensive. This is especially true for processes, since a context switch from one process to another has a pretty big overhead (e.g. a TLB flush and possibly a direct or delayed cache flush) and different processes generally operate on different parts of memory.

Using many threads in one process is generally not much of a problem as long as they are not all ready to be scheduled at the same time. If they are, the scheduler needs to map them onto the available cores, and this scheduling is quite expensive; in fact, scheduling problems are generally NP-complete, so heuristics are used in practice. The scheduler needs to take many parameters into account, such as IO, locks/waits, locality, affinity, fairness, priorities, etc.

Additionally, each thread has its own stack (generally a few MiB), so the number of threads must not grow too large, or they will consume too much memory. Context switches from one thread to another also cause some cache issues, because each stack lives at a different location in memory and cached lines can be evicted quickly. Thrashing tends to happen more frequently when threads operate on different datasets than when they work on the same problem and benefit from shared memory, and the synchronization that sharing requires can be expensive too, so the granularity needs to be carefully tuned.
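For illustration, here is a minimal sketch of bounding concurrency to the core count with Python's standard library. The step function and its inputs are hypothetical stand-ins for real IO-bound work (CPU-bound Python work would use a process pool instead, because of the GIL):

    import os
    import time
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical stand-in for an IO-bound step (e.g. reading a file
    # or waiting on an external program); substitute your real work.
    def step(item):
        time.sleep(0.1)
        return item * 2

    items = range(32)

    # Cap the pool at the core count so only that many threads can be
    # runnable at once, instead of one thread per work item.
    with ThreadPoolExecutor(max_workers=os.cpu_count() or 1) as pool:
        results = list(pool.map(step, items))
    print(results[:5])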

Note that you can tune the schedulers on Linux (typically the IO scheduler), but while some schedulers may behave better than others for your target application, none are perfect. Application-level scheduling tends to be much more effective in practice.
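As one concrete application-level lever, a Linux process can restrict itself (and any children it spawns) to a subset of cores. A minimal sketch using os.sched_setaffinity, assuming that "half of the currently allowed cores" is an acceptable budget for this chain of programs:

    import os

    # Linux-only sketch: restrict this process (pid 0 = self) and its
    # future children to half of the currently allowed cores, so that
    # competing process chains do not fight over every CPU.
    if hasattr(os, "sched_setaffinity"):
        allowed = sorted(os.sched_getaffinity(0))
        half = set(allowed[: max(1, len(allowed) // 2)])
        os.sched_setaffinity(0, half)
        print("running on cores:", sorted(os.sched_getaffinity(0)))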

Are there any guidelines on how to parallelize a workflow that is embarrassingly parallel across many steps and many levels? Are there any tools that help devise a strategy based on a scheduling paradigm?

This is hard to answer without more information, but you can schedule the work yourself on a pool of worker threads (typically one per physical or logical core). You can use green threads (like fibers) or tasks for that. Task scheduling is good for many reasons: you can specify dependencies between tasks, switching from one task to another is usually cheaper than a fiber context switch, the stack can be reused across many tasks (and kept in the cache), and you can tune the scheduling of the tasks to your target application. That being said, task scheduling works well only if tasks do not wait for each other: work that waits needs to be split into multiple tasks (i.e. continuations). This is not always possible, nor simple (e.g. calls to external libraries). Fibers are better in that specific case (but they have some issues of their own).
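As a rough illustration of the task approach, here is a minimal sketch that runs a dependency graph of tasks on a fixed-size pool using only Python's standard library. The graph and the run_task body are hypothetical stand-ins for real pipeline steps, and the graph is assumed to be acyclic:

    from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

    # Hypothetical pipeline: each task lists the tasks it depends on.
    graph = {
        "load": set(),
        "clean": {"load"},
        "split_a": {"clean"},
        "split_b": {"clean"},
        "merge": {"split_a", "split_b"},
    }

    def run_task(name):
        # Stand-in for a real pipeline step.
        print("running", name)
        return name

    def run_graph(graph, max_workers=4):
        done, running = set(), {}
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            while len(done) < len(graph):  # assumes an acyclic graph
                # Submit every task whose dependencies are all satisfied.
                for name, deps in graph.items():
                    if name not in done and name not in running and deps <= done:
                        running[name] = pool.submit(run_task, name)
                # Block until at least one in-flight task finishes.
                finished, _ = wait(running.values(), return_when=FIRST_COMPLETED)
                for name, fut in list(running.items()):
                    if fut in finished:
                        fut.result()  # re-raise any exception from the task
                        done.add(name)
                        del running[name]

    run_graph(graph)

A production task runtime would add work stealing, priorities, and smarter ready-queue handling; this only shows the shape of the approach.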
