I have a C# console program written with .NET 6. I first wrote the code so that it ran sequentially in one thread, and it works. If I look at the task manager, one of my 16 logical processors is very busy and the rest are doing basically nothing, which is exactly what I expect. The subroutine (ok, it's really a static void method) that does the real work runs thousands of times. I changed the code so that in a loop it queues up the tasks and adds each one to a list:
Task t = Task.Run(() => SeekAnswer(param));
tasks.Add(t);
After the loop of queuing is done, we wait until all the parallel threads are done:
Task.WaitAll(tasks.ToArray());
The code works and gives the same answers as the sequential code. Yay. But it was only about 30% faster than the sequential code. I was expecting something like a factor of 5 or even 10 speedup. If I look at the task manager, I see that each logical processor is loafing along at around 20%, and there's no longer one logical processor that's 100% busy. Because the analysis task, SeekAnswer, is purely CPU bound, I expected each logical processor to be at 100% and the machine overall to start feeling sluggish. SeekAnswer does no I/O, no disk activity, no networking, no yields or awaits or anything like that. SeekAnswer takes about 500 ms to run, and when it's done, it adds its results to a list.
One guess is that because it's a console app, the console part is stealing cycles where it's waiting to see if there's mouse movement or keyboard inputs or something. If I go into the task manager and set its priority to Real Time, maybe each processor goes from 20% to 23%. Another guess is that the C# thread manager/CLR can't fully load down Intel hyperthreading logical processors.
My question is what is preventing me from more fully utilizing the processors?
CodePudding user response:
Your current solution splits the work into N tasks. In general, you should prefer higher-level abstractions. Parallel or PLINQ can probably partition your work into tasks more effectively than your current "one call per task" solution.
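For example, Parallel.For partitions an index range into chunks rather than paying per-call task overhead. This is only a sketch: the real SeekAnswer's signature isn't shown in the question, so the version below is a hypothetical stand-in that takes and returns an int.

```csharp
using System;
using System.Threading.Tasks;

class Demo
{
    // Stand-in for the real SeekAnswer; the parameter and return
    // types here are assumptions for the sake of a runnable sketch.
    static int SeekAnswer(int param) => param * param;

    public static int[] Compute()
    {
        var results = new int[1000];

        // Parallel.For partitions the index range into chunks and runs
        // the chunks on worker threads, instead of creating one Task
        // object (and its scheduling overhead) per call.
        Parallel.For(0, results.Length, i =>
        {
            // Each index is written by exactly one iteration, so no
            // locking is needed on the results array.
            results[i] = SeekAnswer(i);
        });
        return results;
    }

    static void Main() => Console.WriteLine(Compute()[999]);
}
```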
when it's done, it adds its results to a list.
Remove all that coordination. Appending to a shared list from multiple threads forces synchronization, which adds contention. If you're collecting results, just return each result from SeekAnswer and use something like PLINQ to collect them.
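A minimal PLINQ sketch of that shape, again assuming a hypothetical SeekAnswer that returns its result instead of appending to a shared list:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Demo
{
    // Stand-in for the real SeekAnswer, rewritten to *return* its
    // result instead of appending to a shared list (names assumed).
    static int SeekAnswer(int param) => param * param;

    public static List<int> Compute()
    {
        // PLINQ parallelizes the Select across cores and gathers the
        // results itself; there is no shared list and no lock.
        return Enumerable.Range(0, 1000)
            .AsParallel()
            .Select(p => SeekAnswer(p))
            .ToList();
    }

    static void Main() => Console.WriteLine(Compute().Count);
}
```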
One guess is that because it's a console app, the console part is stealing cycles where it's waiting to see if there's mouse movement or keyboard inputs or something.
No, it won't do that. The console listens for signals; there's no polling going on. And even if it were polling, that would show up as CPU usage.
If I go into the task manager and set its priority to Real Time, maybe each processor goes from 20% to 23%.
CPU-bound tasks - rather counterintuitively - should run at a lower priority, not a higher one. Boosting a busy process to Real Time risks starving the system's own threads; the NT scheduling system that underlies all modern Windows OSes will automatically adjust thread priorities for you.
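If you want to set this from code rather than Task Manager, a sketch using System.Diagnostics might look like this (BelowNormal is one reasonable choice for a CPU-bound batch job, not a recommendation for every workload):

```csharp
using System;
using System.Diagnostics;

class Demo
{
    static void Main()
    {
        // Lower - not raise - the priority of a CPU-bound batch job.
        // The scheduler still hands it every idle cycle, but
        // interactive programs stay responsive.
        using var self = Process.GetCurrentProcess();
        self.PriorityClass = ProcessPriorityClass.BelowNormal;
        Console.WriteLine(self.PriorityClass);
    }
}
```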
Another guess is that the C# thread manager/CLR can't fully load down intel hyperthreading logical processors.
Not a limitation of the CLR.
It's important to understand how hyperthreading actually works: each physical core has only one set of execution units that can run instructions. Hyperthreaded processors work by duplicating the architectural state and instruction pipelines (the "logical processors"), and that's it. Each core still executes one instruction stream at a time. The benefit of hyperthreading comes when an instruction has to do something like a memory read that would normally stall the core; in that case the core can switch to the other logical processor's instruction pipeline and keep working instead.
TL;DR: If your work doesn't ever stall (i.e., it fits into the CPU cache), hyperthreading can't provide any speedups. This is true regardless of language/runtime/OS.
It's possible you might be waiting on thread pool thread injection: once the pool's minimum is exhausted, it adds new threads only gradually. For a console application, the minimum worker thread count defaults to the processor count, but some of those threads are used for housekeeping tasks. You can experiment with setting the minimum worker thread count higher.
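A sketch of that experiment using ThreadPool.SetMinThreads; the target of twice the processor count is just a value to try, not a recommendation:

```csharp
using System;
using System.Threading;

class Demo
{
    static void Main()
    {
        ThreadPool.GetMinThreads(out int workers, out int ioThreads);
        Console.WriteLine($"default minimum worker threads: {workers}");

        // Raise the minimum worker count so the pool spins threads up
        // immediately instead of injecting them slowly under load.
        // Twice the processor count is just a value to experiment with.
        ThreadPool.SetMinThreads(Environment.ProcessorCount * 2, ioThreads);

        ThreadPool.GetMinThreads(out workers, out _);
        Console.WriteLine($"new minimum worker threads: {workers}");
    }
}
```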