So I have been working on the source code of a complex application (written by hundreds of programmers) for a while now. Among other things, I have created some time-checking functions, along with suitable data structures, to measure the execution periods of different segments of the main loop and to run some analysis on those measurements.
Here is some pseudocode that helps explain it:
main()
{
    TimeSlicingSystem::AddTimeSlice(0);
    FunctionA();
    TimeSlicingSystem::AddTimeSlice(3);
    FunctionB();
    TimeSlicingSystem::AddTimeSlice(6);
    PrintTimeSlicingValues();
}

void FunctionA()
{
    TimeSlicingSystem::AddTimeSlice(1);
    //...
    TimeSlicingSystem::AddTimeSlice(2);
}

void FunctionB()
{
    TimeSlicingSystem::AddTimeSlice(4);
    //...
    TimeSlicingSystem::AddTimeSlice(5);
}

void PrintTimeSlicingValues()
{
    //Prints the difference between each slice and the slice before it,
    //starting from slice number 1.
}
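For illustration, the slicing system itself boils down to storing one timestamp per slice and printing consecutive differences. A simplified sketch, not the actual code (which has more bookkeeping); Seconds() is the timer function shown further below:

#include <array>
#include <cstdio>

double Seconds(); // high-resolution timestamp in seconds, defined further below

namespace TimeSlicingSystem
{
    // one timestamp per slice id; slices 0..6 in the pseudocode above
    static std::array<double, 7> GSliceTimes{};

    void AddTimeSlice(int Index)
    {
        GSliceTimes[Index] = Seconds();
    }

    void PrintTimeSlicingValues()
    {
        // the difference between each slice and the slice before it
        for (int i = 1; i < (int)GSliceTimes.size(); ++i)
            std::printf("slice %d -> slice %d: %.3f ms\n", i - 1, i,
                        (GSliceTimes[i] - GSliceTimes[i - 1]) * 1000.0);
    }
}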
Most measurements were very reasonable; for instance, assigning a value to a local variable costs a fraction of a microsecond. Most functions execute from start to finish in a few microseconds, and rarely take as long as one millisecond.
I then ran a few tests for half an hour or so, and I found some strange results that I couldn't quite understand. For certain functions, the time measured from the moment of the call (the last line in the 'calling' code) to the first line inside the 'called' function was very long, up to 30 milliseconds. That's happening in a loop that otherwise completes a full iteration in less than 8 milliseconds.
To picture that in the pseudocode above: I mean the period between slice number 0 and slice number 1, or between slice number 3 and slice number 4. That is the sort of period I am referring to: the measured time between calling a function and running the first line inside the called function.
Question A. Could this behavior be due to thread or process switching by the OS? Is calling a function a uniquely vulnerable spot for that? The OS I am working on is Windows 10.
Interestingly enough, the opposite direction never showed this problem at all: the time from the last line in a called function back to the first line after the call in the 'calling' code (the periods from slice number 2 to 3, or from 5 to 6, in the pseudocode) was always less than 5 microseconds.
Question B. Could this be, in any way, due to the time measurement method I am using? Could switching between different cores give the illusion of slower-than-actual context switches due to clock differences between cores? (Although I have never found a single negative delta time so far, which seems to refute this hypothesis altogether.) Again, the OS I am working on is Windows 10.
My time measuring function looks like this:
FORCEINLINE double Seconds()
{
    Windows::LARGE_INTEGER Cycles;
    Windows::QueryPerformanceCounter(&Cycles);
    // add big number to make bugs apparent where return value is being passed to float
    return Cycles.QuadPart * GetSecondsPerCycle() + 16777216.0;
}
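Here GetSecondsPerCycle() is, roughly, the cached reciprocal of the QueryPerformanceFrequency() value (a sketch, not the exact code):

FORCEINLINE double GetSecondsPerCycle()
{
    // cached once on first use: QPC advances Frequency ticks per second
    static const double SecondsPerCycle = []
    {
        Windows::LARGE_INTEGER Frequency;
        Windows::QueryPerformanceFrequency(&Frequency);
        return 1.0 / (double)Frequency.QuadPart;
    }();
    return SecondsPerCycle;
}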
Answer:
Question A. Could this behavior be due to thread or process switching by the OS?
Yes. Thread switches can happen at any time (e.g. when a device sends an IRQ that causes a different, higher-priority thread to unblock and preempt your thread immediately), and this can and will cause unexpected time delays in your thread.
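One way to see how much of the delay is preemption by other threads is to temporarily raise your thread's priority so that fewer threads are allowed to preempt it, and check whether the 30 ms gaps shrink. A minimal sketch (plain Win32, for an experiment only, not production code):

#include <windows.h>

void RaisePriorityForProfiling()
{
    // fewer ready threads can preempt a time-critical thread;
    // kernel interrupts and kernel-mode work still can
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
}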
Is calling a function a uniquely vulnerable spot for that?
There's nothing particularly special about calling your own functions that makes them uniquely vulnerable. If the function involves the kernel's API, a thread switch is more likely, and some things (e.g. calling sleep()) are almost guaranteed to cause a thread switch.
Also, there's a potential interaction with virtual memory management: often things (e.g. your executable file, your code, your data) use "memory mapped files", where accessing them for the first time causes the OS to fetch the code or data from disk (and your thread is blocked until the code or data it wanted arrives from disk); rarely used code or data can also be sent to swap space and need to be fetched again later.
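To rule paging out, you can pre-touch and lock the relevant pages before the timed region, so that touching them inside the loop can never block on disk. A minimal sketch using VirtualLock() (buffer and size here stand in for whatever data the slow call path touches):

#include <windows.h>

bool PinHotData(void* buffer, size_t size)
{
    // locks the pages into physical memory; fails if the working
    // set quota is too small (see SetProcessWorkingSetSize)
    return VirtualLock(buffer, size) != 0;
}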
Question B. Could this be, in any way, due to the time measurement method I am using?
In practice it's likely that Windows' QueryPerformanceCounter() is implemented with an RDTSC instruction (assuming an 80x86 CPU) and doesn't involve the kernel at all, and on modern hardware it's likely that this is monotonic. In theory Windows could emulate RDTSC and/or implement QueryPerformanceCounter() in another way to guard against security problems (timing side channels), as Intel has recommended for about 30 years now, but this is unlikely (modern operating systems, including but not limited to Windows, tend to care more about performance than security). Also in theory, your hardware/CPU could be so old (about 10 years old) that Windows has to implement QueryPerformanceCounter() in a different way, or you could be using some other CPU (e.g. ARM and not 80x86).
In other words: it's unlikely (but not impossible) that the time measurement method you're using is causing any timing problems.
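If you want to test the "different cores, different clocks" hypothesis directly anyway, pin the measuring thread to a single core and see whether the 30 ms gaps persist; if they do, the clock source is not the cause. A minimal sketch:

#include <windows.h>

void PinToSingleCore()
{
    // restrict the current thread to CPU 0 so every
    // QueryPerformanceCounter() reading is taken on the same core
    SetThreadAffinityMask(GetCurrentThread(), 1);
}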