I just fund this code in the wild
def _scan_for_self(self):
win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
basename = self.cmdline.split()[0]
pids = win32process.EnumProcesses()
if not pids:
UserLog.warn("WindowsProcess", "no pids", pids)
for pid in pids:
try:
handle = win32api.OpenProcess(
win32con.PROCESS_QUERY_INFORMATION | win32con.PROCESS_VM_READ,
pywintypes.FALSE, pid)
except pywintypes.error, err:
UserLog.warn("WindowsProcess", str(err))
continue
try:
modlist = win32process.EnumProcessModules(handle)
except pywintypes.error,err:
UserLog.warn("WindowsProcess",str(err))
continue
the line
win32api.Sleep(2000) # sleep to give time for process to be seen in system table.
caught my eye, it suggest that if you call EnumProcesses() too fast after starting, you won't see yourself, is there any truth to this?
CodePudding user response:
The documentation for
EnumProcesses
(WIn32 API - EnumProcesses function), does not mention anything about a delay needed to see the current process in the list it returns.The example from Microsoft how to use
EnumProcess
to enumerate all running processes (Enumerating All Processes), also does not contain any delay before callingEnumProcesses
.A small test application I created in C (see below) always reports that the current process is in the list (tested on Windows 10):
#include <Windows.h>
#include <Psapi.h>
#include <iostream>
#include <vector>
const DWORD MAX_NUM_PROCESSES = 4096;
DWORD aProcesses[MAX_NUM_PROCESSES];
int main(void)
{
// Get the list of running process Ids:
DWORD cbNeeded;
if (!EnumProcesses(aProcesses, MAX_NUM_PROCESSES * sizeof(DWORD), &cbNeeded))
{
return 1;
}
// Check if current process is in the list:
DWORD curProcId = GetCurrentProcessId();
bool bFoundCurProcId{ false };
DWORD numProcesses = cbNeeded / sizeof(DWORD);
for (DWORD i=0; i<numProcesses; i)
{
if (aProcesses[i] == curProcId)
{
bFoundCurProcId = true;
}
}
std::cout << "bFoundCurProcId: " << bFoundCurProcId << std::endl;
return 0;
}
Note: I am aware that the fact that the program reported the expected result does not mean that there is no race. Maybe I just couldn't catch it manifest. But trying to run code like that can give you a hint sometimes (especially if the result would have been that there is a race).
The fact that I never had a problem running this test (did it many times), together with the lack of any mention of the need for a delay in Microsoft's documentation make me believe that it is not required.
My conclusion is that either:
- There is a unique issue when using it from python (doubt it).
or: - The code you found is doing something unnecessary.
CodePudding user response:
There is no race.
EnumProcesses
calls a NT API function that switches to kernel mode to walk the linked list of processes. Your own process has been added to the list before it starts running.
CodePudding user response:
There is a race, but it's not the race the code tried to protect against.
A successful call to CreateProcess
returns only after the kernel object representing the process has been created and enqueued into the kernel's process list. A subsequent call to EnumProcesses
accesses the same list, and will immediately observe the newly created process object.
That is, unless the process object has since been destroyed. This isn't entirely unusual since processes in Windows are initialized in-process. The documentation even makes note of that:
Note that the function returns before the process has finished initialization. If a required DLL cannot be located or fails to initialize, the process is terminated.
What this means is that if a call to EnumProcesses
immediately following a successful call to CreateProcess
doesn't observe the newly created process, it does so because it was late rather than early. If you are late already then adding a delay will only make you more late.
Which swiftly leads to the actual race here: Process IDs uniquely identify processes only for a finite time interval. Once a process object is gone, its ID is up for grabs, and the system will reuse it at some point. The only reliable way to identify a process is by holding a handle to it.
Now it's anyone's guess what the author of _scan_for_self
was trying to accomplish. As written, the code takes more time to do something that's probably altogether wrong anyway.