For context, I'm writing an application in Python that needs to poll many hosts continuously, so I create a large number of sockets to communicate with those hosts. However, I can only create 511 sockets - when I try to create 512, I get a ValueError: too many file descriptors in select(). I thought this error referred to the maximum number of file descriptors that a process can have open at any given time, but when I try increasing that maximum with pywin32's win32file._setmaxstdio(), it has no effect - no matter what I set the limit to, I can only create 511 sockets. I even tried setting the limit to a value lower than 512 just to see if it would change anything, but I could still create 511 sockets! So as far as I can tell, the limits referenced by _setmaxstdio() and _getmaxstdio() are completely unrelated to the limit on how many sockets/file descriptors select() can handle.
I tried investigating Python's select module to see if I could find where select()'s maximum is defined, or how to increase it. Python's documentation for the select.select() function doesn't mention either of those things, but it does mention that select() comes from Windows' Winsock library. So I checked Microsoft's documentation of the select() function:
Four macros are defined in the header file Winsock2.h for manipulating and checking the descriptor sets. The variable FD_SETSIZE determines the maximum number of descriptors in a set. (The default value of FD_SETSIZE is 64, which can be modified by defining FD_SETSIZE to another value before including Winsock2.h.)
I read this to mean "select() can handle 64 sockets by default, but you can change that by altering the value of FD_SETSIZE before you include the header file". So I assume Python sets it to 512 before including the Winsock2 header file? Or is select()'s limit set some other way?
I just want to know where the select() function's limit is defined, how I can check it, and whether it can be increased from within Python, but I'm clearly missing something fundamental here. select() can handle some number of file descriptors, and _setmaxstdio() is used to "[set] a maximum for the number of simultaneously open files at the stream I/O level", but changing the limit with _setmaxstdio() doesn't affect the limit for select(). Why not? If select() isn't limited by the maximum number of file descriptors you're allowed to have, then what is it limited by?
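One way to check the limit empirically, rather than from documentation, is to keep adding sockets to a select() call until it fails. The following is only a sketch: probe_select_limit and its max_sockets cap are names made up for this illustration, and it uses bound UDP sockets on the loopback address purely as cheap descriptors. On Windows the number it reports should reflect the select() set size; on Unix it may instead reflect the process's open-file limit or the largest descriptor value select() accepts.

```python
import select
import socket


def probe_select_limit(max_sockets=2048):
    """Return how many sockets select() accepted before failing.

    The result is capped at max_sockets (a hypothetical safety cap).
    """
    socks = []
    accepted = 0
    try:
        for _ in range(max_sockets):
            try:
                s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
                socks.append(s)
                s.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
                # Zero timeout: only checks whether select() accepts the set.
                select.select(socks, [], [], 0)
            except (ValueError, OSError):
                # ValueError: select() rejected the set;
                # OSError: the OS refused to create or bind another socket.
                break
            accepted = len(socks)
        return accepted
    finally:
        for s in socks:
            s.close()
```

Calling probe_select_limit() with the default cap on the machine in question would show whether the ceiling moves after changing any other setting.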
Answer:
A few days later, my best understanding of the situation is this: the per-process limit on the number of file descriptors IS controlled by _setmaxstdio(), and I was using it correctly, BUT if you're using the select() function, that upper limit is effectively capped by a separate, probably lower limit. If you instead use poll(), epoll(), etc., the limit you set with _setmaxstdio() will work just as you'd expect. And if there is a way to increase the limits of the select() function, it seems like you need to mess around with the C runtime, which is something I don't know how to do. Since I have the option to drop support for Windows and only support Unix instead, I'd much rather just do it that way.
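For the Unix side of this, note that _setmaxstdio() belongs to the Windows C runtime; on Unix the per-process descriptor limit is normally read and changed through the standard library's resource module. A minimal sketch, where raise_nofile_soft_limit is a hypothetical helper name and any target value is up to the caller:

```python
import resource  # Unix-only standard library module


def raise_nofile_soft_limit(target):
    """Try to raise the soft open-file limit to `target`.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    # An unprivileged process cannot push the soft limit past the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

For example, raise_nofile_soft_limit(4096) would request room for roughly 4096 descriptors and report what the process actually ended up with.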
If you're reading this answer because you want to increase the limits of Windows' select() function and, like me, you are new to the concept of file descriptors/completion ports, haven't worked extensively with sockets before, and don't want to (or don't know how to) screw around with the C runtime (CRT), I suggest you consider the following before continuing to try to alter the select() limit:
- Can you use poll() or epoll() instead of select()? In my case, the only reason I was using select() is that I was locked into it by the libraries I'm using - asyncio and psycopg. At the time of writing, psycopg does not support asyncio's ProactorEventLoop, so I was forced to use the SelectorEventLoop (which uses select()) instead of the default ProactorEventLoop, which doesn't use select(). However, psycopg has no such limitation on Unix - it will work with any asyncio event loop there. So if you can use poll(), epoll(), etc. instead of select(), that's probably a lot easier than trying to actually increase the select() limit, since then you'll be able to just use _setmaxstdio() to set the file descriptor limit, and it'll actually let you have more file descriptors.
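To see which readiness mechanism a given platform would pick, the standard library's selectors module is one place to look: selectors.DefaultSelector is an alias for the most efficient implementation available on the current platform (for example epoll on Linux, and select on Windows). The sketch below uses a hypothetical helper name, wait_readable, and a local socket pair; it only illustrates the API.

```python
import selectors
import socket


def wait_readable(sock, timeout=1.0):
    """Return True if `sock` becomes readable within `timeout` seconds."""
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_READ)
        events = sel.select(timeout)
        return any(key.fileobj is sock for key, _ in events)


a, b = socket.socketpair()
try:
    # Name of the implementation chosen here, e.g. EpollSelector or SelectSelector.
    print(selectors.DefaultSelector.__name__)
    b.send(b"ping")
    print(wait_readable(a))
finally:
    a.close()
    b.close()
```

If the printed name is SelectSelector, code built on the default selector is still going through select() on that platform.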
Instead of continuing to try to make this work on Windows, I'm just going to use Unix instead. I probably should have done that from the beginning since it's a backend kind of process, but the convenience of being able to run/test/debug the code directly on my development machine was tempting, and I thought that the added flexibility of supporting both Windows and Unix would be worth the (what I thought would be minimal) overhead of adding "if os is Windows use SelectorEventLoop" to the start of my application.
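The "if os is Windows use SelectorEventLoop" check might look roughly like the following. This is a sketch, not the original application's code: make_event_loop and main are placeholder names, and it builds the loop by hand instead of changing the global event loop policy.

```python
import asyncio
import sys


def make_event_loop():
    """Create a selector-based loop on Windows, the platform default elsewhere."""
    if sys.platform == "win32":
        # The Windows default is ProactorEventLoop; SelectorEventLoop
        # is the select()-based alternative there.
        return asyncio.SelectorEventLoop()
    return asyncio.new_event_loop()


async def main():
    # Placeholder for the real application entry point.
    await asyncio.sleep(0)
    return "done"


loop = make_event_loop()
try:
    print(loop.run_until_complete(main()))
finally:
    loop.close()
```

Keeping the platform check in one small factory function means the rest of the application does not need to know which loop it is running on.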