bind() fails after using SO_REUSEADDR option for wildcard socket in state TIME_WAIT

I am running my server application on Linux. My server uses a socket that is bound to an address *::<some_specific_port> (where * means a wildcard ip address).

My program can be terminated cleanly (the socket is then closed with close()), or it can crash or be killed by an external signal.

I want to restart my application as soon as possible, without caring about the reliability of TCP (I take care of that at a higher level). When I start my server again I use the exact same address (*::<same_exact_port>), but the bind() syscall fails with errno=EADDRINUSE, which means the address is already in use.

I looked into it and saw that the old socket is in TIME_WAIT state. After some reading I found out about the address-reuse issue with TCP on Linux. But as I said before, in my case I don't really care about reliability; all I care about is restarting my program (which always uses a wildcard IP and the same port) as soon as possible.

I tried to use SO_REUSEADDR and set linger time to 0, but the problem keeps happening. I have seen the SO_REUSEPORT option which seems to solve my problem, but I prefer to avoid using it as much as I can (for security purposes).

I read about the net.ipv4.tcp_tw_reuse option in Linux, but the documentation is vague. I noticed my machine is configured with net.ipv4.tcp_tw_reuse=0 and I was wondering whether enabling this flag would help.

Or maybe the flag is unrelated and I am missing something else.

I have seen the post "How do SO_REUSEADDR and SO_REUSEPORT differ?", which has a great answer on this topic, but I still don't understand whether I can bind the exact same address (wildcard and same port) on Linux when the older socket is in TIME_WAIT state and the new socket is set with SO_REUSEADDR.

Answer:

Setting the linger time to zero means your socket will not wait for unsent data to be sent. However, it will only avoid the TIME_WAIT state completely if the other end has already closed its read pipe.

A socket can be seen as two pipes. A read pipe and a write pipe. Your read pipe is connected to the write pipe of the other side and your write pipe is connected to the read pipe of the other side. When you open a socket, both pipes are opened and when you close a socket, both pipes are closed. However, you can close individual pipes using the shutdown() call.
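To make the two-pipe picture concrete, here is a minimal Python sketch. It uses socket.socketpair() as a stand-in for a real client/server connection (an assumption made only to keep the example self-contained); the variable names are illustrative.

```python
import socket

# Two connected endpoints; socketpair() stands in for a real TCP connection.
a, b = socket.socketpair()

# Close only a's write pipe. a's read pipe stays open.
a.shutdown(socket.SHUT_WR)

# b sees end-of-stream on its read pipe: recv() returns an empty bytes object.
eof = b.recv(16)

# b's write pipe is still open, so data can still flow from b to a.
b.sendall(b"still open")
reply = a.recv(16)

# close() shuts whatever is still open on both endpoints.
a.close()
b.close()
```

The point of the sketch is only that each direction can be shut down independently, which is what the rest of this answer relies on.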

When you use shutdown to close your write pipe (SHUT_WR or SHUT_RDWR), your socket will end up in TIME_WAIT, even if Linger time is zero. And when you call close() on a socket, it will implicitly close both pipes, unless already closed, and if it did close the write pipe, it will have to wait, even if it dropped any pending data from the send buffer.

If the other side calls close() first or at least calls shutdown() with SHUT_RD and after that you call close(), you can only end up in TIME_WAIT state for as long as the configured Linger time and if that time is zero, you won't end up in that state at all.

But if your app crashes or is killed in the middle of a TCP transmission, both pipes are open, the system will implicitly call close on your socket, closing both pipes and thus forcing you into TIME_WAIT, no matter the Linger time. Calling close yourself in a signal handler would make no difference either.
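For reference, "linger time zero" as discussed above means enabling SO_LINGER with a timeout of 0 seconds. A minimal Python sketch of setting the option and reading it back, assuming a platform where struct linger consists of two C ints (as on Linux):

```python
import socket
import struct

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# struct linger { int l_onoff; int l_linger; }
# l_onoff = 1 enables the option, l_linger = 0 is the timeout in seconds.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))

# Read the option back to see what the kernel stored.
raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.calcsize("ii"))
l_onoff, l_linger = struct.unpack("ii", raw)

sock.close()
```

The option only applies to the socket it was set on, so in a server it has to be set on each connected socket whose close() behavior you want to change.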

As for SO_REUSEADDR, this setting does not allow reuse across processes on most systems. For security reasons, if process X has opened socketA and now socketA is in TIME_WAIT state, then process X can bind socketB to the same address and port as socketA, if, and only if it uses SO_REUSEADDR. But process Y cannot bind a socket to the same address and port as socketA, not even if it uses SO_REUSEADDR, as the socket in TIME_WAIT state does not belong to process Y.

So if your process dies and you restart it, this is a new process to the system and it will not allow reuse of sockets in TIME_WAIT state from your previous process, despite being the same program, as all that the system usually remembers is the process ID and this ID will be different.
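The same-process case can be sketched in Python as follows. This is only an illustration under some assumptions: bind_listener is a hypothetical helper name, the loopback address is used instead of the wildcard to keep the demo local, and the short sleep is merely there to give the kernel time to finish the close handshake. Note that SO_REUSEADDR has to be set before bind() is called.

```python
import socket
import time

def bind_listener(port, host="127.0.0.1"):
    """Hypothetical helper: listening TCP socket with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock

# First listener on a port picked by the kernel (port 0).
first = bind_listener(0)
port = first.getsockname()[1]

# Accept one connection and close the server side first, which should
# leave a connection on this port in TIME_WAIT once the client closes too.
client = socket.create_connection(("127.0.0.1", port))
conn, _ = first.accept()
conn.close()
client.close()
time.sleep(0.2)
first.close()

# The same process binds a new socket to the same address and port.
second = bind_listener(port)
rebound_port = second.getsockname()[1]
second.close()
```

While this runs, a tool such as ss or netstat can be used to check whether the old connection really is in TIME_WAIT at the moment of the second bind().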

So what could you possibly do to work around the issue? SO_REUSEPORT is an option, as it has no restriction to "being the same process", since it has explicitly been introduced to Linux to allow port re-use by different processes.
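As a rough illustration of what SO_REUSEPORT permits, the following Python sketch binds two listening sockets to the same address and port. It assumes a Linux system where socket.SO_REUSEPORT is available (Linux 3.9 or later); bind_reuseport is a hypothetical helper name. On Linux every socket sharing the port has to set the option.

```python
import socket

def bind_reuseport(port, host="127.0.0.1"):
    """Hypothetical helper: listening TCP socket with SO_REUSEPORT set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen()
    return sock

# First socket: let the kernel pick a free port.
a = bind_reuseport(0)
port = a.getsockname()[1]

# Second socket: same address and port while the first is still listening.
b = bind_reuseport(port)
shared = (a.getsockname()[1], b.getsockname()[1])

a.close()
b.close()
```

In a restart scenario the two sockets would live in the old and the new process instead of in one script, but the calls are the same.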

Another possibility is to catch any termination of your program (as far as that is possible), close the read pipe of your socket (shutdown() with SHUT_RD) and then wait a bit until the other end tries to send another data packet to you. This send should then fail with a send error, hopefully causing the other side to close the "broken socket". If you then continue termination, it was the other side that closed your send pipe and thus you will not end up in TIME_WAIT. Of course, pulling this off is tricky and maybe impossible inside a signal handler that is called because your app has crashed, as what you can do in a signal handler is very limited. Usually you work around this by handling signals outside of the handler, but if that was a crash signal, I'm afraid this isn't going to work. Also note that you cannot catch SIGKILL, and even when your process is killed like this, the system will cleanly close your sockets.

A nice programmatic workaround: make two processes. One parent process does all the socket management and spawns a child process that deals with the actual server implementation. If the child process is killed, the parent process still owns all sockets, can still close them cleanly, can re-bind to the same address and port using SO_REUSEADDR, and can even spawn a new child process, so your server continues running.
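A minimal Python sketch of this parent/child layout, under a few assumptions: it targets a POSIX system with os.fork(), the names supervise and crashing_server are hypothetical, and the "server" here does nothing except kill itself so the restart path can be seen.

```python
import os
import signal
import socket

def supervise(listener, serve, max_restarts=3):
    """Run serve(listener) in a child process and respawn it when it dies.

    The parent keeps the listening socket open the whole time.
    Returns the raw wait statuses of all children.
    """
    statuses = []
    for _ in range(max_restarts):
        pid = os.fork()
        if pid == 0:
            # Child: inherits the listening socket and does the real work.
            code = 1
            try:
                serve(listener)
                code = 0
            finally:
                os._exit(code)  # never fall back into the parent's code
        # Parent: wait for the child to exit or be killed, then loop.
        _, status = os.waitpid(pid, 0)
        statuses.append(status)
    return statuses

# The parent creates and owns the socket.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))
listener.listen()
port = listener.getsockname()[1]

def crashing_server(sock):
    # Stand-in for the real server: simulate being killed from outside.
    os.kill(os.getpid(), signal.SIGKILL)

statuses = supervise(listener, crashing_server, max_restarts=2)
```

A real server would call accept() on the inherited socket inside the child. The parent never touches client connections, so it has very little code that can crash.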