detecting connection state in epoll linux-CodePudding

There are many threads regarding how to detect if a socket is connected or not using various methods like getpeername / getsockopt w/ SO_ERROR. https://man7.org/linux/man-pages/man2/getpeername.2.html would be a good way for me to detect if a socket is connected or not. The problem is, it does not say anything about if the connection is in progress... So if i call connect, it is in progress, then i call getpeername, will it say it is an error (-1) even though the connection is still in progress?

If it does, I can implement a counter-like system that will eventually kill the socket if it is still in progress after x seconds.

CodePudding user response：

man connect

If the initiating socket is connection-mode, .... If the connection cannot be established immediately and O_NONBLOCK is not set for the file descriptor for the socket, connect() shall block for up to an unspecified timeout interval until the connection is established. If the timeout interval expires before the connection is established, connect() shall fail and the connection attempt shall be aborted.

If connect() is interrupted by a signal that is caught while blocked waiting to establish a connection, connect() shall fail and set errno to [EINTR], but the connection request shall not be aborted, and the connection shall be established asynchronously.

If the connection cannot be established immediately and O_NONBLOCK is set for the file descriptor for the socket, connect() shall fail and set errno to [EINPROGRESS], but the connection request shall not be aborted, and the connection shall be established asynchronously.

When the connection has been established asynchronously, select() and poll() shall indicate that the file descriptor for the socket is ready for writing.

If the socket is in blocking mode, connect will block while the connection is in progress. After connect returns, you'll know if a connection has been established (or not).

A signal could interrupt the (blocking/waiting) process, the connection routine will then switch to asynchronous mode.

If the socket is in non blocking mode (O_NONBLOCK) and the connection cannot be established immediately, connect will fail with the error EINPROGRESS and like above switching to asynchronous mode, that means, you'll have to use select or poll to figure out if the socket is ready for writing (indicates established connection).

CodePudding user response：

Short Answer

I think that, if getpeername() returns ENOTCONN, that simply means that the tcp connection request has not yet succeeded. For it to not return ENOTCONN, I think the client end needs to have received the syn ack from the server and sent its own ack, and the server end needs to have received the client's ack.

Thereafter all bets are off. The connection might subsequently be interrupted, but getpeername() has no way of knowing this has happened.

Long Answer

A lot of it depends on how fussy and short-term one wants to be about knowing if the connection is up.

Strictly Speaking...

Strictly speaking with maximum fussiness, one cannot know. In a packet switched network there is nothing in the network that knows (at any single point in time) for sure that there is a possible connection between peers. It's a "try it and see" thing.

This contrasts to a circuit switched network (e.g. a plain old telephone call), where there is a live circuit for exclusive use between peers (telephones); provided current is flowing, you know the circuit is complete even if the person at the other end of the phone call is silent.

Note that if the two computers were connected by a single Ethernet cable (no router, no switches, just a cable between NICs), that is effectively a fixed circuit (not even a circuit-switched network).

Relaxing a Little...

Focusing on what one can know about a connection in a packet switched network. As other have already said, the answer is that, really, one has to send and receive packets constantly to know if the network can still connect the two peers.

Such an exchange of packets occurs with a tcp socket connect() - the connecting peer sends a special packet to say "please can I connect to you", and the serving peer replies "yes", the client then says "thank you!" (syn->, <-syn ack, ack->). But thereafter the packets flow between peers only if the applications send and receive data, or elects to close the connection (fin).

Calling something like getpeername() I think is somewhat misleading, depending on your requirements. It's fine, if you trust the network infrastructure and remote computer and its application to not break, and not crash.

It's possible for the connect() to succeed, then something breaks somewhere in the network (e.g. the peer's network connection is unplugged, or the peer crashes), and there is no knowledge at your end of the network that that has happened.

The first thing you can know about it is if you send some traffic and fail to get a response. The response is, initially, the tcp acks (which allows your network stack to clear out some of its buffers), and then possibly an actual message back from the peer application. If you keep sending data out into the void, the network will quite happily route packets as far as it can, but your tcp stack's buffers will fill up due to the lack of acks coming back from the peer. Eventually, your network socket blocks on a call to write(), because the local buffers are full.

Various Options...

If you're writing both applications (server and client), you can write the application to "ping pong" the connection periodically; just send a message that means nothing other than "tell me you heard this". Successful ping-ponging means that, at least within the last few seconds, the connection was OK.
Use a library like ZeroMQ. This library solves many issues with using network connections, and also includes (in modern version) socket heartbeats (i.e. a ping pong). It's neat, because ZeroMQ looks after the messy business of making, restoring and monitoring connections with a heartbeat, and can notify the application whenever the connection state changes. Again, you need to be writing both client and server applications, because ZeroMQ has it's own protocol on top of tcp that is not compatible with just a plain old socket. If you're interested in this approach, the words to look for in the API documentation is socket monitor and ZMQ_HEARTBEAT_IVL;
If, really, only one end needs to know the connection is still available, that can be accomplished by having the other end just sending out "pings". That might fit a situation where you're not writing the software at both ends. For example, a server application might be configured (rather than re-written) to stream out data regardless of whether the client wants it or not, and the client ignores most of it. However, the client knows that if it is receiving data it then also knows there is a connection. The server does not know (it's just blindly sending out data, up until its writes() eventually block), but may not need to know.

Ping ponging is also good in that it gives some indication of the performance of the network. If one end is expecting a pong within 5 seconds of sending a ping but doesn't get it, that indicates that all is not as expected (even if packets are eventually turning up).

This allows discrimination between networks that are usefully working, and networks that are delivering packets but too slowly to be useful. The latter is still technically "connected" and is probably represented as connected by other tests (e.g. calling getpeername()), but it may as well not be.

Limited Local Knowledge...

There is limited things one can do locally to a peer. A peer can know whether its connection to the network exists (e.g. the NIC reports a live connection), but that's about it.

My Opinion

Personally speaking, I default to ZeroMQ these days if at all possible. Even if it means a software re-write, that's not so bad as it seems. This is because one is generally replacing code such as connect() with zmq_connect(), and recv() with zmq_revc(), etc. There's often a lot of code removal too. ZeroMQ is message orientated, a tcp socket is stream orientated. Quite a lot of applications have to adapt tcp into a message orientation, and ZeroMQ replaces all the code that does that.

ZeroMQ is also well supported across numerous languages, either in bindings and / or re-implementations.