I am trying to have a client socket make a connection to a server with a timeout.
To achieve the timeout, I am using a select call with ts set to 30s:
int flags = 0, error = 0, ret = 0;
fd_set rset, wset;
socklen_t len = sizeof(error);
struct timeval ts;
ts.tv_sec = mConnectTimeoutMs / 1000; // this is 30s
ts.tv_usec = (mConnectTimeoutMs % 1000) * 1000; // tv_usec must stay below 1,000,000
// clear out descriptor sets for select
// add socket to the descriptor sets
FD_ZERO(&rset);
FD_SET(sock, &rset);
wset = rset;
// set socket nonblocking flag
if ((flags = fcntl(sock, F_GETFL, 0)) < 0) {
return -1;
}
if (fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0) {
return -1;
}
// initiate non-blocking connect
if ((ret = ::connect(sock, sa, size)) < 0 && errno != EINPROGRESS) {
return -1;
}
if (ret == 0) // then connect succeeded right away
{
// put socket back in blocking mode
if (fcntl(sock, F_SETFL, flags) < 0) {
return -1;
}
return 0;
}
// we are waiting for connect to complete now
if ((ret = select(sock + 1, &rset, &wset, NULL, &ts)) < 0) {
return -1;
}
if (ret == 0) { // we had a timeout
errno = ETIMEDOUT;
return -1;
}
// we had a positive return so a descriptor is ready
if (FD_ISSET(sock, &rset) || FD_ISSET(sock, &wset)) {
if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &error, &len) < 0) {
return -1;
}
} else {
return -1;
}
if (error) { // check if we had a socket error
errno = error; // this always returns 111
return -1;
}
The point of the timeout is to allow time for the server to spawn & the server socket to be listening/accepting.
For some reason, without the server running, the select call falls through immediately, with FD_ISSET returning true for sock in both rset and wset.
Calling:
getsockopt(sock, SOL_SOCKET, SO_ERROR, &error, &len)
always results in error being set to 111 (connection refused), which is expected, since the server is not running yet. What am I doing wrong here if I want the select to wait until the socket can actually connect? Or how can I correctly "wait for the server socket to exist to connect" using a timeout?
CodePudding user response:
Per @Barmar's comments, the select falls through because the connection attempt is answered with an RST when the server socket is not yet listening, which leaves the client socket in an error state (ECONNREFUSED). In other words, select only waits for the connection attempt to finish, not for a server to appear. To achieve the timeout as intended, we can wrap the existing logic in a do/while loop and make the select timeout dynamic, based on the time remaining in the overall timeout:
#include <chrono>
#include <thread>
...
int timeoutRemaining = mConnectTimeoutMs;
std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
do {
// same conn logic as before, except:
...
ts.tv_sec = timeoutRemaining / 1000;
ts.tv_usec = (timeoutRemaining % 1000) * 1000; // tv_usec must stay below 1,000,000
...
if (error) { // check if we had a socket error
errno = error;
if (errno == ECONNREFUSED) {
close(sock); // a socket whose connection was refused can't be reliably reused for connect
sock = create_new_sock();
// artificially throttle connection requests
std::this_thread::sleep_for(std::chrono::seconds(1));
continue; // there is no server available, continue trying until we reach our connection timeout
}
return -1;
}
...
} while ((timeoutRemaining = (mConnectTimeoutMs
- (std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::steady_clock::now() - start)
.count())))
> 0);