So, I have this simple program in C that tries to make an HTTP request:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
void err(const char *msg) {
fprintf(stderr, "[ERROR] %s\n", msg);
exit(1);
}
int main(int argc,char *argv[])
{
char *host;
int port;
char *request;
host = "api.ipify.org";
port = 80;
request = "GET / HTTP/1.1\r\nHost: api.ipify.org\r\n\r\n";
struct hostent *server;
struct sockaddr_in serv_addr;
int sockfd, bytes, sent, received, total;
char response[4096];
sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0) err("Couldn't open socket");
server = gethostbyname(host);
if (server == NULL) err("No such host");
memset(&serv_addr, 0, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
serv_addr.sin_port = htons(port);
memcpy(&serv_addr.sin_addr.s_addr, server->h_addr, server->h_length);
/* connect the socket */
if (connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) err("Couldn't connect");
/* send the request */
total = strlen(request);
sent = 0;
while (sent < total) {
bytes = write(sockfd, request + sent, total - sent);
if (bytes < 0) err("Couldn't send request");
if (bytes == 0) break;
sent += bytes;
}
/* receive the response */
memset(response, 0, sizeof(response));
total = sizeof(response) - 1;
received = 0;
while (received < total) {
bytes = read(sockfd, response + received, total - received);
if (bytes < 0) err("Couldn't receive response");
if (bytes == 0) break;
received += bytes;
}
/*
* if the number of received bytes is the total size of the
* array then we have run out of space to store the response
* and it hasn't all arrived yet - so that's a bad thing
*/
if (received == total) err("Couldn't store response");
/* close the socket */
close(sockfd);
/* process response */
printf("Response:\n%s\n",response);
return 0;
}
When I compile and run it, it does what it is supposed to do, but it takes very long. Running it with the time command reveals that it takes ~1m 0.3s to execute. If I comment out the read function call, the execution time goes back to 0.3s. This means that, for some reason, the read call is delaying my program by exactly 1 minute.
I've tried putting a printf at the very start of the main function, but that also doesn't get called until 1 minute has passed.
Why is the entire main function delayed by one function call, and how can I fix this?
CodePudding user response:
First of all, you should include a field in the header like this:
Content-Length: 0\r\n
It is true that the request is a GET request, and that a GET request normally has no body. But sending a nonempty body with a GET is not forbidden (it simply has no defined meaning for the server), so you should send a Content-Length field to make it explicit that no body follows.
Second, since you neither close the connection nor shut it down on your side, and you are using HTTP version 1.1, the server keeps waiting for more content to come (a new request, or the body of the first GET request). Read RFC 2616 for information about the HTTP/1.1 connection model.
After changing the following lines
while (received < total) {
bytes = read(sockfd, response + received, total - received);
if (bytes < 0) err("Couldn't receive response");
if (bytes == 0) break;
received += bytes;
}
into
while (received < total) {
bytes = read(sockfd, response + received, total - received);
if (bytes < 0) err("Couldn't receive response");
if (bytes == 0) break;
write(1, response + received, bytes); /* <<< this line added */
received += bytes;
}
to see what we are receiving, I was able to observe that the server sends you a short response:
$ http
HTTP/1.1 200 OK
Server: Cowboy
Connection: keep-alive
Content-Type: text/plain
Vary: Origin
Date: Mon, 21 Feb 2022 21:39:22 GMT
Content-Length: 14
Via: 1.1 vegur
82.181.193.234_ (<-- cursor remains there, as the last _ char)
and then indeed keeps waiting for the next request (as specified in HTTP/1.1 for a keep-alive connection). Eventually the server times out and closes the connection, which is the one-minute delay you are seeing.
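One way to avoid that wait while staying on HTTP/1.1 is to ask the server to close the connection after the response, using the standard Connection: close request header. This is not something the original program does; the sketch below only shows how the request string could be built. The helper name build_close_request is hypothetical, and it assumes a plain GET of the path /.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the original program): format a
 * GET request for `host` that asks the server to close the connection
 * once the response has been sent.
 * Returns the request length, or -1 if `out` is too small.
 */
int build_close_request(char *out, size_t size, const char *host)
{
    int n = snprintf(out, size,
                     "GET / HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     host);

    /* snprintf reports the length it wanted to write; reject truncation */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

With a request like this, a server that honors the header closes the socket after the body, so the existing read loop would see read return 0 and stop without waiting for the keep-alive timeout.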
You can switch to HTTP/1.0, where connections are not persistent by default, and you will see that the server closes the connection immediately after sending the response. For a good interaction with the server, you should parse the response headers as you receive them, and then use the Content-Length: field sent by the server (or the chunk sizes, in case of a Transfer-Encoding: chunked response) to detect when the response is finished. The reason is the same as for the request: the other side uses it to detect where the request/response ends, so that the connection can be reused for the next request. In this case, the server sends a
Content-Length: 14\r\n
so once you have received the two consecutive CRLFs that end the headers, you should read 14 bytes and stop there; the response to the next request starts from that point.
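The check described above could be sketched roughly as follows. This is a minimal illustration, not a full HTTP parser: the function names body_offset, content_length and response_complete are hypothetical, chunked responses are not handled, and it assumes the whole response so far sits in one buffer (as in the response array of the question).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Offset of the first body byte (just past "\r\n\r\n"),
 * or -1 if the header block has not fully arrived yet. */
long body_offset(const char *buf, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    }
    return -1;
}

/* Value of the Content-Length header found in the first header_len
 * bytes of buf, or -1 if the header is absent or has no digits. */
long content_length(const char *buf, size_t header_len)
{
    static const char name[] = "Content-Length:";
    const size_t name_len = sizeof(name) - 1;
    size_t line = 0;

    while (line < header_len) {
        /* locate the end of the current header line */
        size_t end = line;
        while (end < header_len && buf[end] != '\n')
            end++;

        /* header names are case-insensitive */
        if (end - line >= name_len &&
            strncasecmp(buf + line, name, name_len) == 0) {
            size_t p = line + name_len;
            long value = 0;
            int digits = 0;

            while (p < end && (buf[p] == ' ' || buf[p] == '\t'))
                p++;
            while (p < end && isdigit((unsigned char)buf[p])) {
                value = value * 10 + (buf[p] - '0');
                p++;
                digits++;
            }
            return digits ? value : -1;
        }
        line = end + 1;
    }
    return -1;
}

/* 1 if buf holds the headers plus Content-Length body bytes, else 0.
 * Without a Content-Length this sketch cannot tell, so it returns 0. */
int response_complete(const char *buf, size_t len)
{
    long off = body_offset(buf, len);
    if (off < 0)
        return 0;

    long clen = content_length(buf, (size_t)off);
    if (clen < 0)
        return 0;

    return len >= (size_t)off + (size_t)clen;
}
```

In the read loop of the question, calling response_complete(response, received) after each successful read and breaking out when it returns 1 would let the program stop as soon as the 14 body bytes are in, instead of waiting for the server to time out.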