I am working on university project, I have to build a UDP client-server private chat. I worked on it for some time and at this point works fine when I do local testing in my computer with multiple instances of the Client but problems start to raise when i try it outside it, when working for example with my PC as Server and Client and then my friend from his PC as Client. Things seems to go well for the fist couple of messages exchanged but then all of sudden the Server halts it's execution and goes in what I suppose to be deadlock state, I don't know if this happens or can happen having multiple requests over the same socket causing a race condition but I would hear some suggestions from you because I don't really know how to fix it, I thought maybe going multi-threaded Server maybe will fix the issue but don't know if it is the case. I will leave you my Git repo with all the code for the server and the client, I know a should provide a small reproducible example of code but not knowing exactly where to the issue begins i can't
git repo: https://gitlab.com/antonio_ciprani/so-progetto-20_21
Hope to find some answers
CodePudding user response:
You have a race condition between clients and the ordering of the messages received from multiple clients by the server.
A given client does:
- Sends a message with
op
in it (4 bytes) - Depending upon
op
, it sends a "payload" message for the given operation. - The payload is of varying type and length, depending upon the value of
op
The problem: Because the requests are sent in two messages rather than a single atomic request, there is a race between clients:
- Client A sends op 1
- Client B sends op 2
- Client A sends op 1 payload
- Client B sends op 2 payload
Here, when the server receives (1), it tries to get (3) but will receive (2) instead. That is, it mistakes client B's op
message as the payload for client A's payload.
There are a few remedies:
- Clients send a single atomic message that has both
op
and the associated payload - Server keeps track of client "state" based on client's [sender's] IP address from the message.
To send a single message, always send a unified message structure (e.g.):
typedef struct {
int op;
union {
User user;
Message msg;
};
} Transfer;
The length given to the recvfrom
should always be sizeof(Transfer)
The length of the message sent can be:
- Always use
sizeof(Transfer)
- Sender can send just the length of
op
length of the givenstruct
for each payload type
To implement (2), the sender can do (e.g.):
Transfer tran;
size_t len;
tran.op = op;
switch (op) {
case 1:
tran.msg = ...;
len = sizeof(int) sizeof(Message);
break;
case 2:
tran.user = ...;
len = sizeof(int) sizeof(User);
break;
}
sendto(sockfd,&tran,len,...);
Note that due to possible alignment/padding issues in the Transfer
struct (e.g. offset of the union
), the length calculation may need to be more sophisticated, but the above is a good start.
UPDATE:
I deleted precedent comment after learning more about unions, shouldn't I initialize len to sizeof(int) the highest value between sizeof(User) and sizeof(Message)? – Antonio Ciprani
When receiving always use sizeof(Transfer)
. The receiver doesn't know the length before the message arrives. So, it must use the maximum possible length. This does no harm because the actual length received (i.e. the length that was sent) will be the return value of the recvfrom
. This works regardless of whether the sender sends a maximum length or an exact length.
When sending, using the highest sizeof
value of User
or Message
is equivalent to using a send length of sizeof(Transfer)
except for the alignment problem I mentioned above [the fix for the alignment is below].
It is okay to always send a fixed length "maximum" size message. This will always work. If the payload length are fairly close in size, this may be "good enough".
But, if the message payload can vary in size, it is sometimes desirable to be able to send just the exact number of bytes. That is, (e.g.) if the payload could be 8 bytes or 256 bytes, sending the full amount could be wasteful/inefficient. The above code remedies that by sending just the correct amount.
also to send "messages" to other online User I have to somehow make the Server send the list with the online users to a specific Client but if I make Clients send a single atomic message that has both op and the associated payload I should have the Client request it before sending the single atomic message and that i think will cause again the issue you stated in your replay maybe I should focus on your (2) remedy instead – Antonio Ciprani
In (2), using per-client "state" to solve the race condition can be done. But, the better fix is the unified message structure. This remedy should be done regardless.
However, maintaining "state" (by either server OR client) is useful as part of the protocol itself.
You probably need both methods (for different reasons).
What you're describing is a "request/response" protocol. The client sends a "request" for some data that the server has. The server then sends that data.
The "response" may be too large to fit in a single message and may need to be split across several messages. Each side must remember to send/receive the multiple data messages. Hence, each side must maintain "state" for the connection
Note that this works in reverse as well. The client sends a message that states it wants to send data to the server. It then sends the data (in several messages).
An example of this would be file transfer. The client sends a message with a filename that it wants the server to send to it. The server responds with a series of messages with the file data.
Both sides must remember the state they are in:
- name of the file being transferred
- direction of the file transfer
- offset within the file being transferred.
The file transfer is complete when a message is sent that has a zero payload length.
Here is some partial updated code:
#include <stdio.h>
#include <string.h>
#include <stddef.h>
#include <unistd.h>
#include <arpa/inet.h>
enum {
TYPE_EOF,
TYPE_USER,
TYPE_MSG,
};
#define MAXLENGTH 50
#define MSGLENGTH 256
typedef struct User {
char username[MAXLENGTH];
char password[MAXLENGTH];
int connected;
struct sockaddr_in cliaddr;
} User;
typedef struct Msg {
char sender[MAXLENGTH];
char reciver[MAXLENGTH];
char data[MSGLENGTH];
} Message;
// any type of payload
typedef union Payload {
User user;
Message msg;
} Payload;
// message to send/recv
typedef struct {
int op;
Payload payload;
} Transfer;
int sockfd;
struct sockaddr_in sendaddr;
typedef struct state State;
struct state {
State *next;
struct sockaddr_in addr;
socklen_t alen;
int last_op;
FILE *fi;
char file[1024];
};
State *state_list;
State *
state_locate(const struct sockaddr *addr,socklen_t alen)
{
State *state = state_list;
for (; state != NULL; state = state->next) {
if (state->alen != alen)
continue;
if (memcmp(&state->addr,addr,alen) == 0)
break;
}
return state;
}
void
send_something(int op)
{
Transfer tran;
size_t len;
tran.op = op;
len = offsetof(Transfer,payload);
switch (op) {
case TYPE_MSG:
//tran.payload.msg = ...;
len = sizeof(Message);
break;
case TYPE_USER:
//tran.payload.user = ...;
len = sizeof(User);
break;
default:
len = 0;
break;
}
sendto(sockfd,&tran,len,MSG_WAITALL,
(const struct sockaddr *) &sendaddr,sizeof(sendaddr));
}
void
recv_something(void)
{
//struct sockaddr_in remaddr;
struct sockaddr remaddr;
socklen_t addrlen;
Transfer tran;
ssize_t len;
len = recvfrom(sockfd,&tran,sizeof(tran),MSG_WAITALL,
&remaddr,&addrlen);
if (len < 0)
perror("recvfrom");
// get length of payload
len -= offsetof(Transfer,payload);
// locate the current state
State *state = state_locate(&remaddr,addrlen);
switch (tran.op) {
case TYPE_MSG:
if (len != sizeof(Message))
fprintf(stderr,"recv_something: error -- len=%zd Message=%zu\n",
len,sizeof(Message));
// do something with tran.payload.msg ...
break;
case TYPE_USER:
if (len != sizeof(User))
fprintf(stderr,"recv_something: error -- len=%zd User=%zu\n",
len,sizeof(User));
// do something with tran.payload.user ...
break;
default:
fprintf(stderr,"recv_something: unknown command -- op=%d\n",tran.op);
break;
}
state->last_op = tran.op;
}