I'm looking for the "leanest" way to transmit a C structure between two network nodes.
I've read posts on StackOverflow that outline a few approaches, but people seem to disagree with each other quite a bit about them.
What I'd like to do is have a header file which defines structures with fixed width fields. This structure will be "packed" to remove all padding.
I'd like to send this structure directly to the write() or sendto() functions. Have those functions read directly from memory and send it over the wire. I'm not worried about framing for this example, since the receiver will only receive this one message and can compute the packed size from the header.
When the receiver receives this message by reading the necessary n bytes from its socket, I want it to copy the data to some local memory and cast that memory directly to the packed structure type.
So this all seems good, but I think the issue that people touch on with this is endianness.
If I understand correctly, the usage of the "hton*" and "ntoh*" functions can help solve the endianness issue here.
If I'm correct, I can see several ways to solve this. Let's say our packed structure looks like this.
struct P {
uint8_t a;
uint8_t b;
};
- During construction of P, write a in network order, then write b in network order
- After construction of P, pass a pointer to it, to "htons", which should place the structure in network order.
- Do the same as above, but only before handing the struct to the write or sendto functions
Is my thinking correct here? And if so, what is the preferred point at which to swap from host byte order to network byte order? Or is it decided case by case, depending on whether you need to use the structure in your code before sending?
CodePudding user response:
The endianness issue would not apply to the struct as a whole but to the individual fields within the struct. Given your example:
struct P {
uint8_t a;
uint8_t b;
};
Each field is a single byte, so there's no need to worry about endianness. The bytes of the struct are (in order) the single byte of field a followed by the single byte of field b.
If, on the other hand, you had this:
struct Q {
uint16_t a;
uint16_t b;
};
Then you would need to convert the values. On the sending side:
struct Q q_send;
q_send.a = htons(value1);
q_send.b = htons(value2);
sendto(sock, &q_send, sizeof q_send, 0, (struct sockaddr *)&dest, destlen);
On the receiving side:
struct Q q_recv;
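void *buf = malloc(sizeof q_recv);  /* error checking omitted for brevity */
recvfrom(sock, buf, sizeof q_recv, 0, (struct sockaddr *)&sender, &senderlen);
memcpy(&q_recv, buf, sizeof q_recv);  /* copy the raw bytes into the struct */
free(buf);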
value1 = ntohs(q_recv.a);
value2 = ntohs(q_recv.b);
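The two fragments above can be exercised without a socket by treating the struct's bytes as the wire image. This is only a minimal sketch, assuming a POSIX system that provides `<arpa/inet.h>`; the helper names `q_to_wire` and `q_from_wire` are made up for illustration and are not part of any library.

```c
#include <arpa/inet.h>  /* htons, ntohs (POSIX, not ISO C) */
#include <assert.h>     /* static_assert (C11) */
#include <stdint.h>
#include <string.h>     /* memcpy */

struct Q {
    uint16_t a;
    uint16_t b;
};

static_assert(sizeof(struct Q) == 4, "struct Q is assumed to have no padding");

/* Hypothetical helper: build a struct Q whose fields are in network order. */
static struct Q q_to_wire(uint16_t value1, uint16_t value2)
{
    struct Q q;
    q.a = htons(value1);
    q.b = htons(value2);
    return q;
}

/* Hypothetical helper: copy received bytes into a struct Q, then convert
   each field back to host order. */
static void q_from_wire(const void *buf, uint16_t *value1, uint16_t *value2)
{
    struct Q q;
    memcpy(&q, buf, sizeof q);
    *value1 = ntohs(q.a);
    *value2 = ntohs(q.b);
}
```

Whatever the host's native byte order, the bytes of the struct returned by `q_to_wire` hold each value most significant byte first, which is what the receiver relies on.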
With regard to packed structures, the best way to handle this is to design your protocol (and by extension the struct(s) it maps to), so that there is no padding.
So for example instead of this:
struct S {
uint8_t a;
uint32_t b;
uint16_t c;
};
Which could have 5 bytes of padding, use this:
struct S {
uint32_t b;
uint8_t a;
uint8_t reserved1;
uint16_t c;
};
Which has none, along with a specific byte set aside for future use.
And to be extra sure:
static_assert(sizeof(struct S) == 8, "invalid struct size");
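The same idea can be taken one step further by pinning each field to the byte offset the protocol expects, so that an accidental reordering of members also fails to compile. A sketch, assuming a C11 compiler; the offsets are simply those implied by the reordered struct S above.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */
#include <stdint.h>

struct S {
    uint32_t b;
    uint8_t  a;
    uint8_t  reserved1;
    uint16_t c;
};

/* Each member must sit at the offset the wire format expects. */
static_assert(offsetof(struct S, b) == 0, "b must start at byte 0");
static_assert(offsetof(struct S, a) == 4, "a must start at byte 4");
static_assert(offsetof(struct S, reserved1) == 5, "reserved1 must start at byte 5");
static_assert(offsetof(struct S, c) == 6, "c must start at byte 6");

/* And the total size must equal the sum of the member sizes. */
static_assert(sizeof(struct S) == 8, "invalid struct size");
```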
For more information, The Lost Art of Structure Packing goes into this in great detail.
CodePudding user response:
I'm looking for the "leanest" way to transmit a C structure between two network nodes.
Ok.
What I'd like to do is have a header file which defines structures with fixed width fields. This structure will be "packed" to remove all padding.
Note well that the C language does not provide any standard way to do that. Most compilers do provide mechanisms for it, but they differ in the details.
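For illustration, here are two of the common compiler-specific spellings. This is a sketch only: `#pragma pack` is accepted by GCC, Clang and MSVC, and `__attribute__((packed))` by GCC and Clang, but neither is standard C, and the struct names and fields are hypothetical.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdint.h>

/* Spelling 1: #pragma pack (a compiler extension, not ISO C). */
#pragma pack(push, 1)
struct packed_hdr {          /* hypothetical header */
    uint8_t  type;
    uint32_t length;
    uint16_t flags;
};
#pragma pack(pop)

static_assert(sizeof(struct packed_hdr) == 7, "packed_hdr should have no padding");

/* Spelling 2: the GNU attribute, only compiled where it is understood. */
#if defined(__GNUC__)
struct packed_hdr_gnu {
    uint8_t  type;
    uint32_t length;
    uint16_t flags;
} __attribute__((packed));

static_assert(sizeof(struct packed_hdr_gnu) == 7, "packed_hdr_gnu should have no padding");
#endif
```

Without packing, a typical ABI would lay the same members out in 12 bytes, so the size check is what tells you the directive actually took effect.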
I'd like to send this structure directly to the write() or sendto() functions. Have those functions read directly from memory and send it over the wire.
Ok. That's pretty normal.
I'm not worried about framing for this example, since the receiver will only receive this one message and can compute the packed size from the header.
I think you're introducing details ("packed size") about some kind of payload that follows the data represented by your header structure. I'm going to ignore that whole sentence as irrelevant to the actual question about transmitting the contents of your structure.
When the receiver receives this message by reading the necessary n bytes from its socket, I want it to copy the data to some local memory and cast that memory directly to the packed structure type.
It is possible to interpret the received data as the representation of an instance of your structure. That sort of thing is pretty common. But it would not proceed via casting to a structure type, nor, if you want well-defined behavior, via converting a pointer to something else (a char *, say) to type my_header_structure *.
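The well-defined route is to copy the received bytes into an object that really has the structure type. A minimal sketch, assuming a POSIX system for ntohs(); the fields of my_header_structure and the helper name `parse_header` are invented here purely for illustration.

```c
#include <arpa/inet.h>  /* ntohs (POSIX) */
#include <assert.h>     /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>     /* memcpy */

/* Hypothetical header layout, chosen so that it has no padding. */
typedef struct {
    uint16_t type;
    uint16_t length;
} my_header_structure;

static_assert(sizeof(my_header_structure) == 4, "header must have no padding");

/* Returns 0 on success, -1 if fewer than sizeof *out bytes are available.
   buf may have any alignment: memcpy does not care, whereas dereferencing
   (const my_header_structure *)buf could be misaligned and would violate
   the effective-type (strict aliasing) rules. */
static int parse_header(const unsigned char *buf, size_t len,
                        my_header_structure *out)
{
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    out->type = ntohs(out->type);      /* convert each field to host order */
    out->length = ntohs(out->length);
    return 0;
}
```

Compilers generally turn a small fixed-size memcpy like this into a plain load, so the copy is rarely a measurable cost.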
So this all seems good, but I think the issue that people touch on with this is endianness.
Endianness is certainly an important consideration if you want machines with different native endianness to be able to communicate. If you're not careful, then the sizes of your data types are another, related one. So are the alignment requirements of various data types, which is an area where your structure packing and direct interpretation of received data can get you in trouble.
If I understand correctly, the usage of the "hton*" and "ntoh*" functions can help solve the endianness issue here.
Yes, that is their purpose. But do note that they are standardized by POSIX, not by C, and although they are widely available even on non-POSIX systems, they are not necessarily available on every system.
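To make that purpose concrete, the sketch below stores a 32-bit value with htonl() and reads it back with ntohl(). It assumes a POSIX system; `store_net32` and `load_net32` are hypothetical helper names.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <assert.h>     /* static_assert (C11) */
#include <stdint.h>
#include <string.h>     /* memcpy */

static_assert(sizeof(uint32_t) == 4, "4-byte uint32_t assumed");

/* Hypothetical helper: write v into out[] in network byte order. */
static void store_net32(unsigned char out[4], uint32_t v)
{
    uint32_t n = htonl(v);      /* host order -> network order */
    memcpy(out, &n, sizeof n);  /* out[0] now holds the most significant byte */
}

/* Hypothetical helper: read a network-order value from in[]. */
static uint32_t load_net32(const unsigned char in[4])
{
    uint32_t n;
    memcpy(&n, in, sizeof n);
    return ntohl(n);            /* network order -> host order */
}
```

On a big-endian host both conversions are no-ops; on a little-endian host they swap the bytes. Either way the buffer contents are the same.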
If I'm correct, I can see several ways to solve this. Let's say our packed structure looks like this.
struct P { uint8_t a; uint8_t b; };
uint8_t is an 8-bit type (on machines that provide it). There are no endianness considerations with it, at least not at the C level.
Perhaps you meant uint64_t, a 64-bit type (on machines that provide it), but do note that this is not the type that ntohl() and htonl() work with (they operate on uint32_t). There are no standard functions for converting these to and from network byte order.
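If 64-bit fields are needed, one option is to do the conversion by hand with shifts, which works the same way regardless of the host's byte order. A sketch in plain C; `put_be64` and `get_be64` are hypothetical names, not standard functions.

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT */
#include <stdint.h>

static_assert(CHAR_BIT == 8, "8-bit bytes assumed");

/* Hypothetical helper: write v into out[] most significant byte first. */
static void put_be64(unsigned char out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Hypothetical helper: read a big-endian 64-bit value from in[]. */
static uint64_t get_be64(const unsigned char in[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}
```

Because the shifts operate on values rather than on memory representations, no knowledge of the host's endianness is required.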
Anyway,
- During construction of P, write a in network order, then write b in network order
- After construction of P, pass a pointer to it, to "htons", which should place the structure in network order.
- Do the same as above, but only before handing the struct to the write or sendto functions
I take the distinction here not to be what is done but rather when it is done. As long as you do it before actually sending the data, you can expect the same data to be sent.
However, the htons() function itself would not be involved here unless some of the structure members were of type uint16_t or maybe int16_t. I think what you are talking about can be generalized as "convert the members as necessary from native byte order to network byte order".
Is my thinking correct here?
More or less.
And if so, what is the preferred point at which to swap from host byte order to network byte order? Or is it decided case by case, depending on whether you need to use the structure in your code before sending?
If you're worrying about endianness at all, then you cannot assume that a structure prepared for transmission over the wire is suitable to use for anything much else. As long as you don't use such a structure for anything else, it doesn't matter when you prepare it.
But since you need to maintain a distinction between instances that are ready for transmission and those that are not, you will save yourself trouble, heartache, and many debugging hours by using a dedicated, opaque-ish data type for header data that is prepped for sending. The implication is that an object of that type is not in a valid state if it is not in network byte order. And really, using a structure type for that doesn't provide much advantage. You would be as well served by a plain array of uint8_t.
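One way to picture such a dedicated type is a thin wrapper around a byte array, with encode and decode functions as the only way in and out. This is a sketch under assumptions: the field names, the 8-byte layout, and the names `app_header`, `wire_header`, `wire_header_encode` and `wire_header_decode` are all hypothetical.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdint.h>

/* Application-side type: host byte order, any layout the compiler likes. */
struct app_header {
    uint8_t  kind;
    uint32_t length;
    uint16_t flags;
};

#define WIRE_HEADER_SIZE 8

/* Wire-side type: nothing but bytes, always in network byte order. */
struct wire_header {
    uint8_t bytes[WIRE_HEADER_SIZE];
};

static_assert(sizeof(struct wire_header) == WIRE_HEADER_SIZE,
              "wire_header must be exactly the wire size");

static struct wire_header wire_header_encode(const struct app_header *h)
{
    struct wire_header w;
    w.bytes[0] = (uint8_t)(h->length >> 24);   /* length, big-endian */
    w.bytes[1] = (uint8_t)(h->length >> 16);
    w.bytes[2] = (uint8_t)(h->length >> 8);
    w.bytes[3] = (uint8_t)(h->length);
    w.bytes[4] = h->kind;
    w.bytes[5] = 0;                            /* reserved */
    w.bytes[6] = (uint8_t)(h->flags >> 8);     /* flags, big-endian */
    w.bytes[7] = (uint8_t)(h->flags);
    return w;
}

static struct app_header wire_header_decode(const struct wire_header *w)
{
    struct app_header h;
    h.length = ((uint32_t)w->bytes[0] << 24) | ((uint32_t)w->bytes[1] << 16) |
               ((uint32_t)w->bytes[2] << 8)  |  (uint32_t)w->bytes[3];
    h.kind   = w->bytes[4];
    h.flags  = (uint16_t)(((unsigned)w->bytes[6] << 8) | w->bytes[7]);
    return h;
}
```

With this split, `w.bytes` is what gets handed to write() or sendto(), and no packing directive or alignment assumption is involved on either side.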
Overall, I would maintain a clean separation between the data types you use for application logic and those you use for communication over the network. You probably don't have much practical use for structure types on the network-communication side, because endianness and packing issues offset and perhaps even moot any advantages from referring to fields by name. You can still prepare your header data as a single monolithic chunk prior to sending it, if you like, though I'm not sure that provides any advantage in practice.