How to encode the length of the string at the front of the string

Time:01-27

Imagine I am designing a protocol and want to send a packet from the client to the server, so I need to encode my data.

I have a string and want to put its length at the front of it. For example:

string: "my"
whose length is 2
So I expect to create a char[] in C and store | 2 | my | in the buffer.

This way, once the server receives the packet, it knows how many bytes it needs to read for this request. (I am using C.)

I tried to do this, but I don't know how to handle the gap between the length and the string. I can create a buffer of size 10 and use sprintf() to convert the length of the string to text and write it into the buffer.

CodePudding user response:

One poor way to do it is to encode the length in ASCII at the front of the string - the downside is that the length field needs a variable number of chars as soon as you want to send anything longer than 9 chars.
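
If you do go the ASCII route, one way around the variable-width problem is to zero-pad the length to a fixed number of digits. The sketch below assumes a 4-digit field (so at most 9999 bytes of payload); encode_ascii_prefix and decode_ascii_prefix are made-up names for illustration, not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LEN_DIGITS 4  /* assumed field width: lengths 0..9999 */

/* Writes "<4-digit length><payload>" into out, NUL-terminated.
 * Returns the number of bytes written (excluding the NUL),
 * or -1 if the payload is too long or out is too small. */
int encode_ascii_prefix(char *out, size_t out_size, const char *payload)
{
    assert(out != NULL && payload != NULL);
    size_t len = strlen(payload);
    if (len > 9999 || out_size < (size_t)LEN_DIGITS + len + 1)
        return -1;
    /* "%0*zu" zero-pads the length to exactly LEN_DIGITS characters */
    return snprintf(out, out_size, "%0*zu%s", LEN_DIGITS, len, payload);
}

/* Parses the fixed-width prefix.
 * Returns the payload length, or -1 on malformed or short input. */
long decode_ascii_prefix(const char *in, size_t in_size)
{
    assert(in != NULL);
    if (in_size < (size_t)LEN_DIGITS)
        return -1;
    long len = 0;
    for (int i = 0; i < LEN_DIGITS; i++) {
        if (in[i] < '0' || in[i] > '9')
            return -1;
        len = len * 10 + (in[i] - '0');
    }
    return len;
}
```

With this layout the string "my" is sent as 0002my, and the receiver always reads exactly four bytes before it knows how many more to expect.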

A better way to encode the string's length, since you are designing your own protocol, is to reserve a fixed number of bytes at the beginning, say 8, and use array[0..7] to store a uint64_t. You can cast &array[0] to a uint64_t * if the buffer is suitably aligned, but copying the value in and out with memcpy() is safer, because a misaligned or aliasing-violating cast is undefined behavior. Aligning the address on an 8-byte boundary gives (slightly) better performance.
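
A minimal sketch of that fixed-width header might look like the following. It copies the length with memcpy() rather than casting the buffer pointer, which is a choice made here to sidestep alignment and strict-aliasing issues; pack_string and unpack_length are hypothetical names, and the length is written in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIZE sizeof(uint64_t)  /* 8-byte length field */

/* Writes an 8-byte length (host byte order) followed by the payload.
 * Returns the total packet size, or 0 if buf is too small. */
size_t pack_string(unsigned char *buf, size_t buf_size, const char *s)
{
    assert(buf != NULL && s != NULL);
    uint64_t len = strlen(s);
    if (buf_size < HDR_SIZE || buf_size - HDR_SIZE < len)
        return 0;
    memcpy(buf, &len, HDR_SIZE);     /* length field, bytes 0..7 */
    memcpy(buf + HDR_SIZE, s, len);  /* payload, without the '\0' */
    return HDR_SIZE + (size_t)len;
}

/* Reads the length field back out of a received packet. */
uint64_t unpack_length(const unsigned char *buf)
{
    assert(buf != NULL);
    uint64_t len;
    memcpy(&len, buf, HDR_SIZE);
    return len;
}
```

The receiver would first read HDR_SIZE bytes, call unpack_length(), and then read exactly that many payload bytes.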

If the sender and receiver machines have different endianness, both sides also need a way to agree on byte order, for example a multi-byte "magic number" at the head of the char array. The receiver compares the magic number with the expected value to tell whether the bytes of the multi-byte length field must be swapped before it can recover the string length.
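
Here is a rough sketch of the magic-number idea. The header layout (4-byte magic followed by an 8-byte length, both in the sender's byte order), the magic value 0x01020304, and the function names are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAGIC         0x01020304u  /* hypothetical magic value */
#define MAGIC_SWAPPED 0x04030201u  /* what a machine of opposite endianness sees */

/* Reverses the byte order of a 64-bit value. */
static uint64_t bswap64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xFF);
        v >>= 8;
    }
    return r;
}

/* Writes a 12-byte header: magic, then length, in host byte order. */
void write_header(unsigned char *buf, uint64_t len)
{
    assert(buf != NULL);
    uint32_t magic = MAGIC;
    memcpy(buf, &magic, 4);
    memcpy(buf + 4, &len, 8);
}

/* Stores the decoded length in *len_out and returns 0,
 * or returns -1 if the magic number is not recognized. */
int read_header(const unsigned char *buf, uint64_t *len_out)
{
    assert(buf != NULL && len_out != NULL);
    uint32_t magic;
    uint64_t len;
    memcpy(&magic, buf, 4);
    memcpy(&len, buf + 4, 8);
    if (magic == MAGIC) {             /* same byte order as the sender */
        *len_out = len;
        return 0;
    }
    if (magic == MAGIC_SWAPPED) {     /* opposite byte order: swap */
        *len_out = bswap64(len);
        return 0;
    }
    return -1;
}
```

A common alternative is to fix the byte order of the wire format up front (big-endian, for example), in which case no magic number is needed for this purpose.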

CodePudding user response:

There are two conventions used in C:

  1. str*: a char * terminated with a '\0'.
  2. mem*, read/write: a void * plus a size_t length. readv() and writev() follow the same idea, but there the two variables are bundled into an array of struct iovec.
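
To illustrate the second convention, here is a small POSIX sketch that sends a length and a string with a single writev() call, so the two pieces never have to be copied into one buffer. send_string is a made-up name, the 8-byte host-order length is an assumption of this example, and short writes are not retried.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Sends an 8-byte length (host byte order) followed by the string bytes.
 * Returns the number of bytes written, or -1 on error. */
ssize_t send_string(int fd, const char *s)
{
    assert(s != NULL);
    uint64_t len = strlen(s);
    struct iovec iov[2];

    iov[0].iov_base = &len;        /* first piece: the length field */
    iov[0].iov_len  = sizeof len;
    iov[1].iov_base = (void *)s;   /* second piece: payload; writev only reads it */
    iov[1].iov_len  = (size_t)len;

    return writev(fd, iov, 2);
}
```

For "my" this writes 10 bytes: 8 for the length and 2 for the characters.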

If you use anything else, it is automatically a learning curve for whoever needs to read or interact with your code. I wouldn't make that trade-off, but you do you.

You can, of course, encode the length into the char *, but then you have to decide how to encode it: big vs little endian, fixed vs variable size.
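
To make those choices concrete, the sketch below shows one fixed-size option (4 bytes, big-endian, written with shifts so the host byte order does not matter) and one variable-size option (7 bits per byte, LEB128-style). The 32-bit limit and the function names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed size: always 4 bytes, most significant byte first. */
void put_be32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

uint32_t get_be32(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Variable size: 7 bits per byte, low group first; the high bit is set
 * on every byte except the last. Returns bytes written (1..5). */
size_t put_varint(unsigned char *out, uint32_t v)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (unsigned char)((v & 0x7F) | 0x80);
        v >>= 7;
    }
    out[n++] = (unsigned char)v;
    return n;
}

/* Returns bytes consumed, or 0 if the input is truncated or too long. */
size_t get_varint(const unsigned char *in, size_t in_size, uint32_t *v_out)
{
    assert(in != NULL && v_out != NULL);
    uint32_t v = 0;
    for (size_t i = 0; i < in_size && i < 5; i++) {
        v |= (uint32_t)(in[i] & 0x7F) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *v_out = v;
            return i + 1;
        }
    }
    return 0;
}
```

The fixed form is simpler to parse because the receiver always reads 4 bytes; the variable form saves space for short strings at the cost of a loop on both sides.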

You might be interested in SDS, which hides the length. This way you only have to reimplement the functions that change the length of the string instead of all string functions. Prefer an existing library over writing your own.
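
To show what "hiding the length" means, here is a toy sketch of the general idea: the length lives in a header just before the characters, and callers only ever see a normal '\0'-terminated char *. This is not the SDS API; struct lstr_hdr and the lstr_* functions are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored immediately before the character data. */
struct lstr_hdr {
    size_t len;
    char   data[];  /* flexible array member, always '\0'-terminated */
};

/* Allocates a length-prefixed copy of init and returns a pointer
 * to its characters, or NULL if allocation fails. */
char *lstr_new(const char *init)
{
    assert(init != NULL);
    size_t len = strlen(init);
    struct lstr_hdr *h = malloc(sizeof *h + len + 1);
    if (h == NULL)
        return NULL;
    h->len = len;
    memcpy(h->data, init, len + 1);  /* includes the terminator */
    return h->data;
}

/* Steps back from the character pointer to the header: O(1) length. */
size_t lstr_len(const char *s)
{
    assert(s != NULL);
    const struct lstr_hdr *h =
        (const void *)(s - offsetof(struct lstr_hdr, data));
    return h->len;
}

/* Frees a string created by lstr_new(). */
void lstr_free(char *s)
{
    if (s != NULL)
        free(s - offsetof(struct lstr_hdr, data));
}
```

Because the returned pointer is still an ordinary C string, read-only functions such as strcmp() or printf("%s") keep working; only functions that change the length need to update the header.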
