I followed a tutorial on how to make two processes on Linux communicate using the Linux Sockets API, and that's the code it showed to make it happen:
Connecting code:
char* socket_path = "\0hidden";
int fd = socket(AF_UNIX, SOCK_STREAM, 0);
struct sockaddr_un addr;
memset(&addr, 0x0, sizeof(addr));
addr.sun_family = AF_UNIX;
*addr.sun_path = '\0';
strncpy(addr.sun_path + 1, socket_path + 1, sizeof(addr.sun_path) - 2);
connect(fd, (struct sockaddr*)&addr, sizeof(addr));
Listening code:
char* socket_path = "\0hidden";
struct sockaddr_un addr;
int fd = socket(AF_UNIX, SOCK_STREAM, 0);
memset(&addr, 0x0, sizeof(addr));
addr.sun_family = AF_UNIX;
*addr.sun_path = '\0';
strncpy(addr.sun_path + 1, socket_path + 1, sizeof(addr.sun_path) - 2);
bind(fd, (struct sockaddr*)&addr, sizeof(addr));
listen(fd, 5);
Basically, I have written a web server for a website in C, and a database management system in C , and I am making them communicate using this mechanism. (A user's browser sends an HTTP request to my web server, which listens for it on an AF_INET family socket, but that's not important here, just some context.) The database system is listening with its socket, and the web server connects to it using its own socket. It's been working perfectly fine.
However, I never understood what the purpose of a null byte at the beginning of the socket path is. Like, what the heck does "\0hidden" mean, or what does it do? I read the manpage on sockets, it says something about virtual sockets, but it's too technical for me to get what's going on. I also don't have a clear understanding of the concept of representing sockets as files with file descriptors. I don't understand the role of the strncpy() either. I don't even understand how the web server finds the database system with this code block. Is it because their processes were both started from executables in the same directory, or is it because the database system is the only process on the entire system listening on an AF_UNIX socket, or what?
If someone could explain this piece of the Linux Sockets API that has been mystifying me for so long, I'd be really grateful. I've googled and looked at multiple places, and everyone simply seems to be using "\0hidden" without ever explaining it, as if it's some basic thing that everyone should know. Like, am I missing some piece of theory here or what? Massive thanks in advance to anybody explaining!
CodePudding user response:
This is specific to the Linux kernel's implementation of AF_UNIX local sockets. If the first byte of the character array that gives the socket name is a null byte (so the array looks like an empty string to C), then the name doesn't refer to anything in the filesystem namespace; the remaining bytes of the array are treated as an internal name sitting in the kernel's memory. Note that this name is not null-terminated; all bytes covered by the address length are significant, regardless of their value. (Therefore it is a good thing that your example program is doing a memset of the structure to zero bytes before copying in the name.)
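Because every byte counts, the length passed to bind() and connect() decides what the name actually is. Here is a minimal sketch, assuming a hypothetical helper make_abstract_addr() (my own name, not part of any API), that builds an address whose name is exactly the given bytes rather than those bytes padded with zeros to the full size of sun_path:

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill *addr with the abstract name `name` and return
 * the address length to pass to bind() or connect().
 * Returns 0 if the name does not fit into sun_path. */
socklen_t make_abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t len = strlen(name);

    /* One byte of sun_path is reserved for the leading null byte. */
    if (len > sizeof(addr->sun_path) - 1)
        return 0;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* sun_path[0] stays '\0', which selects the abstract namespace. */
    memcpy(addr->sun_path + 1, name, len);

    /* Family field + leading null byte + name bytes, nothing more. */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);
}
```

Passing sizeof(addr) as in the question works too, but then the name is "hidden" followed by all the remaining null bytes of sun_path, so both sides have to use the same convention.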
This allows applications to have named socket rendezvous points that do not occupy nodes in the filesystem, and are therefore more similar to TCP or UDP port numbers (which also don't sit in the file system). These rendezvous points disappear automatically when all sockets referencing them are closed.
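The automatic cleanup can be observed with a small sketch. It assumes Linux and a hypothetical helper bind_abstract() (my own name); the socket name used in the usage note is made up as well:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a stream socket bound to the abstract name
 * `name`. Returns the file descriptor, or -1 with errno set on failure. */
int bind_abstract(const char *name)
{
    struct sockaddr_un addr;
    size_t len = strlen(name);
    socklen_t addr_len;
    int fd;

    if (len > sizeof(addr.sun_path) - 1) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path + 1, name, len);   /* sun_path[0] stays '\0' */
    addr_len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + len);

    if (bind(fd, (struct sockaddr *)&addr, addr_len) < 0) {
        int saved = errno;                  /* keep bind()'s error code */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

While the descriptor returned by bind_abstract("example-rendezvous") is open, a second call with the same name should fail with EADDRINUSE. After close() on the first descriptor, the same call succeeds again, with no unlink() needed as there would be for a socket file.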
Nodes in the file system have some disadvantages. Creating and accessing them requires a storage device. To prevent that, they can be created in a temporary filesystem that exists in RAM, like tmpfs in Linux; but tmpfs entries are almost certainly slower to access and take more RAM than a specialized entry in the AF_UNIX implementation. Sockets that are needed temporarily (e.g. while an application is running) may stay around if the application crashes, needing external intervention to clean them up.
hidden is probably not a good name for a socket; programs should take advantage of the space and use something quasi-guaranteed not to clash with anyone else. The name allows over 100 characters, so it's probably a good idea to use some sort of UUID string.
The Linux Programmer's Manual man page calls this kind of address "abstract". It is distinct and different from "unnamed".
Any standard AF_UNIX implementation provides "unnamed" sockets, which can be created in two ways: any AF_UNIX socket that has been created with socket but not given an address with bind is unnamed; and the pair of sockets created by socketpair are unnamed.
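To see the difference from an abstract address, here is a rough sketch that asks the kernel for the address of an unnamed socket. It assumes Linux, where man 7 unix says the reported length is sizeof(sa_family_t), meaning there are no name bytes at all; unnamed_addr_len() is a hypothetical helper name:

```c
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a socketpair() and return the address length
 * that getsockname() reports for one end, or -1 on failure. */
long unnamed_addr_len(void)
{
    int sv[2];
    struct sockaddr_un addr;
    socklen_t len = sizeof(addr);
    long result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* On success, len is updated to the size of the returned address. */
    if (getsockname(sv[0], (struct sockaddr *)&addr, &len) == 0)
        result = (long)len;

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

An abstract socket, by contrast, reports a length larger than sizeof(sa_family_t), with sun_path[0] being a null byte.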
For more information, see man 7 unix in some GNU/Linux distro that has the Linux Man Pages installed.
CodePudding user response:
\0 just puts a NUL character into the string. As a NUL character is used to terminate a string, socket_path looks like an empty string to all C string functions. In fact it is not empty, but they stop processing it at the first character.
So in memory socket_path actually looks like this:
char socket_path[] = { '\0', 'h', 'i', 'd', 'd', 'e', 'n', '\0' };
as all string literals automatically get a terminating NUL attached.
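A quick way to check this layout is to compare what strlen() sees with what the array really holds. A minimal sketch (the two function names are just for illustration):

```c
#include <stddef.h>
#include <string.h>

/* The literal from the question: 1 leading NUL + 6 letters + 1 implicit
 * terminating NUL. */
static const char socket_path[] = "\0hidden";

/* Length as the C string functions see it: they stop at the first NUL. */
size_t visible_length(void)
{
    return strlen(socket_path);
}

/* Number of bytes the array really occupies. */
size_t stored_size(void)
{
    return sizeof(socket_path);
}
```

Here visible_length() returns 0 while stored_size() returns 8, which is why the code in the question has to step past the first byte with socket_path + 1 before copying.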
The line
strncpy(addr.sun_path + 1, socket_path + 1, sizeof(addr.sun_path) - 2);
copies the bytes of socket_path to the socket address structure addr, skipping the first (NUL) byte. strncpy stops at the terminating NUL of the source and fills the rest of the destination range with NUL bytes. Thus the text copied into the address is effectively just the word "hidden".
But as the first byte of addr.sun_path is skipped as well, and this byte has been initialized to NUL by memset before, the actual path is still \0hidden.
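The resulting contents of sun_path can be verified with a small sketch that repeats the steps from the question inside a hypothetical function, fill_addr_like_question():

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill *addr the same way the code in the question does. */
void fill_addr_like_question(struct sockaddr_un *addr)
{
    const char *socket_path = "\0hidden";

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* The source starts at 'h'. strncpy() copies "hidden", stops at the
     * terminating NUL, and pads the rest of the given range with NULs. */
    strncpy(addr->sun_path + 1, socket_path + 1, sizeof(addr->sun_path) - 2);
}
```

Afterwards the first eight bytes of addr.sun_path are '\0', 'h', 'i', 'd', 'd', 'e', 'n', '\0', and every byte after that is zero as well.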
So why would anyone do that? Probably to hide the socket, as normally systems show UNIX sockets in the file system as actual path entries, but no file system I'm aware of can handle the \0 character in a name. This only works when \0 is the very first byte, though. With a leading \0, Linux does not even try to create a path entry. A \0 anywhere else simply ends the path there, like in any C string, so the system would still create a path entry for the part before it. Because there is no path entry, you cannot see that socket by just calling ls in a terminal, and whoever wants to connect to it needs to know the name.
Note that this does not conform to POSIX, as POSIX expects UNIX sockets to always appear in the file system, and thus only characters that are legal for the file system in use are allowed in a socket name. This will only work on Linux.