Understanding file descriptor duplication in bash

Time:07-29

I'm having a hard time understanding something about redirections in bash.

I'll start with what I know:

Each process has open file descriptors that it can read from and write to. These file descriptors may represent files on disk, terminals, devices, etc.

When we start a terminal with bash, we have the file descriptors stdin (0), stdout (1) and stderr (2) open, all pointing to the terminal. Whenever we run a command (a new process), that process inherits the file descriptors of its parent (bash), so by default it prints stdout and stderr messages to the terminal and reads from the terminal as well.

When we redirect, for example:

$ ls 1>filelist

We're actually changing file descriptor 1 of the ls process to point to the filelist file instead of the terminal. So when ls calls write(1, ...), the output goes to the file.

So to sum it up, a redirection changes which file a given file descriptor refers to. The program keeps reading from or writing to the same descriptor number, but the data comes from or goes to a different file.

Now, let's say I have the following C program:

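#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>   /* write(), close() */

int main(void)
{
    /* O_CREAT requires a third argument: the permissions of the new file */
    int fd = open("info.log", O_CREAT | O_RDWR, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    printf("%d\n", fd);

    write(fd, "INFO::", 6);
    close(fd);
    return 0;
}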

This program opens a file info.log, which is referred to by a file descriptor (usually 3). Indeed, if I now compile this program and run it:

$ ./app
3

It creates the file info.log, which contains the text "INFO::".

But here's what I don't get: according to the logic described above, if I now redirect FD 3 to another file:

$ ./app 3> another_file

The text should be written to this other file, but for some reason it isn't.

Can someone explain?

CodePudding user response:

Hint: when you run ./app 3> another_file, it'll print "4" instead of "3".

More detailed explanation: when you run ./app 3> another_file in the shell, a series of things happens:

  1. The shell fork()s a subprocess that'll run ./app. The subprocess is basically a clone of its parent process, so at this point it's still running the shell program.
  2. In that subprocess, the shell opens "another_file" on file descriptor #3 for writing.
  3. Then it uses one of the exec() family of calls (execve() and friends) to execute the ./app binary (with "another_file" still open on FD#3).
  4. The program calls open() on "info.log", which creates the file and opens it on the lowest-numbered available file descriptor. Since FD#3 is already in use, that's FD#4.
  5. The program writes "INFO::" to FD#4, which is "info.log".

Since open() always uses a new FD (the lowest-numbered one that is free), it's not affected by any active redirects. And if the program did put something else on FD#3 (for example with dup2(), or by closing FD#3 first and then calling open()), that would replace the connection to "another_file" with whatever it had opened instead, essentially overriding the redirect.

If the program wanted to use the redirect, it'd have to write to FD#3 without first opening anything on it. This is what's normally done with FD#1 and 2 (standard output and error), and that's why redirecting those works.
