Printing from bash into a modified output file

I have the following script, check.sh:

#!/bin/bash
counter=1
while [ $counter -gt 0 ]
do
    echo "hello"
    sleep 1
done

check.sh is meant to run forever. Let's say that I run it from a terminal and redirect its output to a file:

./check.sh > output.txt

Every second, the word hello is printed to my file. Now, let us open output.txt, make some modifications and overwrite. The executable check.sh is still running, but it is no longer printing in output.txt. Where is it printing? Is there a way to recover the output?

CodePudding user response：

When you open the file in an editor, the editor may not modify the original file in place. Instead, it may remove the original and write the contents to a new file with the same name, especially if the original was modified while it was open in the editor. You can check whether the file was replaced by using ls -i, which prints the file's inode number. Check the inode before editing the file:

$ ls -i output.txt
3541644 output.txt

and again after saving it from the editor:

$ ls -i output.txt
3541637 output.txt

(numbers will be different on your system).
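
The same inode change can be reproduced without an editor. The following is a minimal sketch, assuming the editor saves by writing a temporary file and renaming it over the original (many editors do this, but not all). The temporary directory and the name output.txt.tmp are made up for illustration.

```shell
#!/bin/bash
# Simulate an editor that saves by "write a new file, then rename it over the old one".
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "hello" > output.txt
inode_before=$(ls -i output.txt | awk '{print $1}')

# Hypothetical editor save: the new content goes to a temporary file first...
echo "edited content" > output.txt.tmp
# ...and the temporary file is then renamed over the original name.
mv output.txt.tmp output.txt

inode_after=$(ls -i output.txt | awk '{print $1}')

echo "before: $inode_before"
echo "after:  $inode_after"
if [ "$inode_before" != "$inode_after" ]; then
    echo "output.txt was replaced by a different file"
fi
```

Because both files exist at the same moment before the rename, they cannot share an inode, so the two numbers always differ in this scenario.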

Where is it printing?

The shell that started check.sh set up the redirection, and the script is still writing to that same file, which has since been unlinked; see this answer for more technical details.

Is there a way to recover the output?

You didn't specify which OS you're using, but on Linux you can run tail -f /proc/<PID>/fd/<NUMBER>. First find the PID (process identifier) of the check.sh script:

$ ps aux | grep '[c]heck.sh'
ja        7476  0.0  0.0   7100  3804 pts/8    S    14:29   0:00 /bin/bash ./check.sh

and then print the contents of file descriptor number 1 (0 is standard input, 1 is standard output, 2 is standard error):

$ tail -f /proc/7476/fd/1
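
The whole situation can be tried end to end with a short script. This is a Linux-only sketch, assuming /proc is mounted and that the editor replaces the file by renaming a new one over it. The background loop is a stand-in for check.sh, and the names workdir, writer_pid and recovered are made up for illustration. It uses cat rather than tail -f so that it dumps everything written so far and then exits.

```shell
#!/bin/bash
# Linux-only sketch: read a replaced (unlinked) output file through /proc.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for check.sh: a background writer with stdout redirected to a file.
( while true; do echo "hello"; sleep 1; done ) > output.txt &
writer_pid=$!
sleep 2

# Simulate the editor replacing output.txt with a new file of the same name.
echo "edited content" > output.txt.tmp
mv output.txt.tmp output.txt
sleep 2

# The writer's descriptor 1 still refers to the original, now unlinked file.
ls -l "/proc/$writer_pid/fd/1"      # the link target typically ends in "(deleted)"
recovered=$(cat "/proc/$writer_pid/fd/1")
echo "$recovered" | tail -n 3

# Stop the background writer.
kill "$writer_pid"
wait "$writer_pid" 2>/dev/null || true
```

To keep the recovered data, redirect the cat output to a regular file, for example cat /proc/<PID>/fd/1 > recovered.txt, while the writing process is still alive.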

Note that check.sh's output would still go to output.txt if you truncated the file without removing it, for example with:

echo new line > output.txt

If you were following the file with GNU Coreutils tail -f, it would just say:

tail: output.txt: file truncated

and keep printing appended lines.
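
In-place truncation can be checked with a small experiment. This is a minimal sketch under a few assumptions: file descriptor 3 of the current shell stands in for the script's redirected standard output, and : > output.txt is used for the truncation instead of the echo above so that the byte counts stay easy to follow. One side effect is worth knowing: a plain > redirection does not open the file in append mode, so the writer keeps its old file offset and the truncated file ends up with leading NUL bytes before the new output.

```shell
#!/bin/bash
# Truncating in place keeps the same file (same inode), so an already-open
# descriptor keeps writing into the file that is visible at the path.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

exec 3> output.txt                 # stand-in for the script's redirected stdout
echo "hello" >&3                   # 6 bytes; descriptor 3 is now at offset 6
inode_before=$(ls -i output.txt | awk '{print $1}')

: > output.txt                     # truncate in place; no new file is created

echo "hello" >&3                   # still written at offset 6 (no append mode)
exec 3>&-

inode_after=$(ls -i output.txt | awk '{print $1}')
size=$(( $(wc -c < output.txt) ))  # 12: six NUL bytes followed by "hello\n"
last_line=$(tail -c 6 output.txt)

echo "inode before: $inode_before, inode after: $inode_after"
echo "size: $size bytes, last line: $last_line"
```

If the script had been started with >> output.txt instead, every write would go to the current end of the file and no gap would appear after a truncation.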

CodePudding user response：

Consider first what happens when you launch the script with:

./check.sh > output.txt

The shell opens output.txt for writing, forks a child process (not necessarily in that order), connects the child's standard output to the open file description, and execs the script. If necessary, the parent closes the file descriptor it opened.

Note well that the child does not reopen the file for each write, and it does not even know the path (if any) to the file connected to its standard output.
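
A quick way to see that an open descriptor follows the file rather than its name is to rename the file while it is open. This is a minimal sketch: descriptor 3 of the current shell plays the role of the script's standard output, and the file names are made up for illustration.

```shell
#!/bin/bash
# An open descriptor refers to the file itself, not to the name it was opened under.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

exec 3> output.txt            # open once, like the shell does for "> output.txt"
echo "before rename" >&3

mv output.txt renamed.txt     # the name changes; the open file does not

echo "after rename" >&3       # no reopen happens: this lands in the same file
exec 3>&-

cat renamed.txt               # prints both lines
[ -e output.txt ] || echo "output.txt does not exist anymore"
```

Both lines end up in renamed.txt, even though the second one was written after the name output.txt had stopped existing.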

Now, let us open output.txt, make some modifications and overwrite. The executable check.sh is still running, but it is no longer printing in output.txt. Where is it printing? Is there a way to recover the output?

That depends a bit on the details of "overwrite", but what a text editor typically will do is create a new file, write the modified content there, and replace the original file with the new one. At this point, it is necessary to understand that "replace" means that the directory entry pointing to the original file is replaced with one pointing to the new file. The original file is not removed from disk as long as any (hard) link appears in any directory or any process has it open.

Now we can understand that the script is still printing to the original output file, but that file is (evidently) no longer accessible at its original path. It may or may not be accessible as a file at any path, but typically it would not be. On some operating systems, such as Linux, the script's standard output might be accessible as an entry in a special filesystem as long as the script continues to run, but it may be write-only, such that even in that case, it is not possible to recover any data that have been written there.

Chances are that there are nevertheless ways to discover the underlying physical file and access its contents, but that would involve special tools.