bash redirection to file adds unexpected 0A bytes

Time:11-14

I thought that if I redirected the output of ls to a file, the exact same sequence of characters that would otherwise have been sent to the console would get written to that file.

To test that, I create three files and then list them:

$ touch a b c
$ ls
a  b  c

I now run ls again, this time redirecting to a file, which I then cat:

$ ls > out
$ cat out
a
b
c
out

Unexpectedly, each filename in out is followed by a 0A linefeed byte rather than the spaces shown on the console:

$ xxd out
00000000: 610a 620a 630a 6f75 740a                 a.b.c.out.

When I pipe the output of ls straight to xxd, the linefeeds are still present:

$ ls | xxd
00000000: 610a 620a 630a 6f75 740a                 a.b.c.out.

How did the 0A bytes get there? Does ls behave differently when its output is redirected, or does the shell ignore linefeeds in certain circumstances?

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.3 LTS
Release:    20.04
Codename:   focal

Answer:

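Yes, ls behaves differently when its output is not a terminal: it switches to one entry per line. You can get the entries back on a single line with -x, and you can ask for one entry per line explicitly with --format=single-column: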

$ mkdir /tmp/t
$ cd /tmp/t
$ touch a b c
$ ls | cat
a
b
c
$ ls -x | cat
a b c
$ ls --format=single-column
a
b
c

@GordonDavisson points us to the POSIX spec for ls, which reads:

The default format shall be to list one entry per line to standard output; the exceptions are to terminals or when one of the -C, -m, or -x options is specified. If the output is to a terminal, the format is implementation-defined.

So in POSIX, at any rate, one entry per line is the norm, and the terminal format can be anything (although I've never seen anything except space-separated columns). Presumably this is so that the output can be processed line by line. I had never noticed it either, despite having relied on it many times, now that I come to think about it!
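The options named in the POSIX text can be tried directly. The following is only a small sketch, assuming a scratch directory created with mktemp -d (the variable names are made up for illustration); it captures the output of ls through command substitution, so standard output is a pipe and not a terminal in every case:

```shell
# Scratch directory so the listing is predictable (hypothetical setup).
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

# -1 asks for one entry per line, whatever stdout is connected to.
one_per_line=$(ls -1 "$dir")

# -C asks for multi-column output even though stdout is not a terminal.
columns=$(ls -C "$dir")

# -m asks for a comma-separated stream.
stream=$(ls -m "$dir")

printf '%s\n' "$one_per_line" "$columns" "$stream"
```

With these options the format no longer depends on where the output goes, which is the safer choice when the result has to look the same on the console and in a file.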

Implementation

And here it is in the source of one ls implementation, checking explicitly:

    case LS_LS:
      /* This is for the `ls' program.  */
      if (isatty (STDOUT_FILENO))
        {
          format = many_per_line;
          /* See description of qmark_funny_chars, above.  */
          qmark_funny_chars = true;
        }
      else
        {
          format = one_per_line;
          qmark_funny_chars = false;
        }
      break;


Or, in the current GNU coreutils:

  format = (0 <= format_opt ? format_opt
            : ls_mode == LS_LS ? (stdout_isatty ()
                                  ? many_per_line : one_per_line)
            : ls_mode == LS_MULTI_COL ? many_per_line
            : /* ls_mode == LS_LONG_FORMAT */ long_format);

where stdout_isatty () performs the same isatty (STDOUT_FILENO) check that the previous example makes directly.

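The same decision can be reproduced from the shell, where the test [ -t 1 ] plays the role of isatty (STDOUT_FILENO). This is a rough sketch of the idea only, not the ls code itself; choose_format is a hypothetical helper name:

```shell
# Hypothetical helper: report which layout an ls-like tool would pick.
choose_format() {
    # [ -t 1 ] succeeds when file descriptor 1 (stdout) is a terminal.
    if [ -t 1 ]; then
        echo many_per_line
    else
        echo one_per_line
    fi
}

choose_format          # result depends on whether stdout is a terminal
choose_format | cat    # stdout is a pipe here, so the else branch runs
```

Run interactively, the first call would be expected to report many_per_line, while the piped call reports one_per_line, mirroring what ls and ls | cat do in the question.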
