Why does xtrace show piped commands executing out of order?-CodePudding

Here's a simple reproducer:

cat >sample_pipeline.sh << EOF                                                                                 
set -x
head /dev/urandom | awk '{\$2=\$3=""; print \$0}' | column -t | grep 123 | wc -l >/dev/null
EOF
watch -g -d -n 0.1 'bash sample_pipeline.sh 2>&1 | tee -a /tmp/watchout'
wc -l /tmp/watchout
tail /tmp/watchout

As expected, usually the commands execute in the order they are written in:

  head /dev/urandom
  awk '{$2=$3=""; print $0}'
  column -t
  grep 123
  wc -l

...but some of the time the order is different, e.g. awk before head:

  awk '{$2=$3=""; print $0}'
  head /dev/urandom
  column -t
  grep 123
  wc -l

I can understand if the shell pre-spawns processes waiting for input, but why wouldn't it spawn them in order?

Replacing bash with dash (the default Ubuntu shell) seems to have the same behavior.

CodePudding user response：

When you write a pipeline like that, the shell will fork off a bunch of child processes to run all of the subcommands. Each child process will then print the command it is executing just before it calls exec. Since each child is an independent process, the OS might schecdule them in any order, and various things going on (cache misses/thrashing between CPU cores) might delay some children and not others. So the order the messages come out is unpredictable.

CodePudding user response：

This happens because of how pipelines are implement. The shell will first fork N subshells, and each subshell will (when scheduled) print out its xtrace output and invoke its command. The order of the output is therefore the result of a race.

You can see a similar effect with

for i in {1..5}; do echo "$i" & done

Even though each echo command is spawned in order, the output may still be 3 4 5 2 1.