Here's a simple reproducer:
cat >sample_pipeline.sh << EOF
set -x
head /dev/urandom | awk '{\$2=\$3=""; print \$0}' | column -t | grep 123 | wc -l >/dev/null
EOF
watch -g -d -n 0.1 'bash sample_pipeline.sh 2>&1 | tee -a /tmp/watchout'
wc -l /tmp/watchout
tail /tmp/watchout
As expected, usually the commands execute in the order they are written in:
head /dev/urandom
awk '{$2=$3=""; print $0}'
column -t
grep 123
wc -l
...but some of the time the order is different, e.g. awk
before head
:
awk '{$2=$3=""; print $0}'
head /dev/urandom
column -t
grep 123
wc -l
I can understand if the shell pre-spawns processes waiting for input, but why wouldn't it spawn them in order?
Replacing bash
with dash
(the default Ubuntu shell) seems to have the same behavior.
CodePudding user response:
When you write a pipeline like that, the shell will fork off a bunch of child processes to run all of the subcommands. Each child process will then print the command it is executing just before it calls exec. Since each child is an independent process, the OS might schecdule them in any order, and various things going on (cache misses/thrashing between CPU cores) might delay some children and not others. So the order the messages come out is unpredictable.
CodePudding user response:
This happens because of how pipelines are implement. The shell will first fork N subshells, and each subshell will (when scheduled) print out its xtrace output and invoke its command. The order of the output is therefore the result of a race.
You can see a similar effect with
for i in {1..5}; do echo "$i" & done
Even though each echo
command is spawned in order, the output may still be 3 4 5 2 1
.