I would like to process data from a pipe inside a while loop. However, the data is separated by 3 forward slash characters rather than a newline, because I would like the script to be able to handle input that has newlines in it.
# The normal `read`-driven while loop
while read -r; do
echo "$REPLY"
done
Instead of read, I would like to use sed because it allows for more flexibility when reading input. The problem I'm running into is that sed eats all the data from the pipe the first time through the while loop, so the loop only runs once.
Here is how I'm testing this (note 1: it uses fd, a file-finding utility):
# Make some files
mkdir test-{1,2}
touch 'test-1/file' $'test-2/file\nwith a newline'
# Find the files we just created
# note 2: there are some wide comments here ->>>
fd -t f . test-1/ test-2/ -Hx printf '%s///' '{}' | # List all files in test-1 and test-2, putting "///" at the end of each one
sed 's/./\n&/g' | # prepend each character with a newline so it can be read by `sed` one character at a time
while v="$(sed -n ':loop; N; s|\n\(.\)$|\1|; /\/\/\/$/!{b loop}; s|///$||; p;Q;')"; do
# ^^^^^^ sed script that prints everything up to the "///"
# that notates the end of the current path, then exits
# without processing more than the first one (so the
# rest can be processed in future loop iterations)
if [ -n "$v" ]; then
# print the file name
echo "$v"
else
# if v is empty
break
fi
done
The output of this is
test-1/file
... which indicates sed is only running once, because the output should be this:
test-1/file
test-2/file
with a newline
Is there a way to get sed to behave like read so it can be used in a while loop? Is there a magical property that allows read to do this because it's a builtin?
CodePudding user response:
I suspect this is an XY question.
"I would like the script to be able to handle input that has newlines in it."
Use a zero-separated stream.
# I would do standard: find . -type f -print0 |
fd -H0t f . |
while IFS= read -r -d '' file; do   # -d '' makes read stop at each zero byte
    echo "$file"
done
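As a rough, self-contained sketch of the zero-separated approach above (bash is assumed, and the two sample names are made up here so that neither fd nor real files are needed), the loop below collects each record into an array. The array name records is only for illustration. Process substitution is used instead of a pipe so the loop runs in the current shell and the array is still populated afterwards.

```shell
#!/usr/bin/env bash
# Sketch only: read zero-separated records so embedded newlines survive.
# The sample names are hypothetical stand-ins for real find/fd output.
records=()
while IFS= read -r -d '' file; do
    # Each iteration receives one complete record, newlines included.
    records+=("$file")
done < <(printf '%s\0' 'test-1/file' $'test-2/file\nwith a newline')

# Show each record in shell-quoted form so the embedded newline is visible.
printf 'record: %q\n' "${records[@]}"
```

With a real producer, the printf would be replaced by something like find . -type f -print0, and the loop body would do the actual work instead of appending to an array.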
To handle a stream separated by three slashes, replace each run of three slashes with a zero byte, and then read the result as a zero-separated stream. This would be a problem if the stream itself contained an actual zero byte, but for filenames that is impossible.
... |
sed 's|///|\x00|g' |
while IFS= read -r -d '' file; do