Can I use `sed` in place of `read` at the start of a while loop that takes input from a pipe?

Time:11-28

I would like to process data from a pipe inside a while loop. However, the data is separated by 3 forward slash characters rather than a newline, because I would like the script to be able to handle input that has newlines in it.

# The normal `read`-driven while loop
while read -r; do
   echo "$REPLY"
done

Instead of read, I would like to use sed because it allows for more flexibility when reading input. The problem I'm running into is that sed consumes all the data from the pipe the first time through the while loop, so the loop only runs once.
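The contrast can be reproduced in a few lines. This is only a sketch, assuming bash and a sed that reads its input in buffered blocks (as GNU sed does); the variable names `after_read` and `after_sed` are made up for the illustration. Bash's `read` takes bytes from a pipe one at a time and stops at the delimiter, so whatever follows is still in the pipe for the next command.

```bash
#!/usr/bin/env bash
# read consumes only the first line; cat in the same group sees the rest.
after_read=$(printf 'a\nb\nc\n' | { IFS= read -r first; cat; })

# sed pulls in a whole buffer before it quits, so there is typically
# nothing left in the pipe for cat (exact behavior may vary between seds).
after_sed=$(printf 'a\nb\nc\n' | { sed 1q >/dev/null; cat; })

printf 'left after read: %q\n' "$after_read"
printf 'left after sed:  %q\n' "$after_sed"
```

The bytes sed has already buffered cannot be pushed back into a pipe, which is why a second sed invocation in a loop finds the input empty.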

Here is how I'm testing this

(note 1: uses fd, a file finding utility)

# Make some files
mkdir test-{1,2}
touch 'test-1/file' $'test-2/file\nwith a newline'

# Find the files we just created
# note 2: there are some wide comments here ->>>
fd -t f . test-1/ test-2/ -Hx printf '%s///' '{}' |  # List all files in test-1 and test-2, putting "///" at the end of each one
   sed 's/./\n&/g' |  # prepend each character with a newline so it can be read by `sed` one character at a time
   while v="$(sed -n ':loop; N; s|\n\(.\)$|\1|; /\/\/\/$/!{b loop}; s|///$||; p;Q;')"; do
      #       ^^^^^^ sed script that prints everything up to the "///"
      #              that notates the end of the current path, then exits
      #              without processing more than the first one (so the
      #              rest can be processed in future loop iterations)
      if [ -n "$v" ]; then
         # print the file name
         echo "$v"
      else
         # if v is empty
         break
      fi
   done

The output of this is

test-1/file

... which indicates sed is only running once, because the output should be this:

test-1/file
test-2/file
with a newline

Is there a way to get sed to behave like read so it can be used in a while loop? Is there a magical property that allows read to do this because it's a builtin?

CodePudding user response:

My guess is that this is an XY problem.

"I would like the script to be able to handle input that has newlines in it."

Use a zero-separated (NUL-delimited) stream.

# I would use the standard tool: find . -type f -print0 |
fd -H0t f . |
# -d '' makes read stop at each zero byte instead of at a newline
while IFS= read -r -d '' file; do
    echo "$file"
done
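As a self-contained sketch of a NUL-delimited loop (not from the original answer): it uses `find -print0` rather than `fd`, works in a temporary directory, and collects names in a hypothetical array called `files`. The loop is fed through process substitution instead of a pipe only so that the array is still set after the loop ends.

```bash
#!/usr/bin/env bash
# Create two files, one of which has a newline in its name.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/test-1" "$tmpdir/test-2"
touch "$tmpdir/test-1/file" "$tmpdir"/test-2/$'file\nwith a newline'

# Read zero-byte-terminated names; IFS= and -r keep each name verbatim.
files=()
while IFS= read -r -d '' file; do
    files+=("$file")
done < <(find "$tmpdir" -type f -print0)

printf 'found %d files\n' "${#files[@]}"
# Print each name in brackets, with the temporary prefix stripped.
printf '[%s]\n' "${files[@]#"$tmpdir"/}"

rm -rf "$tmpdir"
```

The order in which `find` reports the two files is not guaranteed, so the bracketed lines may appear in either order.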

To handle a stream separated by three slashes, replace each run of three slashes with a zero byte, then read the result as a zero-separated stream. This would be a problem if the stream could contain an actual zero byte, but that is impossible for file names.

... |
# \x00 in the replacement is a GNU sed extension
sed 's|///|\x00|g' |
while IFS= read -r -d '' file; do
    echo "$file"
done
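A runnable sketch of the three-slash conversion, under stated assumptions: GNU sed (for `\x00` in the replacement, and for not appending a newline to input that lacks a final one), sample paths taken from the question, and a hypothetical array name `records`. Process substitution is used so the array outlives the loop.

```bash
#!/usr/bin/env bash
# Produce a "///"-terminated stream, turn each "///" into a zero byte,
# then read the zero-separated records, embedded newlines included.
records=()
while IFS= read -r -d '' rec; do
    records+=("$rec")
done < <(printf '%s///' 'test-1/file' $'test-2/file\nwith a newline' |
         sed 's|///|\x00|g')

# Print each record in brackets so record boundaries are visible.
printf '[%s]\n' "${records[@]}"
```

With GNU sed this should yield two records, the second one spanning two lines. Because sed still works line by line, the approach relies on a `///` separator never being split across a newline.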