Remove comma from last element in each block

I've got a file with the following contents, and want to remove the last comma in each block (in this case, the commas after 'c' and 'f').

heading1(
a,
b,
c,
);

some more text

heading2(
d,
e,
f,
);

This has to be done in bash, not Perl, Python, etc., as these are not installed on my target system. I can use sed, awk, etc., but I cannot use sed with the -z argument as I'm using an old version of the utility.

So sed -zi 's/,\n);/\n);/g' $file is off the table.

Any help would be greatly appreciated. Thanks

CodePudding user response：

This might work in your version of sed. Then again it might not.

sed 'x;1d;G;/;$/s/,//;$!s/\n.*//' $file

Rough translation: "Swap this line with the hold space. If this is the first line, do no more with it. Append the hold space to the line in the buffer (so that you're looking at the previous line followed by the current one). If what you have ends with a semicolon, delete the first comma. If you're not on the last line of the file, delete the second of the two lines you have (i.e. the current line, which we'll deal with after we see the next one)." Note that s/,// removes the first comma it finds, which is the trailing one only when the previous line contains a single comma, as in the sample.
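To make the translation easier to follow, here is a sketch of the same command with one -e per step, run against the sample from the question. The sample() helper is only an illustration and is not part of the original answer.

```shell
# Hypothetical helper: prints the sample input from the question.
sample() {
    printf '%s\n' 'heading1(' 'a,' 'b,' 'c,' ');' '' \
        'some more text' '' 'heading2(' 'd,' 'e,' 'f,' ');'
}

# x          swap the current line with the hold space (the previous line)
# 1d         first line: nothing to print yet, start the next cycle
# G          append the hold space, giving "previous\ncurrent"
# /;$/s/,//  if the pair ends with ';', drop the first comma
# $!s/\n.*// unless this is the last line, keep only the previous line
result=$(sample | sed -e 'x' -e '1d' -e 'G' -e '/;$/s/,//' -e '$!s/\n.*//')

printf '%s\n' "$result"
```

Splitting the script across several -e options changes nothing in its behaviour; it may simply be easier to read, and some older seds are pickier about ';' separators.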

CodePudding user response：

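Using awk, with RS="^$" to read in the whole file at once and a regex to replace parts of the text (this relies on awk treating a multi-character RS as a regex, as GNU awk does; POSIX leaves that behaviour unspecified):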

$ awk -v RS=^$ '{gsub(/,\n\);/,"\n);")}1' file

Some output:

heading1(
a,
b,
c
);
...
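If the awk at hand does not accept a regex in RS, the same whole-file substitution can probably be done by collecting the lines yourself. This is a minimal sketch under that assumption; the sample() helper and the buf variable are illustrative names, not part of the answer above.

```shell
# Hypothetical helper: prints the sample input from the question.
sample() {
    printf '%s\n' 'heading1(' 'a,' 'b,' 'c,' ');' '' \
        'some more text' '' 'heading2(' 'd,' 'e,' 'f,' ');'
}

result=$(sample | awk '
    { buf = buf $0 "\n" }              # collect every line, restoring its newline
    END {
        gsub(/,\n\);/, "\n);", buf)    # same substitution as in the one-liner
        printf "%s", buf               # buf already ends with a newline
    }')

printf '%s\n' "$result"
```

This holds the whole file in memory, just as the RS="^$" version does, so it suits small to medium files.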

CodePudding user response：

I would make use of the "hold space" in sed, so that you can look ahead one line to see if it's the closing ');'.

All that follows is a "sed script"; so just put '' around it and "sed" in front of it:

  sed '

Start by unconditionally holding the first line, and deleting it (forcing a skip to the next line):

    1 {
      h
      d
    }

For each line that starts with ')', swap the current and hold buffers (so you now have the previous line in the current buffer), remove the trailing comma (if any), and swap again:

    /^)/ {
      x
      s/,$//
      x
    }

Swap, so that the previous line is the one that gets printed:

    x

At the last line, print the previous line, then retrieve the last line and print it too. Also remove a trailing comma from the last line.

    $ {
      p
      x
      s/,$//
    }

Upon reaching the end of the sed script, the current buffer will be printed; so there's no "p" at the end.

As mentioned before, close the quote from the beginning.

Taken all together we get:

  sed '
    1 {
      h
      d
    }
    /^)/ {
      x
      s/,$//
      x
    }
    x
    $ {
      p
      x
      s/,$//
    }
  '

If you need to scan ahead more than one line, instead of "x" to swap one line, use "H;g" to append to the hold space and then copy the hold space to the current buffer, then "P;D" to print and remove up to the first newline. (H, g, P and D are all standard POSIX sed commands, so they do not require GNU sed.)
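A related and commonly used look-ahead idiom is a two-line sliding window built from N, P and D. The sketch below uses that idiom rather than the "H;g" approach described above, and assumes the closing line always starts with ');'. The sample() helper is illustrative only.

```shell
# Hypothetical helper: prints the sample input from the question.
sample() {
    printf '%s\n' 'heading1(' 'a,' 'b,' 'c,' ');' '' \
        'some more text' '' 'heading2(' 'd,' 'e,' 'f,' ');'
}

# $!N  append the next line (except on the last line): a two-line window
# s    if the first line ends in ',' and the second starts with ');', drop the comma
# P    print the first line of the window
# D    delete that line and rerun the script on the rest without reading new input
result=$(sample | sed -e '$!N' -e 's/,\(\n);\)/\1/' -e 'P' -e 'D')

printf '%s\n' "$result"
```

The substitution puts the newline back through a capture group instead of writing \n in the replacement, since not every sed interprets \n there.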

CodePudding user response：

This should work with GNU sed and BSD sed on the shown input:

sed -e ':a' -e '/,\n);$/!{N' -e 'ba' -e '}' -e 's/,\n);$/\n);/' file.txt

We concatenate lines in the pattern space until it ends with ,\n);. Then we delete the comma, print (the default) and restart the cycle with a new line.

Simpler and more readable version with GNU sed (that you do not have):

sed ':a;/,\n);$/!{N;ba};s/,\n);$/\n);/' file.txt

CodePudding user response：

awk 'BEGIN { getline line } /^\);/ { sub(/,$/, "", line) } { print line; line = $0 } END { print line }'

CodePudding user response：

Using awk:

awk '
$0==");" {sub(/,$/, "", l)}
FNR!=1 {print l}
{l=$0}
END {print l}'