Convert regex positive look ahead to sed operation

Time:05-10

I would like to use sed to replace every occurrence of - with _, but only before the first occurrence of = on each line.

Here is a dataset to work with:

ke-y_0-1="foo"
key_two="bar"
key_03-three="baz-jazz-mazz"
key-="rax_foo"
key-05-five="craz-"

In the end the dataset should look like this:

ke_y_0_1="foo"
key_two="bar"
key_03_three="baz-jazz-mazz"
key_="rax_foo"
key_05_five="craz-"

I found this regex will match properly.

\-(?=.*=)

However, this regex uses a positive lookahead, and it appears that sed (even with -E, -e or -r) does not know how to work with positive lookaheads.

I tried the following but keep getting the error "Invalid preceding regular expression":

cat dataset.txt | sed -r "s/-(?=.*=)/_/g"

Is it possible to convert this in a usable way with sed?

Note, I do not want to use perl. However I am open to awk.

CodePudding user response:

You can use

sed ':a;s/^\([^=]*\)-/\1_/;ta' file

See the online demo:

#!/bin/bash
s='ke-y_0-1="foo"
key_two="bar"
key_03-three="baz-jazz-mazz"
key-="rax_foo"
key-05-five="craz-"'
sed ':a; s/^\([^=]*\)-/\1_/;ta' <<< "$s"

Output:

ke_y_0_1="foo"
key_two="bar"
key_03_three="baz-jazz-mazz"
key_="rax_foo"
key_05_five="craz-"

Details:

  • :a - sets a label named a
  • s/^\([^=]*\)-/\1_/ - matches zero or more chars other than = from the start of the line (captured into Group 1, \1) followed by a - char, and replaces the match with the Group 1 value plus a _ (which takes the place of the matched -)
  • ta - jumps back to label a if the substitution succeeded; otherwise the loop ends and the line is printed.
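For comparison, here is a sketch of the same loop written with extended regex syntax (-E), which drops the backslashes around the capture group. The function name dash_to_underscore is only illustrative and not part of the answer above. Splitting the script into separate -e expressions is a choice made here so the label ends at the expression boundary, which I assume is friendlier to non-GNU sed implementations.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): replace every '-' that
# appears before the first '=' on each input line with '_'.
dash_to_underscore() {
  # -e ':a'   defines the label a
  # -e 's...' replaces the last '-' inside the leading run of non-'=' chars
  # -e 'ta'   jumps back to a while the substitution keeps succeeding
  sed -E -e ':a' -e 's/^([^=]*)-/\1_/' -e 'ta'
}

# Feed two of the sample lines through the helper.
printf '%s\n' 'ke-y_0-1="foo"' 'key-05-five="craz-"' | dash_to_underscore
```

Each pass of the loop rewrites one - (the rightmost one still left of the first =), so a key with n dashes takes n successful substitutions plus one failed attempt before the line is printed.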

CodePudding user response:

You might also use awk, setting the field separator to = and replacing every - with _ in the first field.

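To print only the lines where a replacement was made (gsub returns the number of substitutions, and that value acts as the pattern here):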

awk 'BEGIN{FS=OFS="="}gsub("-", "_", $1)' file

Output:

ke_y_0_1="foo"
key_03_three="baz-jazz-mazz"
key_="rax_foo"
key_05_five="craz-"

If you want to print all lines:

awk 'BEGIN{FS=OFS="="}{gsub("-", "_", $1);print}' file
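As a quick way to try the "print all lines" variant without a file, here is a minimal sketch that wraps the command in a shell function and pipes sample lines into it. The function name fix_keys_awk is hypothetical, and the regex literal /-/ is used in place of the string "-", which should behave the same for this pattern.

```shell
#!/bin/bash
# Hypothetical wrapper (name is an assumption) around the awk command
# that prints every line.
fix_keys_awk() {
  awk 'BEGIN { FS = OFS = "=" }          # split and rejoin fields on "="
       { gsub(/-/, "_", $1); print }     # rewrite only the first field (the key)
      '
}

# Feed two of the sample lines through the wrapper.
printf '%s\n' 'key_two="bar"' 'key_03-three="baz-jazz-mazz"' | fix_keys_awk
```

Because OFS is set to the same character as FS, a value that itself contains = is put back together unchanged when awk rebuilds the line after $1 is modified.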