Regex: select each occurrence of a character up until another character

Time:09-23

I have a couple of lines in a document which look something like this:

foo-bar-foo[Foo - Bar]

I'd like to select every - character up until the first [ bracket on every line. Thus the - in the square brackets shouldn't be selected.

How can I achieve that with a Regex?

I already have this regex, /.+?(?=\[)/g, which selects every character up to the first [, but I only want the - characters.

Edit: I want to replace these selected characters using the sed command (GNU).

Answer:

You can use

sed -E ':a; s/^([^[-]+)-/\1/; ta'

See an online demo:

#!/bin/bash
s='foo-bar-foo[Foo - Bar]'
sed -E ':a; s/^([^[-]+)-/\1/; ta' <<< "$s"
# => foobarfoo[Foo - Bar]

Details:

  • -E - enabling POSIX ERE syntax (so that there is no need to escape capturing parentheses and the quantifier)
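  • :a - defines a label named a
  • s/^([^[-]+)-/\1/ - starting at the beginning of the line, matches one or more chars other than [ and -, capturing them into Group 1 (\1), and then matches a - char; the whole match is replaced with Group 1 alone, which removes that -
  • ta - jumps back to the a label if the substitution succeeded, so the substitution repeats until no - is left before the first [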
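To see why the :a ... ta loop is needed, here is a small sketch that compares a single substitution with the looped version on a few sample lines. The function name strip_hyphens_before_bracket and the extra sample lines are made up for illustration and are not part of the original question; GNU sed is assumed.

```shell
#!/bin/bash
# Hypothetical helper (the name is illustrative): remove every hyphen that
# appears before the first '[' on each input line. Assumes GNU sed.
strip_hyphens_before_bracket() {
  sed -E ':a; s/^([^[-]+)-/\1/; ta'
}

# Without the loop, only the first hyphen is removed. The g flag would not
# help here, because ^ can only match at the start of the line.
printf '%s\n' 'foo-bar-foo[Foo - Bar]' | sed -E 's/^([^[-]+)-/\1/'
# => foobar-foo[Foo - Bar]

# With the loop, applied to several lines at once.
printf '%s\n' 'foo-bar-foo[Foo - Bar]' 'a-b[c-d]-e[f-g]' 'no-brackets-here' |
  strip_hyphens_before_bracket
# => foobarfoo[Foo - Bar]
# => ab[c-d]-e[f-g]
# => nobracketshere
```

Two things to keep in mind: a line that contains no [ at all loses every hyphen, and a hyphen at the very start of a line is kept, because + requires at least one character before it. Using * instead of + in the group should cover that leading-hyphen case if it matters for your data.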