sed with a regex starting with space and containing dash

Time:09-23

In bash I want to parse filenames of mp3 files where the separator between track number, artist and title is " - " (space dash space). The expected result is as follows: Title of the Track (Original Version - Long Edit)

My sed command is as follows:

echo "03 - Artist name first-middle name - Title of the Track (Original Version - Long Edit)" | sed -E 's/^([^ - ]*[ - ]){2}//'

The result: Artist name first-middle name - Title of the Track (Original Version - Long Edit)

I'm stuck here and can't make " - " as one term. What am I doing wrong? Thanks for your hints!

Here is some sample data:

'01 - Skyway - Chillwave - Synthwave - Retrowave Mix.mp3'
'02 - Baldocaster - Astral Dive.mp3'
'05 - Jacket. and Shadowrunner - Deathtouch.mp3'
'06 - Night Drive - A Synthwave Mix.mp3'
'07 - Shadowrunner and Syst3m-Glitch - Eastbound Plane Mattaei (Original - Long Mix).mp3'

In the bash script I want to set the title variable as follows:

title=`echo ${filename} | sed -E "s/^([^${SEPARATOR}]*[${SEPARATOR}]){4}//"`

where SEPARATOR is also a variable, containing e.g. " - ".

CodePudding user response:

A solution using parameter expansion.

$ filename="03 - Artist name first-middle name - Title of the Track (Original Version - Long Edit)"
$ filename="${filename#* - * - }"
$ echo "$filename"
Title of the Track (Original Version - Long Edit)
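
The same expansion also works with the separator held in a variable, as the question asks. The following is a minimal sketch, assuming every file name contains at least two separators; the function name title_of is made up for illustration and is not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip "<track><sep><artist><sep>" and a trailing .mp3.
SEPARATOR=' - '

title_of() {
    local name=$1
    # Quoting "$SEPARATOR" makes it match literally. A single # removes the
    # SHORTEST matching prefix, so only the first two separators are consumed.
    name=${name#*"$SEPARATOR"*"$SEPARATOR"}
    # Drop a trailing .mp3 if present.
    printf '%s\n' "${name%.mp3}"
}

title_of '07 - Shadowrunner and Syst3m-Glitch - Eastbound Plane Mattaei (Original - Long Mix).mp3'
```

Because no external process is started, this tends to be the cheapest option inside a loop over many files.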

CodePudding user response:

One way to do this with basic sed is

sed -e 's/ - \(.*\)/;;\1/' -e 's/ - \(.*\)/;;\1/' -e 's/.*;;//'

which is three commands:

  1. turn the first delimiter (between track number and artist) into ;;
  2. turn the second delimiter (between artist and track) into ;;
  3. delete everything before and including the last ;;

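The reason it's tricky is that sed's quantifiers are always greedy: .* matches as many characters as possible, so a pattern with .* in front of the delimiter would run to the last delimiter on the line rather than the first. The workaround is to start each match at the delimiter itself. sed picks the leftmost match, so each of the first two commands consumes exactly one delimiter, counting from the left.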
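
To see the greediness in action, compare a single greedy substitution with the three-step version on one of the sample names. This is only an illustrative sketch; the variable names are arbitrary.

```shell
name='01 - Skyway - Chillwave - Synthwave - Retrowave Mix.mp3'

# Greedy: .* runs to the LAST " - ", so most of the title is lost.
greedy=$(printf '%s\n' "$name" | sed -e 's/.* - //')

# Three-step: only the first two separators are consumed.
stepwise=$(printf '%s\n' "$name" |
    sed -e 's/ - \(.*\)/;;\1/' -e 's/ - \(.*\)/;;\1/' -e 's/.*;;//')

printf '%s\n' "$greedy"     # Retrowave Mix.mp3
printf '%s\n' "$stepwise"   # Chillwave - Synthwave - Retrowave Mix.mp3
```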

If you want your separator to be a configurable pattern:

$ SEPARATOR=' - '
$ sed -e 's/'"$SEPARATOR"'\(.*\)/;;\1/' -e 's/'"$SEPARATOR"'\(.*\)/;;\1/' -e 's/.*;;//' data
Chillwave - Synthwave - Retrowave Mix.mp3
Astral Dive.mp3
Deathtouch.mp3
A Synthwave Mix.mp3
Eastbound Plane Mattaei (Original - Long Mix).mp3
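
One caveat with a configurable separator: sed interprets it as a regular expression, so a separator such as " . " would match any character between two spaces. A possible workaround, sketched here under the assumption that the separator should always be taken literally, is to escape it first. The helper name escape_bre is hypothetical.

```shell
# Hypothetical helper: backslash-escape BRE metacharacters and the s/// delimiter.
escape_bre() {
    printf '%s\n' "$1" | sed -e 's/[][\.*^$/]/\\&/g'
}

SEPARATOR=' . '
sep_re=$(escape_bre "$SEPARATOR")   # becomes ' \. '

# Without escaping, the second substitution would match " B " instead of " . ".
title=$(printf '%s\n' '03 . A B C . Title' |
    sed -e "s/$sep_re\(.*\)/;;\1/" -e "s/$sep_re\(.*\)/;;\1/" -e 's/.*;;//')

printf '%s\n' "$title"   # Title
```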

CodePudding user response:

Using sed and a capture group:

sed -E 's/^[0-9]+ - (.+)/\1/' file

Output

Artist name first-middle name - Title of the Track (Original Version - Long Edit)
Skyway - Chillwave - Synthwave - Retrowave Mix.mp3
Baldocaster - Astral Dive.mp3
Jacket. and Shadowrunner - Deathtouch.mp3
Night Drive - A Synthwave Mix.mp3
Shadowrunner and Syst3m-Glitch - Eastbound Plane Mattaei (Original - Long Mix).mp3

If you want to match the pattern with the parenthesis and the .mp3 extension only:

sed -En 's/^[0-9]+ - (.*\([^()]+\))\.mp3$/\1/p' file

Output

Shadowrunner and Syst3m-Glitch - Eastbound Plane Mattaei (Original - Long Mix)

CodePudding user response:

Using sed

$ title=$(sed -E 's/.*- ([[:alpha:] ]+(\([^)]*\))?\..*)/\1/' input_file)
$ echo "$title"
Retrowave Mix.mp3'
Astral Dive.mp3'
Deathtouch.mp3'
A Synthwave Mix.mp3'
Eastbound Plane Mattaei (Original - Long Mix).mp3'

CodePudding user response:

With awk using a space dash space as a separator.

awk -F' - ' -v OFS=' - ' '{print substr($0, index($0, $2))}'
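
On the sample data, index($0, $2) points at the artist field, so the command above prints the artist and the title. If only the title is wanted, one option is to skip the first two fields by length. This is a sketch, assuming the separator is a plain string with no regex metacharacters, so that length(FS) equals the length of each matched separator.

```shell
title=$(printf '%s\n' '07 - Shadowrunner and Syst3m-Glitch - Eastbound Plane Mattaei (Original - Long Mix).mp3' |
    awk -F' - ' '{
        # Skip field 1, field 2 and the two separators that follow them.
        skip = length($1) + length($2) + 2 * length(FS)
        print substr($0, skip + 1)
    }')

printf '%s\n' "$title"   # Eastbound Plane Mattaei (Original - Long Mix).mp3
```

Counting lengths rather than searching for the text of the third field avoids a false hit when the title text also occurs earlier in the line.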