The input file contains the following text:
1a fost odata
2un balaur
care fura
mere de aur
and after using this command:
sed 's/\([a-z]*\)\(.*\)\( [a-z]*\)/\1 ... \2/' filename
the result is this:
... 1a fost
... 2un balaur
care ...
mere ... de
I know that `\1` refers to the first `[a-z]*` subexpression, and so on, but I can't figure out how this output is produced. Also, what's the difference between the first subexpression and the last one? Why is there a space before `[a-z]`?
Answer:
The first `[a-z]*` matches a run of lowercase letters at the start of the line. The `*` quantifier matches zero or more repetitions, so this can also match the empty string.
On the first line it matches the empty string before `1a`, and on the second line the empty string before `2un`, because a digit is not in `[a-z]`. On the third line it matches `care`, and on the fourth line it matches `mere`. These matches go into capture group 1.
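As a quick check of the above, here is a minimal sketch that prints only capture group 1 for each line. The variable names `input` and `group1` are just illustrative, and the brackets are added so that empty matches are visible.

```shell
# Sample input from the question, one line per printf argument.
input=$(printf '%s\n' '1a fost odata' '2un balaur' 'care fura' 'mere de aur')

# Keep only capture group 1, wrapped in brackets so empty matches show up.
group1=$(printf '%s\n' "$input" | sed 's/\([a-z]*\).*/[\1]/')
printf '%s\n' "$group1"
# Prints:
# []
# []
# [care]
# [mere]
```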
`.*` matches zero or more of any character, so this consumes everything in the middle of the line. These matches go into capture group 2.
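` [a-z]*` matches a space followed by zero or more letters. The space is what distinguishes it from the first subexpression: because `.*` is greedy, it takes as much as it can while still leaving a space for this part to match, so it stops just before the last space on the line. These matches go into capture group 3.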
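To see how the line is split, the following sketch keeps the pattern from the question but swaps in a replacement that labels all three groups. It then runs a variant without the space in the third group for comparison. The labels and the variable names `with_space` and `no_space` are only for illustration.

```shell
input=$(printf '%s\n' '1a fost odata' '2un balaur' 'care fura' 'mere de aur')

# Same pattern as in the question; the replacement labels each capture group.
with_space=$(printf '%s\n' "$input" | sed 's/\([a-z]*\)\(.*\)\( [a-z]*\)/1=[\1] 2=[\2] 3=[\3]/')
printf '%s\n' "$with_space"
# Prints:
# 1=[] 2=[1a fost] 3=[ odata]
# 1=[] 2=[2un] 3=[ balaur]
# 1=[care] 2=[] 3=[ fura]
# 1=[mere] 2=[ de] 3=[ aur]

# Variant without the space in group 3: .* now runs to the end of the line.
no_space=$(printf 'mere de aur\n' | sed 's/\([a-z]*\)\(.*\)\([a-z]*\)/1=[\1] 2=[\2] 3=[\3]/')
printf '%s\n' "$no_space"
# Prints:
# 1=[mere] 2=[ de aur] 3=[]
```

Without the space, nothing forces `.*` to stop early, so the third group is always empty.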
The replacement is capture groups 1 and 2 with ` ... ` between them; capture group 3 is not used, so it is dropped. The result is the letters at the beginning of the line, then ` ... `, then everything after that except the last word.