I'm reading a book about bash, and it introduced regular expressions (I'm pretty new to them) with this example:
rename -n 's/(.*)(.*)/new$1$2/' *
'file1' would be renamed to 'newfile1'
'file2' would be renamed to 'newfile2'
'file3' would be renamed to 'newfile3'
There wasn't really a breakdown provided with this example, unfortunately. I kind of get what capture groups are and that .* is greedy and will match all characters but I'm uncertain as to why two capture groups are needed. Also, I get that $ represents the end of the line but am unsure of what $1$2 is actually doing here. Appreciate any insight provided.
Attempted to research capture groups and the $ for some similar examples with explanations but came up short.
CodePudding user response:
You are correct. (.*)(.*) makes no sense. The second .* will always match the empty string.
For example, when matching against file,
- the first .* will match the 4-character string starting at position 0 (file), and
- the second .* will match the 0-character string starting at position 4 (the empty string).
You could simplify the command to any of the following:
rename -n 's/(.*)/new$1/' *
rename -n 's/.*/new$&/' *
rename -n 's/^/new/' *
rename -n '$_ = "new$_"' *
rename -n '$_ = "new" . $_' *
CodePudding user response:
I don't know that rename command. The regular expression looks like sed syntax. If that is the case (as in many other regex forms), it has 3 parts:
- s stands for substitute
- everything between the first two slashes, (.*)(.*), specifies what to match
- everything between the 2nd and 3rd slash, new$1$2, is the replacement
$ only means end of the line in the first part of the expression. In the second part, $ followed by a number refers to a capture group: $1 is the first group, $2 the second, and so on, with $0 often being the whole matched text (though not in Perl, where the whole match is $& and $0 holds the program name).
You are right that .* is greedy and it's pointless to have that repeated. Maybe there was a \. in between and that was an attempt to capture file name and extension. There are better ways to parse file names, like basename. So you could simplify the command to rename -n 's/(.*)/new$1/' *