Regular expression in rename unix command

Time:11-07

I am trying to rename some files and am pretty new to regular expressions. I know how to do this the long way, but I am trying some code golf to shorten it.

my file:

abc4800_12_S200_R1_001.fastq.gz

my goal:

abc4800_12_R1.fastq.gz

Right now I have a two-step process for renaming it:

rename 's/_S[0-9]+//g' *gz
rename 's/_001//g' *gz

But I was trying to shorten this into one single line to clean it up in one go.

I was trying to use a regular expression to skip over the parts in between, but I don't know if that is actually possible with this command.

rename 's/_S[0-9]+_*?_001//g' *gz

Thanks for any help

CodePudding user response:

Use a capture group to preserve the middle part of the segment you're replacing.

rename 's/_S\d+_(.*)_001/_$1/' *gz
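
To see what a capture-group substitution like this produces before touching any files, the same idea can be sketched with sed. This is only an approximation of the rename call: it assumes a sed that supports -E (extended regular expressions), uses [0-9] and \1 where the Perl expression uses \d and $1, and the function name preview_capture is made up for illustration.

```shell
# Hypothetical helper: prints the new name for one file name; nothing is renamed.
preview_capture() {
    # _S[0-9]+_  matches the sample-number segment, e.g. "_S200_"
    # (.*)       captures the part to keep, e.g. "R1"
    # _001       matches the trailing counter that should be dropped
    printf '%s\n' "$1" | sed -E 's/_S[0-9]+_(.*)_001/_\1/'
}

preview_capture abc4800_12_S200_R1_001.fastq.gz
# prints: abc4800_12_R1.fastq.gz
```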

CodePudding user response:

With the samples you have shown, please try the following rename command. I am using the -n option here, which makes it a dry run: it only prints how the files would be renamed. Once you are happy with the output, remove -n to perform the actual rename.

rename -n 's/(^[^_]*_[^_]*)_[^_]*(_[^_]*)[^.]*(\..*$)/$1$2$3/' *.gz

Output will be as follows:

rename(abc4800_12_S200_R1_001.fastq.gz, abc4800_12_R1.fastq.gz)

Explanation of the regex above:

(^[^_]*_[^_]*)  ## 1st capturing group: everything from the start of the name up to, but not including, the 2nd underscore.
_[^_]*          ## Matches (without capturing) the 2nd underscore and everything up to the next underscore.
(_[^_]*)        ## 2nd capturing group: the 3rd underscore and everything up to the next underscore.
[^.]*           ## Matches (without capturing) everything up to the first dot.
(\..*$)         ## 3rd capturing group: the first dot through the end of the name.
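
To check what each capturing group picks up, the pattern can be run through sed with a replacement that labels the groups. This is a sketch, assuming a sed that supports -E. The anchors are moved outside the groups for portability, which does not change what is matched, and the function name show_groups is made up for illustration.

```shell
# Hypothetical helper: prints the three captured groups for one file name.
show_groups() {
    printf '%s\n' "$1" |
        sed -E 's/^([^_]*_[^_]*)_[^_]*(_[^_]*)[^.]*(\..*)$/1=\1 2=\2 3=\3/'
}

show_groups abc4800_12_S200_R1_001.fastq.gz
# prints: 1=abc4800_12 2=_R1 3=.fastq.gz
```

Joining the three groups back together gives abc4800_12_R1.fastq.gz, which matches the dry-run output above.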

CodePudding user response:

You are trying to replace two parts of the string with nothing. Use the alternation operator (|), which matches either its left or its right side. With the g flag, every match is replaced by the same replacement string (here, nothing):

rename 's/_S[0-9]+|_001//g' *gz
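
As a rough way to preview the alternation on a list of names without renaming anything, the same expression can be fed to sed. This is a sketch, assuming a sed that supports -E. The function name preview_alternation is made up for illustration.

```shell
# Hypothetical helper: prints "old -> new" for each name given; nothing is renamed.
preview_alternation() {
    for name in "$@"; do
        # Remove every "_S<digits>" and every "_001" from the name.
        new=$(printf '%s\n' "$name" | sed -E 's/_S[0-9]+|_001//g')
        printf '%s -> %s\n' "$name" "$new"
    done
}

preview_alternation abc4800_12_S200_R1_001.fastq.gz
# prints: abc4800_12_S200_R1_001.fastq.gz -> abc4800_12_R1.fastq.gz
```

Because neither side of the alternation is anchored, it removes those segments wherever they occur in the name, so it is worth previewing on the real file list first.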