How to match multiple filename patterns and batch rename in a folder traversing bash script

Time:08-05

I'm trying to automate a task that, until now, I've manually performed with the rename utility. I don't have a lot of experience with bash, though, so I'm struggling to wrap my head around the complexity of it.

I have unsorted comics (.cbz files) which may take the following naming styles (actual examples - I may have missed some):

/Collection
├── /Title1
│   ├── _66.cbz
│   └── _Chapter 67.cbz
├── /Title2
│   ├── Chapter 117.cbz
│   └── Chapter 118 - Name.cbz
├── /Title3
│   ├── foo bar_Ch.10 - Name.cbz
│   ├── foo_Ch.21.cbz
│   └── foo_Ch.22.cbz
├── /Title4
│   ├── _Chapter 72_Voluminous.cbz
│   └── _Chapter 73_Final Chapter.cbz
└── /Title5
    ├── bar_Ch.58.2.cbz
    └── bar_Vol.11 Ch.58.1.cbz

As can be seen, the structure is a complete mess with no congruity between unsorted folders.
The general ruleset I've cooked up is the following:
(Feel free to change things up to make it work better)

  1. If .*Vol\. is matched, remove everything before it, i.e. replace the match with Vol. (this may also include 'Volume.2', but I don't recall seeing that)
  2. Elif .*Chapter is matched, replace it with Ch.
  3. Elif .*Ch\. is matched, replace it with Ch.
  4. Elif there is nothing but the chapter number (/Title1/_66.cbz), prefix it with Ch.
  5. Else echo an error
  6. Replace _ with - (space padded)

...for every *.cbz file in the directory

This should result in the following output:

/Collection
├── /Title1
│   ├── Ch.66.cbz
│   └── Ch.67.cbz
├── /Title2
│   ├── Ch.117.cbz
│   └── Ch.118 - Name.cbz
├── /Title3
│   ├── Ch.10 - Name.cbz
│   ├── Ch.21.cbz
│   └── Ch.22.cbz
├── /Title4
│   ├── Ch.72 - Voluminous.cbz
│   └── Ch.73 - Final Chapter.cbz
└── /Title5
    ├── Ch.58.2.cbz
    └── Vol.11 Ch.58.1.cbz

I've tried a few things so far, but nothing with this large scope. Note that some of the chapter names may include Vol or Cha.

The remaining parts of this I can probably solve myself. They include things like having the script apply to the contents of every /title in /collection so I only need to run it once from the parent directory.
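
For the traversal part, one possible sketch is to loop over the subdirectories of the collection and run the per-folder logic inside a subshell, so the working directory is restored after each title. Both function names are invented for this illustration, and rename_in_dir is only a placeholder that lists the files it finds instead of renaming them.

```shell
#!/bin/bash
# Hypothetical placeholder for the per-folder logic: it only lists the
# .cbz files it would process, so this sketch never renames anything.
rename_in_dir() {
    local f
    for f in *.cbz; do
        [ -e "$f" ] || continue   # the glob matched nothing in this folder
        printf '%s/%s\n' "${PWD##*/}" "$f"
    done
}

# Visit every immediate subdirectory of the given collection root.
# The subshell keeps each cd local to one iteration.
for_each_title() {
    local root=$1 dir
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        ( cd "$dir" && rename_in_dir )
    done
}
```

Calling for_each_title . from the parent directory would then visit each title folder once; the real renaming loop would go where the printf is.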


Seems like the final script unless I can think of any other tweaks.

for f in *.cbz; do
    out=    # reset, so an unmatched file cannot reuse the previous file's name
    if [[ $f =~ Vol\.[0-9] ]]; then
        out=$(echo "$f" | sed 's/.*Vol/Vol/')
    elif [[ $f =~ Chapter(\.| )[0-9] ]]; then
        out=$(echo "$f" | sed 's/.*Chapter./Ch./')
    elif [[ $f =~ Ch\.[0-9] ]]; then
        out=$(echo "$f" | sed 's/.*Ch\./Ch./')
    elif [[ $f =~ ^_[0-9] ]]; then
        out=$(echo "$f" | sed 's/_/Ch./')
    else
        echo "ERR: $f"
    fi
    # rename only when a rule matched and the name actually changes
    if [[ -n $out && $out != "$f" ]]; then
        mv "$f" "$out"
    fi
done

# second pass: turn the first remaining underscore into a padded dash
for g in *.cbz; do
    if [[ $g =~ _ ]]; then
        mv "$g" "$(echo "$g" | sed -e 's/_/ - /')"
    fi
done

CodePudding user response:

Rule 3 will override rule 1 unless you add a more specific rule that defines how to prevent it.
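
To make the ordering issue concrete, here is a small sed sketch (not part of the rquery or awk commands in this answer) that applies the Vol. and Ch. rules to one of the sample names, first unconditionally in sequence and then with the Ch. rule guarded so that it is skipped when Vol. is present. The function names are made up for this illustration.

```shell
#!/bin/bash
# Applies both rules one after the other; the Ch. rule then strips the
# Vol. prefix that the first rule just kept.
rename_sequential() {
    printf '%s\n' "$1" | sed -e 's/.*Vol\./Vol./' -e 's/.*Ch\./Ch./'
}

# Same rules, but the Ch. rule only runs on names without "Vol.".
rename_guarded() {
    printf '%s\n' "$1" | sed -e '/Vol\./!s/.*Ch\./Ch./' -e 's/.*Vol\./Vol./'
}

rename_sequential 'bar_Vol.11 Ch.58.1.cbz'   # prints: Ch.58.1.cbz
rename_guarded    'bar_Vol.11 Ch.58.1.cbz'   # prints: Vol.11 Ch.58.1.cbz
rename_guarded    'foo_Ch.21.cbz'            # prints: Ch.21.cbz
```

The guard is deliberately crude: a chapter title that itself contains "Vol." would also disable the Ch. rule, so it is a starting point rather than a complete fix.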

You can use regular-expression replacement to generate a list of commands that rename these files.

For example, the regreplace function in rquery can generate the renaming commands:

[ rquery]$ ls Collection/*/* |rq -q "s 'mv \"' @1 '\" \"' replace(regreplace(regreplace(regreplace(regreplace(@1,'[^/]*Vol\.','Vol.'),'[^/]*Chapter ','Ch.'),'[^/]*Ch\.','Ch.'),'/_([0-9])','/Ch.\$1'),'_',' - ') '\"'"
mv "Collection/Title1/_66.cbz" "Collection/Title1/Ch.66.cbz"
mv "Collection/Title1/_Chapter 67.cbz" "Collection/Title1/Ch.67.cbz"
mv "Collection/Title2/Chapter 117.cbz" "Collection/Title2/Ch.117.cbz"
mv "Collection/Title2/Chapter 118 - Name.cbz" "Collection/Title2/Ch.118 - Name.cbz"
mv "Collection/Title3/foo bar_Ch.10 - Name.cbz" "Collection/Title3/Ch.10 - Name.cbz"
mv "Collection/Title3/foo_Ch.21.cbz" "Collection/Title3/Ch.21.cbz"
mv "Collection/Title3/foo_Ch.22.cbz" "Collection/Title3/Ch.22.cbz"
mv "Collection/Title4/_Chapter 72_Voluminous.cbz" "Collection/Title4/Ch.72 - Voluminous.cbz"
mv "Collection/Title4/_Chapter 73_Final Chapter.cbz" "Collection/Title4/Ch.73 - Final Chapter.cbz"
mv "Collection/Title5/bar_Ch.58.2.cbz" "Collection/Title5/Ch.58.2.cbz"
mv "Collection/Title5/bar_Vol.11 Ch.58.1.cbz" "Collection/Title5/Ch.58.1.cbz"

Check out the latest rquery from here https://github.com/fuyuncat/rquery

If you prefer awk, here is an awk example (gensub is a GNU awk extension, so this needs gawk):

ls Collection/*/* | awk '{a=gensub(/[^/]*Vol\./,"Vol.","g",$0);a=gensub(/[^/]*Chapter /,"Ch.","g",a);a=gensub(/[^/]*Ch\./,"Ch.","g",a);a=gensub(/\/_([0-9])/,"/Ch.\\1","g",a);gsub("_"," - ",a);print "mv \""$0"\" \""a"\""}'

CodePudding user response:

It is possible to write a single regex that directly matches everything you need, but it would be complicated and unmaintainable. In that sense you're right to decompose the problem into smaller parts. You shouldn't be using sed though; bash can capture the data directly.
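
As a minimal sketch of what capturing directly in bash looks like, the function below (its name is invented for this example) pulls the chapter number out of a name with [[ =~ ]] and BASH_REMATCH, without calling sed. It covers only the Chapter/Ch. prefix, not the other rules.

```shell
#!/bin/bash
# Print the chapter number found after "Chapter" or "Ch" followed by a
# dot or a space. Returns non-zero when the name has no such part.
chapter_of() {
    # keeping the regex in a variable avoids quoting issues inside [[ ]]
    local re='Ch(apter)?[. ]([0-9]+(\.[0-9]+)*)'
    [[ $1 =~ $re ]] || return 1
    printf '%s\n' "${BASH_REMATCH[2]}"
}

chapter_of '_Chapter 72_Voluminous.cbz'   # prints: 72
chapter_of 'bar_Vol.11 Ch.58.1.cbz'       # prints: 58.1
```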

The following solution has several unhandled edge cases, but it should work for your purpose:

#!/bin/bash
for file in \
    Ch.33/_66.cbz '_Chapter 67.cbz' 'Chapter 117.cbz' 'Chapter 118 - Name.cbz' \
    'foo bar_Ch.10 - Name.cbz' foo_Ch.21.cbz foo_Ch.22.cbz '_Chapter 72_Voluminous.cbz' \
    '_Chapter 73_Final Chapter.cbz' bar_Ch.58.2.cbz 'bar_Vol.11 Ch.58.1.cbz'
do
    # start from a clean slate so that values captured for the previous
    # file cannot leak into this one
    unset volume chapter title

    # split the file(path) into its different components so that
    # the directory name and file extension don't get in the way
    [[ $file =~ ^(.*/)?(.*)(\..*)$ ]]
    dirname=${BASH_REMATCH[1]}
    filename=${BASH_REMATCH[2]}
    extension=${BASH_REMATCH[3]}

    # optional volume number; when found, drop everything up to and including it
    [[ $filename =~ Vol(ume)?[.\ ]([0-9]+) ]] &&
    volume=${BASH_REMATCH[2]}

    [[ ${volume+X} ]] && filename=${filename#*"${BASH_REMATCH[0]}"}

    # chapter number, either after Ch./Chapter or as a bare number
    [[ $filename =~ Ch(apter)?[.\ ]([0-9]+(\.[0-9]+)*)|([0-9]+(\.[0-9]+)*) ]] &&
    chapter=${BASH_REMATCH[2]}${BASH_REMATCH[4]}

    [[ ${chapter+X} ]] || {
        printf 'illegal filename: %q\n' "$file" 1>&2
        continue
    }
    filename=${filename#*"${BASH_REMATCH[0]}"}

    # whatever follows the separator is the chapter title
    [[ $filename =~ [_\ ]+(-\ +)?(.*)$ ]] &&
    title=${BASH_REMATCH[2]}

    filename=${volume:+Vol."$volume" }Ch.$chapter${title:+ - "$title"}
    printf '%q -> %q\n' "$file" "$dirname$filename$extension"
    #mv "$file" "$dirname$filename$extension"
done

Output:

Ch.33/_66.cbz -> Ch.33/Ch.66.cbz
_Chapter\ 67.cbz -> Ch.67.cbz
Chapter\ 117.cbz -> Ch.117.cbz
Chapter\ 118\ -\ Name.cbz -> Ch.118\ -\ Name.cbz
foo\ bar_Ch.10\ -\ Name.cbz -> Ch.10\ -\ Name.cbz
foo_Ch.21.cbz -> Ch.21.cbz
foo_Ch.22.cbz -> Ch.22.cbz
_Chapter\ 72_Voluminous.cbz -> Ch.72\ -\ Voluminous.cbz
_Chapter\ 73_Final\ Chapter.cbz -> Ch.73\ -\ Final\ Chapter.cbz
bar_Ch.58.2.cbz -> Ch.58.2.cbz
bar_Vol.11\ Ch.58.1.cbz -> Vol.11\ Ch.58.1.cbz
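
Once the commented-out mv is enabled, two different source names in the same folder could normalise to the same target (for instance foo_Ch.21.cbz and bar_Ch.21.cbz, a hypothetical pair), and a plain mv would overwrite the first file. One way to guard against that, sketched here with an invented helper name, is to check the target before moving.

```shell
#!/bin/bash
# Move src to dst only when that is safe: skip no-op renames and refuse
# to overwrite an existing target. Returns non-zero on a refused move.
safe_mv() {
    local src=$1 dst=$2
    [[ $src == "$dst" ]] && return 0
    if [[ -e $dst ]]; then
        printf 'skipped, target exists: %q -> %q\n' "$src" "$dst" >&2
        return 1
    fi
    mv -- "$src" "$dst"
}
```

In the loop above, the mv line could then be written as safe_mv "$file" "$dirname$filename$extension", leaving any refused files in place for manual review.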