Let me start off by saying that I'm new to bash, so I would appreciate simple explanations in the answers you give.
I've got the following block of code:
name="Chapter 0000.cbz (sub s2)"
s=$(echo $name | grep -Eo '[0-9]+([.][0-9]+)?' | tr '\n' ' ' | sed 's/^0*//')
echo $s
readarray -d " " -t myarr <<< "$s"
if [[ $(echo "${myarr[0]} < 100 && ${myarr[0]} >= 10" | bc) -ne 0 ]]; then
myarr[0]="0${myarr[0]}"
elif [[ $(echo "${myarr[0]} < 10" | bc) -ne 0 ]]; then
myarr[0]="00${myarr[0]}"
fi
newName="Chapter ${myarr[0]}.cbz"
echo $newName
which (in this case) would end up spitting out:
2
(standard_in) 1: syntax error
(standard_in) 1: syntax error
Chapter .cbz
(I'm fairly certain that the syntax errors are because ${myarr[0]} is empty when doing the comparisons.)
This is not the output I want. I want the code to strip leading zeros but leave a single 0 if the number is all zeros.
So the part that really needs to change is sed 's/^0*//', but I'm not sure how to change it.
(expected outputs:
in ---> out
1) chapter 8.cbz ---> Chapter 008.cbz
2) chapter 1.3.cbz ---> Chapter 001.3.cbz
3) _23 (sec 2).cbz ---> Chapter 023.cbz
4) chapter 00009.cbz ---> Chapter 009.cbz
5) chap 0000112.5.cbz ---> Chapter 112.5.cbz
So far the code I have works for cases 1-3 but not for the leading-zero cases (4-5).)
CodePudding user response:
I think you could implement the table of results by sed alone:
sed '
s/^[^0-9]*/000/
s/[^0-9.].*$/./
s/\.*$/.cbz/
s/^0*\([0-9]\{3\}\)/Chapter \1/
' <<'EOD'
chapter 8.cbz
chapter 1.3.cbz
_23 (sec 2).cbz
chapter 00009.cbz
chap 0000112.5.cbz
chap 04567.cbz
EOD
- The first command strips everything before the first number and prepends zeros to ensure there are at least three digits.
- The second command replaces everything after the number with a single period.
- Because the number may contain a period but may also be followed by a period, a third command replaces all the trailing periods with the desired extension.
- The final command removes the longest run of leading zeroes that leaves (at least) three digits (I added an extra test case to demonstrate).
Result of running this would be:
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
Chapter 4567.cbz
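To make the role of each of the four sed commands above easier to follow, here is a minimal sketch that applies them one at a time to a single sample name. The variable names (step1 to step4) are only illustrative and are not part of the original script.

```shell
#!/bin/bash
# Trace the four sed commands separately on one sample name.
# step1..step4 are hypothetical names used only for this illustration.
name='_23 (sec 2).cbz'

# 1) Drop everything before the first digit and prepend three zeros.
step1=$(sed 's/^[^0-9]*/000/' <<< "$name")                     # 00023 (sec 2).cbz
# 2) Replace everything from the first non-digit, non-dot character with a dot.
step2=$(sed 's/[^0-9.].*$/./' <<< "$step1")                    # 00023.
# 3) Replace the run of trailing dots with the extension.
step3=$(sed 's/\.*$/.cbz/' <<< "$step2")                       # 00023.cbz
# 4) Drop as many leading zeros as possible while keeping three digits.
step4=$(sed 's/^0*\([0-9]\{3\}\)/Chapter \1/' <<< "$step3")    # Chapter 023.cbz

printf '%s\n' "$step1" "$step2" "$step3" "$step4"
```

Swapping in another name from the table (for example 'chap 0000112.5.cbz') shows why the third command is needed: the decimal part keeps its own dot, so the number of trailing dots varies.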
CodePudding user response:
In pure bash:
#!/bin/bash
for name in 'chapter 8.cbz' 'chapter 1.3.cbz' '_23 (sec 2).cbz' 'chapter 00009.cbz' 'chap 0000112.5.cbz'; do
##### The relevant part #####
[[ $name =~ ^[^0-9]*([0-9]+)[^.]*(\..*)$ ]]
chapter=$(( 10#${BASH_REMATCH[1]} ))
suffix=${BASH_REMATCH[2]}
printf 'Chapter %03d%s\n' "$chapter" "$suffix"
#############################
done
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
notes:
- [[ =~ ]] is the way to use an ERE regex in bash. The one that I wrote has two capturing groups: the first one captures the first sequence of digits (which should be the chapter number), and the second one captures everything from the first following dot (included) to the end of the name.
- $(( 10#... )) converts a zero-prefixed decimal to a normal decimal; without the 10# prefix, bash would read a number with a leading zero as octal.
- printf '%03d' formats a number as a decimal of at least 3 digits, padding the left with zeros when it is shorter.
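As a small illustration of the last two notes, the sketch below shows what the 10# prefix changes and how the padding behaves. The sample value and variable names are assumptions chosen for the demonstration, not taken from the loop above.

```shell
#!/bin/bash
# Hypothetical sample value: a chapter number captured with leading zeros.
digits='00010'

# A leading zero makes bash arithmetic read the literal as octal: 010 (octal) is 8.
as_octal=$(( $digits ))
# The 10# prefix forces base 10, so the same digits give 10.
as_decimal=$(( 10#$digits ))
# %03d pads with zeros on the left up to a width of three digits.
printf -v padded '%03d' "$as_decimal"

echo "octal reading: $as_octal, base-10 reading: $as_decimal, padded: $padded"
```

A value such as 00009 is the more troublesome case, since 9 is not a valid octal digit and the arithmetic expansion fails without the 10# prefix.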
CodePudding user response:
Using sed
$ sed 's/[^0-9]*0\+\?\([0-9]\{1,\}\)[^.]*\(\..*\)/Chapter 00\1\2/;s/0\+\([0-9]\{3,\}\)/\1/' file
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
s/[^0-9]*0\+\?\([0-9]\{1,\}\)[^.]*\(\..*\)/Chapter 00\1\2/
- Strip everything up to the first digit that is not a leading zero, then add Chapter at the beginning, followed by two zeros in place of the stripped ones.
s/0\+\([0-9]\{3,\}\)/\1/
- Once again, strip excess leading zeros, ensuring that at least three digits remain before the period.
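The two passes can be traced separately to see the intermediate result. The sketch below is an assumption-laden rewrite rather than the exact command above: it uses sed -E (extended regex syntax) and writes the optional run of zeros as 0*, which should behave the same way on these inputs. The names pass1 and pass2 are illustrative.

```shell
#!/bin/bash
# Trace the two substitutions on one sample name, using ERE syntax (-E).
# pass1 and pass2 are hypothetical names for this illustration only.
name='chap 0000112.5.cbz'

# Pass 1: drop the prefix and the leading zeros, then add "Chapter 00".
pass1=$(sed -E 's/[^0-9]*0*([0-9]+)[^.]*(\..*)/Chapter 00\1\2/' <<< "$name")
# Pass 2: remove leading zeros again as long as three or more digits remain.
pass2=$(sed -E 's/0+([0-9]{3,})/\1/' <<< "$pass1")

printf '%s\n' "$pass1" "$pass2"
```

For this name the first pass yields Chapter 00112.5.cbz, and the second pass trims it to Chapter 112.5.cbz; for a short number such as 8 the second pass finds nothing to remove.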
CodePudding user response:
Here is an awk script that does the trick:
script.awk
{
str = "000" gensub("(^[[:digit:]]+\\.?[[:digit:]]*)( \\([^)]+\\))?(\\.cbz)", "\\1", "g", RT);
str = gensub("(^[[:digit:]]+)([[:digit:]]{3})(.*$)", "\\2\\3", "g", str);
printf("Chapter %s.cbz\n", str);
}
Test input.1.txt
1) chapter 8.cbz
2) chapter 1.3.cbz
3) _23 (sec 2).cbz
4) chapter 00009.cbz
5) chap 0000112.5.cbz
Output:
awk -f script.awk RS='[[:digit:]]+[\\.]?[[:digit:]]*( \\([^)]+\\))?\\.cbz' input.1.txt
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz