Stripping the leading zeros but leave a single 0

Time:02-14

So let me start off by saying that I'm new to bash, so I would appreciate a simple explanation in the answers you give.

I've got the following block of code:

name="Chapter 0000.cbz (sub s2)"

s=$(echo $name | grep -Eo '[0-9]+([.][0-9]+)?' | tr '\n' ' ' | sed 's/^0*//')

echo $s

readarray -d " " -t myarr <<< "$s"

if [[ $(echo "${myarr[0]} < 100 && ${myarr[0]} >= 10" | bc) -ne 0 ]]; then
    myarr[0]="0${myarr[0]}"
elif [[ $(echo "${myarr[0]} < 10" | bc) -ne 0 ]]; then
    myarr[0]="00${myarr[0]}"
fi

newName="Chapter ${myarr[0]}.cbz"

echo $newName

which (in this case) would end up spitting out:

 2
(standard_in) 1: syntax error
(standard_in) 1: syntax error
Chapter .cbz

(I'm fairly certain that the syntax errors are because ${myarr[0]} is null when doing the comparisons)

This is not the output I want. I want the code to strip leading zeros but leave a single 0 if the number is all zeros.

So the part that really needs to change is sed 's/^0*//', but I'm not sure how to change it.

(expected outputs:

   in                        out
1) chapter 8.cbz        ---> Chapter 008.cbz
2) chapter 1.3.cbz      ---> Chapter 001.3.cbz
3) _23 (sec 2).cbz      ---> Chapter 023.cbz
4) chapter 00009.cbz    ---> Chapter 009.cbz
5) chap 0000112.5.cbz   ---> Chapter 112.5.cbz

So far the code I have works for cases 1-3 but not for the leading-zero cases (4-5).)

CodePudding user response:

I think you could produce the table of results with sed alone:

sed '
    s/^[^0-9]*/000/
    s/[^0-9.].*$/./
    s/\.*$/.cbz/
    s/^0*\([0-9]\{3\}\)/Chapter \1/
' <<'EOD'
chapter 8.cbz
chapter 1.3.cbz
_23 (sec 2).cbz
chapter 00009.cbz
chap 0000112.5.cbz
chap 04567.cbz
EOD
  • The first command strips everything before the first number and prepends zeros to ensure there are at least three digits.
  • The second command replaces everything after the number with a single period.
  • Because the number may contain a period but may also be followed by a period, a third command replaces all the trailing periods with the desired extension.
  • The final command removes the longest run of leading zeroes that leaves (at least) three digits (I added an extra test case to demonstrate).
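To make the four steps easier to follow, here is a small sketch that runs the same sed commands one at a time on a single test name and keeps each intermediate result. The variable names `step1` to `step4` are only illustrative and are not part of the answer above.

```shell
#!/bin/bash
# Illustrative trace: one sed command per step, using test case 3.
name='_23 (sec 2).cbz'

# Step 1: drop everything before the first digit and prepend three zeros.
step1=$(printf '%s\n' "$name"  | sed 's/^[^0-9]*/000/')
# Step 2: replace everything after the number with a single period.
step2=$(printf '%s\n' "$step1" | sed 's/[^0-9.].*$/./')
# Step 3: turn the trailing period(s) into the extension.
step3=$(printf '%s\n' "$step2" | sed 's/\.*$/.cbz/')
# Step 4: drop leading zeros while keeping at least three digits.
step4=$(printf '%s\n' "$step3" | sed 's/^0*\([0-9]\{3\}\)/Chapter \1/')

printf '%s\n' "$step1" "$step2" "$step3" "$step4"
# 00023 (sec 2).cbz
# 00023.
# 00023.cbz
# Chapter 023.cbz
```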

Result of running this would be:

Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
Chapter 4567.cbz

CodePudding user response:

In pure bash:

#!/bin/bash

for name in 'chapter 8.cbz' 'chapter 1.3.cbz' '_23 (sec 2).cbz' 'chapter 00009.cbz' 'chap 0000112.5.cbz'; do

##### The relevant part #####

[[ $name =~ ^[^0-9]*([0-9]+)[^.]*(\..*)$ ]]

chapter=$(( 10#${BASH_REMATCH[1]} ))
suffix=${BASH_REMATCH[2]}

printf 'Chapter %03d%s\n' "$chapter" "$suffix"

#############################

done

Output:

Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz

notes:

  • [[ =~ ]] is the way to use an ERE regex in bash. The one that I wrote has two capturing groups: the first captures the first sequence of digits (which should be the chapter number), and the second captures everything from the first dot after that number onward (dot included).
  • $(( 10#... )) converts a zero-prefixed decimal to a normal decimal. Without the 10# prefix, bash would read a number with a leading 0 as octal.
  • printf '%03d' prints a number as a decimal of at least 3 digits, padding it on the left with zeros when it is shorter.
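The last two notes can be tried in isolation with a short sketch. `pad_chapter` is a hypothetical helper name introduced here for illustration; it also shows that an all-zero input ends up as 000 rather than as an empty string.

```shell
#!/bin/bash
# Hypothetical helper: normalise a digit string to at least three digits.
pad_chapter() {
    local digits=$1
    # 10# forces base 10; without it a leading 0 would make bash read the
    # value as octal, and digits such as 8 or 9 would be rejected.
    printf '%03d\n' "$(( 10#$digits ))"
}

pad_chapter 8        # 008
pad_chapter 00009    # 009
pad_chapter 0000112  # 112
pad_chapter 0000     # 000
```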

CodePudding user response:

Using sed

$ sed 's/[^0-9]*0\+\?\([0-9]\{1,\}\)[^.]*\(\..*\)/Chapter 00\1\2/;s/0\+\([0-9]\{3,\}\)/\1/' file
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz

s/[^0-9]*0\+\?\([0-9]\{1,\}\)[^.]*\(\..*\)/Chapter 00\1\2/ - Strips everything up to the first digit that is not a leading zero, then adds Chapter at the beginning along with two zeros in front of the remaining number.

s/0\+\([0-9]\{3,\}\)/\1/ - Strips the excess zeros once more, so that at least three digits remain before the period.
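For the narrower request in the question, changing sed 's/^0*//' so that an all-zero number keeps one 0, a minimal sketch could look like the following. `strip_zeros` is a hypothetical name, and the sketch assumes it is given a single number rather than the space-separated list that the question's pipeline builds.

```shell
#!/bin/bash
# Hypothetical helper: strip leading zeros but never the last digit.
strip_zeros() {
    # The zeros are removed only when another digit follows them, so a
    # run made up entirely of zeros collapses to a single "0".
    printf '%s\n' "$1" | sed 's/^0*\([0-9]\)/\1/'
}

strip_zeros 0000        # 0
strip_zeros 00009       # 9
strip_zeros 0000112.5   # 112.5
strip_zeros 8           # 8
```

Because the pattern requires one digit after the zeros, sed has to give back the final zero of an all-zero run, which is what leaves the single 0 in place.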

CodePudding user response:

Here is an awk script that does the trick (it needs GNU awk, because gensub and RT are gawk extensions):

script.awk

{
  str = "000" gensub("(^[[:digit:]]+\\.?[[:digit:]]*)( \\([^)]+\\))?(\\.cbz)", "\\1", "g", RT);
  str = gensub("(^[[:digit:]]+)([[:digit:]]{3})(.*$)", "\\2\\3", "g", str);
  printf("Chapter %s.cbz\n", str);
}

Test input.1.txt

1) chapter 8.cbz   
2) chapter 1.3.cbz 
3) _23 (sec 2).cbz 
4) chapter 00009.cbz
5) chap 0000112.5.cbz

Output:

awk -f script.awk RS='[[:digit:]]+[\\.]?[[:digit:]]*( \\([^)]+\\))?\\.cbz' input.1.txt
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz