Python regex and leading 0 in capturing group

Time:08-24

I'm writing a Python 3 script to rename files automatically, but I have a problem with the captured group in a regex.

I have files named like this:

test tome 01 something.cbz
test tome 2 something.cbz
test tome 20 something.cbz

And I would like to get:

test 001 something.cbz
test 002 something.cbz
test 020 something.cbz

I tried several bits of code:

Example 1:

name = re.sub('tome [0]{0,1}(\d{1,})', str('\\1').zfill(3), name)

The result is:

test 01 something.cbz
test 02 something.cbz
test 020 something.cbz

Example 2:

name = re.sub('tome (\d{1,})', str('\\1').lstrip("0").zfill(3), name)

The result is:

test 001 something.cbz
test 02 something.cbz
test 020 something.cbz
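
Both attempts seem to fail for the same reason: the second argument is evaluated before re.sub is ever called, so zfill and lstrip act on the two-character literal \1 rather than on the captured digits. A minimal sketch of what actually gets passed as the replacement (the variable name `replacement` is only for illustration):

```python
import re

# The replacement argument is built before re.sub is called, so zfill()
# pads the two-character string "\1" (backslash + "1"), not the match.
replacement = str('\\1').zfill(3)
print(repr(replacement))  # '0\\1'

# lstrip("0") has nothing to strip from "\1", so Example 2 passes the same string.
assert str('\\1').lstrip("0").zfill(3) == replacement

# re.sub then substitutes "0" followed by whatever group 1 captured.
print(re.sub(r'tome (\d{1,})', replacement, 'test tome 2 something.cbz'))
# test 02 something.cbz
```

To pad the captured text itself, the replacement has to be a function that receives the match object.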

CodePudding user response:

You can pass a function as the replacement and call zfill(3) on the .group(1) value after stripping the leading zeros:

import re

s = ("test tome 01 something.cbz\n"
     "test tome 2 something.cbz\n"
     "test tome 20 something.cbz")

result = re.sub(
    r'tome (\d+)',
    lambda x: x.group(1).lstrip("0").zfill(3),
    s
)
print(result)

Output

test 001 something.cbz
test 002 something.cbz
test 020 something.cbz

CodePudding user response:

Try str.format:

import re

s = """\
test tome 01 something.cbz
test tome 2 something.cbz
test tome 20 something.cbz"""

pat = re.compile(r"tome (\d+)")

s = pat.sub(lambda g: "{:>03}".format(g.group(1)), s)
print(s)

Prints:

test 001 something.cbz
test 002 something.cbz
test 020 something.cbz
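
A related variant, not taken from the answer above: converting the captured digits to an int first lets the format spec do the padding and drops any leading zeros along the way. This is only a sketch; the names `names` and `padded` are made up for the example.

```python
import re

names = [
    "test tome 01 something.cbz",
    "test tome 2 something.cbz",
    "test tome 20 something.cbz",
]

pat = re.compile(r"tome (\d+)")

# int() discards leading zeros; the 03d spec pads back to three digits.
padded = [pat.sub(lambda m: f"{int(m.group(1)):03d}", n) for n in names]
print(padded)
```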

CodePudding user response:

You can use zfill inside a lambda like this:

import re

arr = ['test tome 01 something.cbz', 'test tome 2 something.cbz', 'test tome 20 something.cbz']

rx = re.compile(r'tome (\d+)')
for s in arr:
    print(rx.sub(lambda m: m[1].zfill(3), s))

Output:

test 001 something.cbz
test 002 something.cbz
test 020 something.cbz
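
Since the goal in the question is to rename files on disk, the same substitution could be wired up with pathlib roughly as follows. This is a sketch under assumptions: the helper names `padded_name` and `rename_all` are hypothetical, the files are assumed to end in .cbz and sit in a single folder, and files whose new name already exists are simply skipped.

```python
import re
from pathlib import Path

TOME = re.compile(r"tome (\d+)")  # same pattern as in the answers


def padded_name(name: str) -> str:
    """Return name with 'tome N' replaced by N zero-padded to 3 digits."""
    return TOME.sub(lambda m: m[1].zfill(3), name)


def rename_all(folder: Path) -> None:
    """Rename every .cbz file in folder (hypothetical helper)."""
    for path in sorted(folder.glob("*.cbz")):
        target = path.with_name(padded_name(path.name))
        # Skip files that already have the right name or would clobber another file.
        if target != path and not target.exists():
            path.rename(target)
```

Calling `rename_all(Path("some/folder"))` would rename the files in place, so trying it on a copy of the folder first is a sensible precaution.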

CodePudding user response:

Just to add another alternative:

import re

s_in = ("test tome 01 something.cbz\n"
        "test tome 2 something.cbz\n"
        "test tome 20 something.cbz")

s_out = re.sub(r'tome (\d+)', lambda x: ('00' + x[1])[-3:], s_in)
print(s_out)

Prints:

test 001 something.cbz
test 002 something.cbz
test 020 something.cbz
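
One thing worth knowing about the slicing approach: [-3:] always keeps exactly three characters, so a number with four or more digits would lose its leading digits, whereas zfill leaves longer numbers untouched. A small comparison, using a made-up file name that is not part of the question:

```python
import re

name = "test tome 1234 something.cbz"  # hypothetical four-digit tome

sliced = re.sub(r"tome (\d+)", lambda x: ("00" + x[1])[-3:], name)
filled = re.sub(r"tome (\d+)", lambda x: x[1].zfill(3), name)

print(sliced)  # test 234 something.cbz
print(filled)  # test 1234 something.cbz
```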