I'm stumped trying to figure out a regular expression. Given a file path, I need to match the last numerical component of the path (the "frame" number in an image sequence), while ignoring any numerical component in the file extension.
For example, given the path:
/path/to/file/abc123/GCAM5423.xmp
The following expression will correctly match 5423:
((?P<index>(?P<padding>0*)\d+)(?!.*(0*)\d+))
However, this expression fails if for example the file extension contains a number as follows:
/path/to/file/abc123/GCAM5423.cr2
In this case the expression will match the 2 in the file extension, when I still need it to match 5423. How can I modify the above expression to ignore file extensions that have a numerical component?
I'm using the Python flavor of regex. Thanks in advance!
CodePudding user response:
You can try this one:
\/[a-zA-Z]*(\d*)\.[a-zA-Z0-9]{3,4}$
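Here is a minimal sketch for checking this pattern against both paths from the question. The variable names are only for illustration. Note that the pattern assumes the file name is letters followed by digits, and that the extension is 3 or 4 alphanumeric characters, so other naming schemes may not match:

```python
import re

# Pattern from this answer: a slash, optional letters, the captured digits,
# then a dot and a 3-4 character alphanumeric extension at the end.
pattern = re.compile(r"\/[a-zA-Z]*(\d*)\.[a-zA-Z0-9]{3,4}$")

paths = [
    "/path/to/file/abc123/GCAM5423.xmp",
    "/path/to/file/abc123/GCAM5423.cr2",
]

# Collect the captured frame number for every path that matches.
frames = [m.group(1) for p in paths if (m := pattern.search(p))]
print(frames)  # ['5423', '5423']
```

Because the group is `(\d*)`, a file name with no digits at all would still match and give an empty string, so you may want `(\d+)` if that matters.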
CodePudding user response:
Step 1: Find the substring before the last dot.
(.*)\.
Input: /path/to/file/abc123/GCAM5423.cr2
Output: /path/to/file/abc123/GCAM5423
Step 2: Find the last number using your regex.
Input: /path/to/file/abc123/GCAM5423
Output: 5423
I don't know how to join these two regexes, but I hope this is still useful to you ^_^
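The two steps can be sketched in Python roughly like this. The function name is made up for illustration, and step 2 uses `re.findall` and takes the last hit instead of the lookahead regex from the question, which is a simplification rather than the original expression:

```python
import re

def last_number_before_extension(path):
    """Return the last run of digits before the final dot, or None."""
    # Step 1: the greedy (.*) backtracks to the last dot, dropping the extension.
    stem_match = re.match(r"(.*)\.", path)
    stem = stem_match.group(1) if stem_match else path
    # Step 2: take the last run of digits in what is left.
    numbers = re.findall(r"\d+", stem)
    return numbers[-1] if numbers else None

print(last_number_before_extension("/path/to/file/abc123/GCAM5423.cr2"))  # 5423
```

One caveat: this assumes the last dot in the path belongs to the file extension. A path with a dot in a directory name and no extension on the file would be cut at the wrong place.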
CodePudding user response:
Try this pattern:
\/[^/\d\s]+(\d+)\.[^/]+$
Code:
import re
pattern = r"\/[^/\d\s]+(\d+)\.[^/]+$"
texts = ['/path/to/file/abc123/GCAM5423.xmp', '/path/to/file/abc123/GCAM5423.cr2']
print([match.group(1) for x in texts if (match := re.search(pattern, x))])
Output:
['5423', '5423']
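If you want to keep the `index` and `padding` named groups from the question, one possible variation (a sketch, not tested beyond the sample paths) is to replace the negative lookahead with a positive one that requires only non-digits between the number and the final extension. The lookahead here is my own assumption about what "last numerical component" should mean:

```python
import re

# Named groups as in the question; the lookahead requires that only non-digits
# sit between the number and the final ".ext" (the extension may contain digits).
pattern = re.compile(r"(?P<index>(?P<padding>0*)\d+)(?=\D*\.[^./]*$)")

for path in ["/path/to/file/abc123/GCAM5423.cr2",
             "/path/to/file/abc123/GCAM0042.cr2"]:
    match = pattern.search(path)
    if match:
        print(match.group("index"), repr(match.group("padding")))
```

For the two sample paths this prints `5423 ''` and `0042 '00'`.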