I have a regex that matches URL paths where the second part of the path is a number.
(*/)([0-9]+)(/*)
However, I want to modify it so that it only matches paths where the second part is a number more than 3 digits long, e.g.
/abcd/12345/abcd – Matched
/abcd/123/abcd – Not matched
Is there a way to specify length limits in regex?
CodePudding user response:
A URL consists of URL-encoded ASCII characters which are not control characters (see the W3Schools URI Encoding reference).
The regex to match any printable ASCII character (no control characters!) is the following (see this SO question):
[ -~]
Therefore, assuming you want to match the whole URL, you can use the following regex:
^[ -~]*\/\d{4,}\/?[ -~]*$
^ : Matches the beginning of the string
[ -~] : Any printable ASCII character
* : Matches zero or more of the preceding token
\/ : Slash, escaped because it would otherwise end a JavaScript regex literal
\d : Regex class for digits, matches any digit 0-9
{4,} : Matches 4 or more of the preceding token (at least four digits)
? : Matches 0 or 1 of the preceding token (there could be a slash at the end or not, both are matched)
$ : Matches the end of the string
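To answer the question about length limits directly: the `{4,}` quantifier is one of several bounded forms. The following is a minimal sketch of the three common variants; the variable names and sample strings are made up for illustration and are not part of the original question.

```javascript
// Quantifier forms that bound how often the preceding token may repeat.
const exactlyFour = /^\d{4}$/;   // {n}   : exactly 4 digits
const fourOrMore = /^\d{4,}$/;   // {n,}  : 4 or more digits
const fourToSix = /^\d{4,6}$/;   // {n,m} : between 4 and 6 digits

// Illustrative inputs (assumed, not taken from the question).
const samples = ["123", "1234", "12345", "1234567"];

const quantifierResults = samples.map((s) => ({
  input: s,
  exactlyFour: exactlyFour.test(s),
  fourOrMore: fourOrMore.test(s),
  fourToSix: fourToSix.test(s),
}));

console.table(quantifierResults);
```

For "more than 3 digits" with no upper bound, the open-ended `{4,}` form is the one that fits.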
const urls = [
  "/abcd/12345/abcd", // Matched
  "/12345/abcd",      // Matched
  "/abcd/123/abcd",   // Not matched - too few digits
  "12345/abcd",       // Not matched - must NOT start with a number (can be adjusted if required)
  "/abcd/12345",      // Matched - may end with a number (can be adjusted if required)
  "/abäd/1234"        // Not matched - 'ä' is a non-ASCII character, so it is outside [ -~]
];
// Logs whether the given URL path matches the regex explained above.
const isValidUrl = (url) => {
  const match = url.match(/^[ -~]*\/\d{4,}\/?[ -~]*$/);
  if (match === null) console.log(`URL ${url} does NOT match.`);
  else console.log(`Match found: ${match[0]}`);
};
urls.forEach((url) => isValidUrl(url));
It's not 100% clear what exactly you want to match so you might need to adjust the regex to your needs. I suggest you use this RegEx as a starting point and use RegExr to refine it if required.
CodePudding user response:
You may use this pattern:
^/[^/]+/\d{4,}
The \d{4,} ending portion of the regex matches only on 4 digits or more. Here is a demo.
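As a rough check, the pattern can be run against the paths from the question. The sketch below also includes a stricter variant, `digitsOnly`, which is an assumption on my part and not part of the answer: it additionally requires the digits to be followed by a slash or the end of the string, so a segment such as `12345xyz` is rejected. All variable names and the extra sample paths are illustrative.

```javascript
// Pattern from the answer: one non-empty first segment, then a segment
// that starts with 4 or more digits.
const startsWithDigits = /^\/[^\/]+\/\d{4,}/;

// Hypothetical stricter variant (assumption): the second segment must
// consist of digits only, i.e. be followed by "/" or the end of the string.
const digitsOnly = /^\/[^\/]+\/\d{4,}(?:\/|$)/;

const paths = [
  "/abcd/12345/abcd",    // example from the question
  "/abcd/123/abcd",      // example from the question
  "/abcd/12345",         // illustrative: no trailing segment
  "/abcd/12345xyz/abcd", // illustrative: digits followed by letters
];

const pathResults = paths.map((p) => ({
  path: p,
  startsWithDigits: startsWithDigits.test(p),
  digitsOnly: digitsOnly.test(p),
}));

console.table(pathResults);
```

Which variant is appropriate depends on whether a second segment that merely begins with digits should count as a number.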