Regex to get last part of url without appended version and parameters

Time:07-26

Hi guys, I have a very specific request: I would like to get the last part of a URL without the parameters. If the name of the script has a version appended, like -V2 (where the 2 could be any number), the regex should leave that suffix out of the match.

So far I found this: (?!\/)(\w+)(?=.js), but it only gets a single word.

Some examples:

https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-V2.js?x=123&name=bo-b
https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js?x=123&name=bo-b
https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js
https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-v2.js

All should match sampleScript

CodePudding user response:

/\/((.(?!\/))+?)(-v\d|)\.js/i

  • \/ matches the character /

  • 1st Capturing Group ((.(?!\/))+?)

    • 2nd Capturing Group (.(?!\/))+?

      . matches any character (except line terminators)

      +? matches the previous token between one and unlimited times, as few times as possible, expanding as needed (lazy)

      Negative Lookahead (?!\/): asserts that the character to the right is not /

  • 3rd Capturing Group (-v\d|)

    • 1st Alternative

      -v matches the characters -v; \d matches a digit (equivalent to [0-9])

    • 2nd Alternative

      empty, matches at any position (this makes the version suffix optional)

  • \. matches the character .

  • js matches the characters js

  • Pattern flag i: case-insensitive match (ignores case of [a-zA-Z])

const urls = [
  'https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-V2.js?x=123&name=bo-b',
  'https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js?x=123&name=bo-b',
  'https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js', 
  'https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-v2.js'
];

// Capture group 1 holds the file name without the optional -v<digit> suffix.
const regexp = /\/((.(?!\/))+?)(-v\d|)\.js/i;

urls.forEach(url => console.log(regexp.exec(url)[1]));
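
The one-liner above throws a TypeError when a URL does not match, because exec then returns null. A minimal sketch of a guarded variant follows. The helper name getScriptName is hypothetical, and \d+ is used in place of \d on the assumption that multi-digit versions such as -V10 should be stripped as well.

```javascript
// Hypothetical helper (not part of the answer above): returns the script
// name without a "-v<digits>" suffix, or null when the pattern does not match.
function getScriptName(url) {
  // Same shape as the regex above; \d+ is an assumption so that
  // multi-digit versions (e.g. -V10) are also left out of group 1.
  const match = /\/((.(?!\/))+?)(-v\d+)?\.js/i.exec(url);
  return match ? match[1] : null;
}

console.log(getScriptName('https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-V2.js?x=123&name=bo-b'));
console.log(getScriptName('https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js'));
```

Note that the pattern is not anchored to the end of the path, so a ".js" earlier in the URL (for instance inside a host name) would be matched first.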

CodePudding user response:

You might use:

.*\/((?:(?!-[Vv]\d+\b)[^\s\/])*)\.js\b

Explanation

  • .*\/ Match till the last occurrence of /
  • ( capture group 1
    • (?: Non capture group
      • (?!-[Vv]\d+\b) Assert that -v or -V followed by one or more digits and a word boundary is not directly to the right
      • [^\s\/] Match any non whitespace char except /
    • )* Close non capture group and optionally repeat
  • ) Close group 1
  • \.js\b Match .js followed by a word boundary

const regex = /.*\/((?:(?!-[Vv]\d+\b)[^\s\/])*)\.js\b/;

[
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-V2.js?x=123&name=bo-b",
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js?x=123&name=bo-b",
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js",
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-v2.js",
].forEach(s => {
  const m = s.match(regex);
  // m is null for the -V2 and -v2 URLs: the capture stops before the
  // version suffix, and \.js cannot match directly after that point.
  if (m) {
    console.log(m[1]);
  }
});

Another option with a lookahead only:

.*\/((?!\w*-[vV]\d+\b)[^\s\/]*)\.js\b
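
As a quick, hedged check of what this variant returns on the sample URLs: the negative lookahead sits at the start of the file name, so a name carrying a -v<digits> suffix is rejected as a whole rather than trimmed. The sketch below assumes the quantifier after \d is +, and results is only an illustrative variable name.

```javascript
// Lookahead-only pattern, assuming "+" is the quantifier after \d.
const lookaheadOnly = /.*\/((?!\w*-[vV]\d+\b)[^\s\/]*)\.js\b/;

const sampleUrls = [
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-V2.js?x=123&name=bo-b",
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js?x=123&name=bo-b",
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js",
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-v2.js",
];

// Map each URL to capture group 1, or null when the pattern does not match.
const results = sampleUrls.map(url => {
  const m = url.match(lookaheadOnly);
  return m ? m[1] : null;
});

console.log(results); // null for the two versioned URLs, "sampleScript" for the others
```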

CodePudding user response:

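To match just "sampleScript" in all examples, use a lazy quantifier and a lookahead that matches ".js" optionally preceded by "-v" or "-V" and one or more digits: (?<=\/)[^/]+?(?=(?:-[vV]\d+)?\.js). The quantifier needs to be lazy (+?): a greedy [^/]+ backtracks only as far as the position right before ".js", so the version suffix would stay in the match.

To match the entire file name and extension, turn the lookahead into a regular non-capturing group: (?<=\/)[^/]+?(?:-[vV]\d+)?\.js.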

(If the regex can have the i flag, you can use v instead of [vV].)
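
A minimal sketch of both patterns run against the question's URLs, assuming a JavaScript engine with lookbehind support (for example a recent Node.js). The names nameOnly, fullFile, names and files are illustrative, and the i flag is used so that -v covers both cases.

```javascript
const testUrls = [
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-V2.js?x=123&name=bo-b",
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js?x=123&name=bo-b",
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript.js",
  "https://s3.amazon-aws.com/bob.success.com/scripts/sampleScript-v2.js",
];

// Name only: lazy [^/]+? stops as soon as the lookahead sees an optional
// -v<digits> suffix followed by ".js".
const nameOnly = /(?<=\/)[^/]+?(?=(?:-v\d+)?\.js)/i;

// Full file name: the same pattern with the lookahead turned into a
// consuming non-capturing group.
const fullFile = /(?<=\/)[^/]+?(?:-v\d+)?\.js/i;

// Take the whole match (index 0), or null when there is no match.
const names = testUrls.map(url => url.match(nameOnly)?.[0] ?? null);
const files = testUrls.map(url => url.match(fullFile)?.[0] ?? null);

console.log(names); // every entry is "sampleScript"
console.log(files);
```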
