Capturing after the nth occurrence of a string using regex

Time:04-13

My test string:

/custom-heads/food-drinks/51374-easter-bunny-cake

I am trying to capture the number in the string. The constants in that string are that the number is always preceded by three /'s and followed by a -.

I am a regex noob and am struggling with this. I cobbled together (\/)(.*?)(-) and then figured I could get the last one programmatically, but I would really like to understand regex better and would love if someone could show me the regex to get the last occurrence of numbers between / and -.

CodePudding user response:

Don't use regexes if you can avoid them. I recommend reading this blog post: https://blog.codinghorror.com/regular-expressions-now-you-have-two-problems/

To answer your question: it's easier, faster, and more bulletproof to get the number using splits:

const articleName = "/custom-heads/food-drinks/51374-easter-bunny-cake".split("/")[3]
// '51374-easter-bunny-cake'

const articleId = articleName.split("-")[0]

// '51374'

hope it helps
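The two splits above assume the path always has at least three segments and that the last one starts with the number. A minimal sketch of a more defensive version, assuming you want null instead of an exception or a wrong value when the input does not have that shape (the name extractArticleId is hypothetical, not from the question):

```javascript
// Hypothetical helper: returns the leading numeric ID of the third path segment,
// or null when the path does not have the expected shape.
function extractArticleId(path) {
  const segments = path.split("/");
  // A leading "/" produces an empty first element, so the third segment is at index 3.
  const articleName = segments[3];
  if (articleName === undefined) return null;

  const id = articleName.split("-")[0];
  // Accept the result only if it is non-empty and consists of ASCII digits.
  const isDigits = id !== "" && [...id].every((c) => c >= "0" && c <= "9");
  return isDigits ? id : null;
}

console.log(extractArticleId("/custom-heads/food-drinks/51374-easter-bunny-cake")); // '51374'
console.log(extractArticleId("/custom-heads")); // null
```

Returning null keeps the failure case explicit, so the caller decides what to do with a malformed path.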

CodePudding user response:

You may use this regex with a capture group:

^(?:[^\/]*\/){3}([^-]+)

Or in modern browsers you can use lookbehind assertion:

/(?<=^(?:[^\/]*\/){3})[^-]+/

RegEx Demo 1

RegEx Demo 2

RegEx Code:

  • ^: Start
  • (?:[^\/]*\/){3}: Match 0 or more non-/ characters followed by a /. Repeat this group 3 times
  • ([^-]+): Match 1 or more non-hyphen characters and capture them in group 1
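As a quick check of the breakdown above, here is a small sketch that runs both forms side by side. With the capture group the number is in index 1 of the match; with the lookbehind it is the whole match at index 0. The variable names are illustrative only, and the optional chaining assumes a reasonably recent JavaScript engine.

```javascript
const sample = "/custom-heads/food-drinks/51374-easter-bunny-cake";

// Capture-group form: the prefix is consumed, the number lands in group 1.
const captureRe = /^(?:[^\/]*\/){3}([^-]+)/;
// Lookbehind form: the prefix is only asserted, so the number is the full match.
const lookbehindRe = /(?<=^(?:[^\/]*\/){3})[^-]+/;

// Optional chaining avoids a TypeError when there is no match (match() returns null).
const viaGroup = sample.match(captureRe)?.[1];
const viaLookbehind = sample.match(lookbehindRe)?.[0];

console.log(viaGroup, viaLookbehind); // 51374 51374
```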

Code:

const s = `/custom-heads/food-drinks/51374-easter-bunny-cake`;

const re = /^(?:[^\/]*\/){3}([^-]+)/;

console.log (s.match(re)[1]);

CodePudding user response:

Use

const str = `/custom-heads/food-drinks/51374-easter-bunny-cake`
const p = /(?:\/[^\/]*){2}\/(\d+)-/
console.log(str.match(p)?.[1])

See regex proof.

EXPLANATION

Non-capturing group (?:\/[^\/]*){2}
  {2} matches the previous token exactly 2 times
  \/ matches the character / with index 47 in decimal (2F in hex, 57 in octal) literally (case sensitive)
  Match a single character not present in the list below [^\/]
    * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    \/ matches the character / with index 47 in decimal (2F in hex, 57 in octal) literally (case sensitive)
\/ matches the character / with index 47 in decimal (2F in hex, 57 in octal) literally (case sensitive)
1st Capturing Group (\d+)
  \d matches a digit (equivalent to [0-9])
  + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
- matches the character - with index 45 in decimal (2D in hex, 55 in octal) literally (case sensitive)
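
Since the question title asks about the nth occurrence in general, the same idea can be parameterised. This is only a sketch under stated assumptions: numberAfterNthSlash is a hypothetical name, the pattern is anchored at the start of the string (unlike the unanchored pattern above), and it only accepts digits that are immediately followed by a -.

```javascript
// Hypothetical helper: returns the digits that follow the nth "/" (counted from the
// start of the string) and precede a "-", or null if the string has no such part.
function numberAfterNthSlash(str, n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  // Built with the RegExp constructor so the repeat count can vary.
  // "/" needs no escaping in a pattern string; "\\d" is a regex \d.
  const pattern = new RegExp(`^(?:[^/]*/){${n}}(\\d+)-`);
  const match = str.match(pattern);
  return match ? match[1] : null;
}

console.log(numberAfterNthSlash("/custom-heads/food-drinks/51374-easter-bunny-cake", 3)); // '51374'
console.log(numberAfterNthSlash("/custom-heads/food-drinks/51374-easter-bunny-cake", 2)); // null
```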