Regex JavaScript - capture the src URL only if it's an image?

Time:11-16

I'm trying to extract the src URL/path, without the quotes, only when it points to an image:

  1. src="/path/image.png" // should capture => /path/image.png
  2. src="/path/image.bmp" // should capture => /path/image.bmp
  3. src="/path/image.jpg" // should capture => /path/image.jpg
  4. src="https://www.site1.com" // should NOT capture

So far I have /src="(.*)"/g, but that obviously captures all of them. I have been looking at lookbehind and lookahead, but I just can't put it together.
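
To make the problem concrete, here is a minimal sketch of how the greedy pattern behaves. The sample strings and variable names are made up for illustration and are not part of the original question.

```javascript
// Hypothetical sample input, not from the original question.
const sampleTag = '<img src="/path/image.png" alt="logo">';
const greedyPattern = /src="(.*)"/g;

// .* is greedy, so it runs to the LAST double quote on the line.
const greedyCaptures = Array.from(sampleTag.matchAll(greedyPattern), m => m[1]);
console.log(greedyCaptures); // [ '/path/image.png" alt="logo' ]

// It also accepts any URL, image or not.
const nonImageCaptures = Array.from(
  'src="https://www.site1.com"'.matchAll(greedyPattern),
  m => m[1]
);
console.log(nonImageCaptures); // [ 'https://www.site1.com' ]
```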

CodePudding user response:

You can use a capture group, and you should prevent the match from crossing the closing " by using a negated character class.

If you want to match either href or src:

\b(?:href|src)="([^\s"]*\.(?:png|jpg|bmp))"

Explanation

  • \b A word boundary to prevent a partial word match
  • (?:href|src)=" Match either href=" or src="
  • ( Capture group 1
    • [^\s"]* Match optional chars other than a whitespace char or "
    • \.(?:png|jpg|bmp) Match one of .png .jpg .bmp
  • ) Close group 1
  • " Match the closing " literally


// Capture group 1 holds the path without the surrounding quotes.
const regex = /\b(?:href|src)="([^\s"]*\.(?:png|jpg|bmp))"/;
[
  'src="/path/image.png" test "',
  'src="/path/image.bmp"',
  'src="/path/image.jpg"',
  'src="https://www.site1.com"',
  'href="image.png"'
].forEach(s => {
  const m = s.match(regex);
  // m is null when the value does not end in .png, .jpg or .bmp
  if (m) {
    console.log(m[1]);
  }
});
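
If the attributes all sit in one larger string, the same pattern with the g flag and String.prototype.matchAll can collect every capture in one pass. This is only a sketch: the sampleHtml string and the variable names are assumptions made for illustration.

```javascript
// Same pattern as above, with the g flag that matchAll requires.
const imageSrcPattern = /\b(?:href|src)="([^\s"]*\.(?:png|jpg|bmp))"/g;

// Hypothetical markup, invented for this example.
const sampleHtml = [
  '<img src="/path/image.png">',
  '<img src="/path/image.bmp">',
  '<script src="https://www.site1.com"></script>',
  '<a href="photo.jpg">photo</a>'
].join("\n");

// Keep only capture group 1 (the path without quotes) of each match.
const imagePaths = Array.from(sampleHtml.matchAll(imageSrcPattern), m => m[1]);
console.log(imagePaths); // [ '/path/image.png', '/path/image.bmp', 'photo.jpg' ]
```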

CodePudding user response:

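Try /src="([^"]*\.(?:jpg|bmp|png))"/g

Note that the extensions go in a group, (?:jpg|bmp|png), not in square brackets: [jpg|bmp|png] is a character class that matches a single character. You'll need to list the extensions you consider valid images.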
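
One way to keep that list maintainable is to build the pattern from an array. The sketch below assumes a particular set of extensions and a helper name, extractImageSources, both invented for illustration; adjust the list to whatever you treat as an image.

```javascript
// Assumed list of image extensions; extend or trim as needed.
const imageExtensions = ["png", "jpg", "jpeg", "gif", "bmp", "webp", "svg"];

// Build the alternation from the list. The i flag also accepts .PNG, .Jpg, etc.
const extensionPattern = new RegExp(
  `src="([^"]*\\.(?:${imageExtensions.join("|")}))"`,
  "gi"
);

// Hypothetical helper: returns every src value that ends in a listed extension.
function extractImageSources(text) {
  return Array.from(text.matchAll(extensionPattern), m => m[1]);
}

console.log(
  extractImageSources('src="/a/b.PNG" src="https://www.site1.com" src="c.webp"')
); // [ '/a/b.PNG', 'c.webp' ]
```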

CodePudding user response:

If you want it to be a bit more foolproof, you can use lookbehinds and lookaheads, so that the whole match is the path itself and no capture group is needed. Expand the extension list png|bmp|jpg to test for more extensions.

/(?<=src=")[^"]*\.(?:png|bmp|jpg)(?=")/g
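
A short usage sketch of the lookaround approach follows. It uses a variant of the pattern with a negated character class and an escaped dot, so the match cannot cross a quote; the markup string and variable names are assumptions for illustration. Lookbehind needs a reasonably modern JavaScript engine.

```javascript
// Lookbehind and lookahead keep src=" and the closing " out of the match itself.
const lookaroundPattern = /(?<=\bsrc=")[^"]*\.(?:png|bmp|jpg)(?=")/g;

// Hypothetical markup, invented for this example.
const markup =
  '<img src="/path/image.jpg"><iframe src="https://www.site1.com"></iframe>';

// With the g flag, match() returns the full matches, or null when there are none.
const lookaroundMatches = markup.match(lookaroundPattern) ?? [];
console.log(lookaroundMatches); // [ '/path/image.jpg' ]
```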


CodePudding user response:

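Try this: src="(.*image.*)". Note that it only matches when the value contains the literal text "image", so it will not work for arbitrary file names.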
