Finding a regex to match all the cases in background image url

Time:03-07

I am trying to find a regex that handles three cases I have in my HTML.

For instance,

  1. background-image: url(https://www.aaa.com/xxx/picture.png); background-size: 100%...
  2. background-image: url("https://www.aaa.com/xxx/picture(1_2)_pic(23).png")
  3. background-image: url(https://www.aaa.com/xxx/picture)

In these three cases, I want to extract the three URLs:

  1. https://www.aaa.com/xxx/picture.png
  2. https://www.aaa.com/xxx/picture(1_2)_pic(23).png
  3. https://www.aaa.com/xxx/picture

I created a regex that extracts only the first two. Unfortunately, I cannot work out how to make that same regex also match the third case (a URL path without an extension).

What I came up with so far:

/background(?:-image|):[\s]*url[(][\"']?(https?:\/\/[^'\"]+[.](?:gif|png))['\"]?/g

Appreciate all the help.

CodePudding user response:

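This regex pattern treats each pair of nested parentheses inside the URL as an optional, repeatable group between the parentheses of url(), so a closing parenthesis that belongs to the URL does not end the match early.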

background-image:\s*url\(['"]?((?:\S*?\(\S*?\))*\S*?)['"]?\)

JavaScript snippet:

const text = `background-image: url(https://www.aaa.com/xxx/picture.png); background-size: 100%...
background-image: url("https://www.aaa.com/xxx/picture(1_2)_pic(23).png")
background-image: url(https://www.aaa.com/xxx/picture)
`;
let re = /background-image:\s*url\(['"]?((?:\S*?\(\S*?\))*\S*?)['"]?\)/g;
let arr = [...text.matchAll(re)].map(x=>x[1]);
console.log(arr);
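
To see why the repeated group matters, here is a minimal sketch comparing the pattern above with a simpler one on the quoted case from the question. The simpler pattern and the variable names are assumptions made for illustration; they do not come from the thread.

```javascript
const quoted = 'background-image: url("https://www.aaa.com/xxx/picture(1_2)_pic(23).png")';

// Simplified pattern for comparison (an assumption, not from the thread):
// the capture stops at the first quote or closing parenthesis.
const naive = /background-image:\s*url\(['"]?([^'")]*)/;

// Pattern from the answer above: each "text(text)" pair is consumed as a unit,
// so a ")" that belongs to the URL does not terminate the capture.
const nested = /background-image:\s*url\(['"]?((?:\S*?\(\S*?\))*\S*?)['"]?\)/;

const naiveUrl = quoted.match(naive)[1];
const nestedUrl = quoted.match(nested)[1];

console.log(naiveUrl);  // https://www.aaa.com/xxx/picture(1_2
console.log(nestedUrl); // https://www.aaa.com/xxx/picture(1_2)_pic(23).png
```

The simpler pattern cuts the URL at the first closing parenthesis, while the nested-group pattern returns the whole URL.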

CodePudding user response:

You could use

background-image:\s*url\((['"]?)(\S*?)\1\)
  • background-image:\s* Match background-image: literally, followed by optional whitespace chars
  • url\( Match url(
  • (['"]?) Capture group 1, optionally match either ' or "
  • (\S*?) Capture group 2, match optional non-whitespace characters, as few as possible
  • \1 Backreference to match the same that is in capture group 1
  • \) Match )

The url is in capture group 2. An example using JavaScript:

const regex = /background-image:\s*url\((['"]?)(\S*?)\1\)/g;
const str = `background-image: url(https://www.aaa.com/xxx/picture.png); background-size: 100%...
background-image: url("https://www.aaa.com/xxx/picture(1_2)_pic(23).png")
background-image: url(https://www.aaa.com/xxx/picture)`;
console.log(Array.from(str.matchAll(regex), m => m[2]));
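
If the extraction is needed in more than one place, the same pattern can be wrapped in a small function. This is only a sketch: the name extractBackgroundUrls is hypothetical, and the `(?:-image)?` part is an assumption borrowed from the question's own attempt, which also accepted the `background:` shorthand.

```javascript
// Hypothetical helper (the name is not from the thread): returns every URL
// found in a background-image or background declaration.
function extractBackgroundUrls(css) {
  // (?:-image)? also accepts "background:", as the question's attempt did.
  // Group 1 is the optional opening quote; \1 requires the same text before ")".
  const regex = /background(?:-image)?:\s*url\((['"]?)(\S*?)\1\)/g;
  return Array.from(css.matchAll(regex), m => m[2]);
}

const sample = `background-image: url(https://www.aaa.com/xxx/picture.png); background-size: 100%...
background: url('https://www.aaa.com/xxx/picture(1_2)_pic(23).png')
background-image: url(https://www.aaa.com/xxx/picture)`;

console.log(extractBackgroundUrls(sample));
```

One limitation to keep in mind: when a URL contains parentheses and is not quoted, the lazy `\S*?` stops at the first closing parenthesis, so the nested-group pattern from the other answer is the safer choice for that input.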
