I am trying to find a regex that can handle three cases I have in HTML.
For instance,
- background-image: url(https://www.aaa.com/xxx/picture.png); background-size: 100%...
- background-image: url("https://www.aaa.com/xxx/picture(1_2)_pic(23).png")
- background-image: url(https://www.aaa.com/xxx/picture)
In these three cases, I want to extract the three URLs:
- https://www.aaa.com/xxx/picture.png
- https://www.aaa.com/xxx/picture(1_2)_pic(23).png
- https://www.aaa.com/xxx/picture
I created a regex that extracts only the first two. Unfortunately, I cannot work out how to make that same regex also match the third case (a URL path without an extension).
What I came up with so far:
/background(?:-image|):[\s]*url[(][\"']?(https?:\/\/[^'\"]+[.](?:gif|png))['\"]?/g
Appreciate all the help.
CodePudding user response:
This regex pattern treats the nested parentheses as an optional repeated group between the parentheses of url().
background-image:\s*url\(['"]?((?:\S*?\(\S*?\))*\S*?)['"]?\)
JavaScript snippet:
const text = `background-image: url(https://www.aaa.com/xxx/picture.png); background-size: 100%...
background-image: url("https://www.aaa.com/xxx/picture(1_2)_pic(23).png")
background-image: url(https://www.aaa.com/xxx/picture)
`;
let re = /background-image:\s*url\(['"]?((?:\S*?\(\S*?\))*\S*?)['"]?\)/g;
let arr = [...text.matchAll(re)].map(x=>x[1]);
console.log(arr);
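As a rough sketch, the same pattern could be wrapped in a small reusable function. The name extractBackgroundUrls is hypothetical and chosen only for illustration; it is not part of any library.

```javascript
// Hypothetical helper (name is an assumption): collects the URL from every
// background-image declaration found in the given text.
function extractBackgroundUrls(css) {
  // Same pattern as above: optional quote, then zero or more "text(text)" runs
  // so that parentheses inside the URL are consumed in pairs.
  const re = /background-image:\s*url\(['"]?((?:\S*?\(\S*?\))*\S*?)['"]?\)/g;
  return Array.from(css.matchAll(re), (m) => m[1]);
}

console.log(extractBackgroundUrls("background-image: url(https://www.aaa.com/xxx/picture)"));
```

Because the pattern relies on \S, it assumes the URL contains no whitespace.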
CodePudding user response:
You could use
background-image:\s*url\((['"]?)(\S*?)\1\)
- background-image:\s*
  Match background-image: literally, followed by optional whitespace chars
- url\(
  Match url(
- (['"]?)
  Capture group 1, optionally match either ' or "
- (\S*?)
  Capture group 2, match optional non-whitespace characters, as few as possible
- \1
  Backreference to match the same text as captured in group 1
- \)
  Match )
The url is in capture group 2. An example using JavaScript:
const regex = /background-image:\s*url\((['"]?)(\S*?)\1\)/g;
const str = `background-image: url(https://www.aaa.com/xxx/picture.png); background-size: 100%...
background-image: url("https://www.aaa.com/xxx/picture(1_2)_pic(23).png")
background-image: url(https://www.aaa.com/xxx/picture)`;
console.log(Array.from(str.matchAll(regex), m => m[2]));
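One difference between this pattern and the nested-parentheses pattern from the other answer only shows up when an unquoted URL itself contains parentheses. The sample string below is a made-up edge case, not one of the three cases from the question, and as far as I know such a value is not valid CSS unless the parentheses are escaped, so it may never come up in practice.

```javascript
// Hypothetical edge case: an unquoted URL that itself contains parentheses.
const sample = "background-image: url(https://www.aaa.com/xxx/picture(1).png)";

// Backreference pattern: the lazy \S*? stops at the first ")" when there is no quote.
const backrefPattern = /background-image:\s*url\((['"]?)(\S*?)\1\)/;
// Nested-parentheses pattern: consumes "(...)" pairs before looking for the closing ")".
const nestedPattern = /background-image:\s*url\(['"]?((?:\S*?\(\S*?\))*\S*?)['"]?\)/;

const backrefUrl = sample.match(backrefPattern)[2];
const nestedUrl = sample.match(nestedPattern)[1];

console.log(backrefUrl); // https://www.aaa.com/xxx/picture(1
console.log(nestedUrl);  // https://www.aaa.com/xxx/picture(1).png
```

For quoted URLs, both patterns return the full URL, since the closing quote marks where the URL ends.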