I want to find a specific piece of data from a string using Python regular expression and exclude sp

Time:10-06

This is the regular expression I tried: ((http(s?):)./([a-z]).*/)

But from this string I want the directory, like this: /wp-content/uploads/2021/09/ and the image name, like this: VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Free-Download-GetintoPC.com_-300x169.jpg

Please help me out.

The string is: https://getintopc.com/wp-content/uploads/2021/09/VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Free-Download-GetintoPC.com_-300x169.jpg https://getintopc.com/wp-content/uploads/2021/09/VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Direct-Link-Free-Download-GetintoPC.com_-300x169.jpg https://getintopc.com/wp-content/uploads/2021/09/VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Full-Offline-Installer-Free-Download-GetintoPC.com_-300x169.jpg https://getintopc.com/wp-content/uploads/2021/09/VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Latest-Version-Free-Download-GetintoPC.com_-300x169.jpg

CodePudding user response:

You can try this one:

https?:\/\/.*?(?<folder>\/.*?\/.*?\/.*?\/.*?\/)(?<image>.*)

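I have added two named capturing groups, folder and image. In Python's re module, named groups are written (?P<name>...), so the groups become (?P<folder>...) and (?P<image>...).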
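
For Python, a minimal sketch of this named-group approach might look like the following. It is a tightened variant, not the pattern exactly as posted: the groups use Python's (?P<name>...) spelling, and each `.*?` / `.*` is swapped for `[^/\s]+` (an assumption on my part) so that a match cannot run across a slash or into the next space-separated URL. The sample text is shortened to two of the URLs from the question.

```python
import re

# Shortened sample: two of the space-separated URLs from the question.
text = (
    "https://getintopc.com/wp-content/uploads/2021/09/"
    "VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Free-Download-GetintoPC.com_-300x169.jpg "
    "https://getintopc.com/wp-content/uploads/2021/09/"
    "VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Direct-Link-Free-Download-GetintoPC.com_-300x169.jpg"
)

# Python spells named groups (?P<name>...).
# Assumption: [^/\s]+ stands in for ".*?" so one path segment is matched at a time
# and the image name stops at the first whitespace character.
named_pattern = re.compile(
    r"https?://[^/\s]+"
    r"(?P<folder>/[^/\s]+/[^/\s]+/[^/\s]+/[^/\s]+/)"  # exactly four folders
    r"(?P<image>[^/\s]+)"                             # file name
)

# One dict per URL, keyed by group name.
named_matches = [m.groupdict() for m in named_pattern.finditer(text)]

for m in named_matches:
    print(m["folder"], m["image"])
```

Because the folder group is fixed at four path segments, this sketch only fits URLs laid out like the ones in the question.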

CodePudding user response:

You can use two capture groups:

https?:\/\/[^/]*(\/wp-content\/uploads\/\d{4}\/\d{2}\/)([^\/\s]+)
  • https?:\/\/[^/]* Match the protocol, then everything up to the first /
  • ( Capture group 1
    • \/wp-content\/uploads\/\d{4}\/\d{2}\/ Match /wp-content/uploads/ 4 digits / 2 digits /
  • ) Close group 1
  • ([^\/\s]+) Capture group 2, match 1 or more times any char except / or a whitespace char


const s = `https://getintopc.com/wp-content/uploads/2021/09/VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Free-Download-GetintoPC.com_-300x169.jpg https://getintopc.com/wp-content/uploads/2021/09/VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Direct-Link-Free-Download-GetintoPC.com_-300x169.jpg https://getintopc.com/wp-content/uploads/2021/09/VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Full-Offline-Installer-Free-Download-GetintoPC.com_-300x169.jpg https://getintopc.com/wp-content/uploads/2021/09/VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Latest-Version-Free-Download-GetintoPC.com_-300x169.jpg`;
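// Group 1 is the folder, group 2 is the image name; the g flag is required by matchAll.
const regex = /https?:\/\/[^/]*(\/wp-content\/uploads\/\d{4}\/\d{2}\/)([^\/\s]+)/g;
// Collect one [folder, image] pair per matched URL.
const res = Array.from(s.matchAll(regex), m => [m[1], m[2]]);
console.log(res);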
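
Since the question asks for Python, the same two-group pattern can be sketched with the re module. This is an illustrative port, not part of the original answer; the variable names are my own, and the sample text is shortened to two of the URLs from the question.

```python
import re

# Shortened sample: two of the space-separated URLs from the question.
text = (
    "https://getintopc.com/wp-content/uploads/2021/09/"
    "VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Free-Download-GetintoPC.com_-300x169.jpg "
    "https://getintopc.com/wp-content/uploads/2021/09/"
    "VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Latest-Version-Free-Download-GetintoPC.com_-300x169.jpg"
)

# In a Python raw string the forward slashes do not need escaping.
pattern = re.compile(
    r"https?://[^/]*"                      # protocol and host, up to the first /
    r"(/wp-content/uploads/\d{4}/\d{2}/)"  # group 1: folder
    r"([^/\s]+)"                           # group 2: image name
)

# With two groups, findall returns a list of (folder, image) tuples.
pairs = pattern.findall(text)

for folder, image in pairs:
    print(folder, image)
```

re.findall is used here because only the groups are needed; re.finditer would be the choice if match positions mattered as well.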

Or a slightly broader version: first match one or more folders that start with [a-z], then one or more folders made up of digits, and require the last part to end in, for example, .jpg:

https?:\/\/[^/]*((?:\/[a-z][^/]*)+(?:\/\d+)+\/)([^\/]+\.jpg)
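
A rough Python sketch of the broader pattern follows. The first URL is taken from the question; the second, on example.com, is hypothetical and is there only to show that the pattern is not tied to /wp-content/uploads/.

```python
import re

text = (
    # From the question:
    "https://getintopc.com/wp-content/uploads/2021/09/"
    "VideoHive-Happy-Kids-Slideshow-Premiere-Pro-MOGRT-Free-Download-GetintoPC.com_-300x169.jpg "
    # Hypothetical URL with a different folder layout:
    "https://example.com/media/img/2020/01/photo.jpg"
)

broad_pattern = re.compile(
    r"https?://[^/]*"                  # protocol and host, up to the first /
    r"((?:/[a-z][^/]*)+(?:/\d+)+/)"    # group 1: letter-led folders, then digit-only folders
    r"([^/]+\.jpg)"                    # group 2: file name ending in .jpg
)

# findall returns one (folder, image) tuple per matched URL.
broad_pairs = broad_pattern.findall(text)

for folder, image in broad_pairs:
    print(folder, image)
```

Note that [a-z] is case-sensitive, so a folder starting with an uppercase letter would not match unless re.IGNORECASE is passed.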

