Regex to capture 2 sets of 8 non capital letters, sandwiched by dots

Time:10-18

I'm hoping to capture some random weirdness that is dynamically generated and appended to the names of my temporary ASP.NET files. The weirdness is a pattern of two sets of eight characters (apparently never capital letters), each delimited by dots.

I am trying to find the best regex to capture both sets of eight non-capital characters between the dots.

Here is my current regex:

\.([^A-Z]{8})\.

My current regex works OK for the first set, but it doesn't capture the second set. I believe that's because the dot after the first set is consumed by the first match, so there is no leading dot left to start a match on the second set.
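The consumed-dot behaviour can be reproduced with a minimal sketch. Python's `re` module is used here purely for illustration (the question does not say which regex engine is involved), and the test string is just the file-name part of the first row of the data set below.

```python
import re

# File name taken from the first row of the data set (directory part omitted).
name = "App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.3sl-aaqs.dll"

# The regex from the question: dot, eight non-uppercase characters, dot.
pattern = re.compile(r"\.([^A-Z]{8})\.")

# findall returns non-overlapping matches. The first match consumes the dot
# that follows "cdcab7d2", so the scan resumes at "3sl-aaqs" with no leading
# dot available and the second set is never matched.
matches = pattern.findall(name)
print(matches)  # ['cdcab7d2']
```

Since matches cannot overlap, the two sets either have to be captured inside a single match or the shared dot has to be left unconsumed.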

How can I improve this regex so it captures both sets of dynamic weirdness? Would greatly appreciate any help folks can provide!

Data set to match:

String | Expected match
\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\svc_pr30\701d8ff1\10cc0653\App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.3sl-aaqs.dll | cdcab7d2.3sl-aaqs
\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\svc_pr21\a201b637\20c58f14\App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.xqj2w-wv.dll | cdcab7d2.xqj2w-wv
\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\web_releaseapi\638ee986\2f0d9ef4\App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.-qsn3y9x.dll | cdcab7d2.-qsn3y9x
\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\web_releaseapi\638ee986\2f0d9ef4\App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.pyn4enbe.dll | cdcab7d2.pyn4enbe
\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\cmuserservice_windowsauth\a10d69fc\d9424d7d\App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.thhlx9xi.dll | cdcab7d2.thhlx9xi

CodePudding user response:

Instead of focusing on "non-capital" characters, note that the parts you want in your data set consist only of lowercase letters, digits, and dashes, separated by a dot.
They also sit immediately before the .dll extension.
So you can use this regex to extract them:

([a-z0-9-]+?\.[-a-z0-9]+?)\.dll

Then for your result, simply get group 1 of the regex matches. I presume you know about regex grouping.
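As a rough illustration of reading group 1, here is a short Python sketch. It assumes the quantifiers in the pattern above are `+?` (lazy one-or-more); the helper name `suffix_before_dll` is made up for this example and is not part of the answer.

```python
import re

# Assumed form of the pattern: two runs of lowercase letters, digits or
# dashes, separated by a dot, immediately before ".dll".
TAIL = re.compile(r"([a-z0-9-]+?\.[-a-z0-9]+?)\.dll")

def suffix_before_dll(path):
    """Return group 1 of the first match, or None if nothing matches."""
    m = TAIL.search(path)
    return m.group(1) if m else None

# First row of the data set from the question.
path = (r"\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files"
        r"\svc_pr30\701d8ff1\10cc0653"
        r"\App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.3sl-aaqs.dll")
print(suffix_before_dll(path))  # cdcab7d2.3sl-aaqs
```

Note that this pattern does not enforce the length of eight; it relies on the `.dll` anchor and on neither run being able to cross a dot.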


CodePudding user response:

You can use

\.([^A-Z.]{8}\.[^A-Z.]{8})\.

Details:

  • \. - a dot
  • ([^A-Z.]{8}\.[^A-Z.]{8}) - Group 1: eight chars other than a dot and uppercase ASCII letters, a . and then again eight chars other than a dot and uppercase ASCII letters
  • \. - a dot.

Group 1 values for each tested string will be:

cdcab7d2.3sl-aaqs
cdcab7d2.xqj2w-wv
cdcab7d2.-qsn3y9x
cdcab7d2.pyn4enbe
cdcab7d2.thhlx9xi
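
The group 1 values above can be checked with a small Python sketch. The regex is the one from this answer; the choice of Python, the variable names, and the use of only the file-name part of each path are assumptions made for illustration.

```python
import re

# Pattern from the answer: both eight-character sets are captured in one
# group, and the dot is excluded from the character class so that a set
# can never run across a dot.
PAIR = re.compile(r"\.([^A-Z.]{8}\.[^A-Z.]{8})\.")

# File names from the question's data set (directory parts omitted).
names = [
    "App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.3sl-aaqs.dll",
    "App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.xqj2w-wv.dll",
    "App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.-qsn3y9x.dll",
    "App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.pyn4enbe.dll",
    "App_Web_defaultwsdlhelpgenerator.aspx.cdcab7d2.thhlx9xi.dll",
]

results = []
for name in names:
    m = PAIR.search(name)
    # Group 1 holds "set1.set2"; None marks a name with no match.
    results.append(m.group(1) if m else None)

for value in results:
    print(value)
```

If the two sets are needed separately, splitting group 1 on the dot (for example `value.split(".")`) gives them individually.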