Regex to capture a part of a string that switches between letters and digits at least twice

Time:10-04

I've been stumbling around for a couple of days on a regex problem that I'm hoping someone can help resolve. My goal is to create a regex that captures a line where some string switches between letters and digits at least twice (ignoring the filename/extension), so that I can find my "weird dynamically generated files". My current regex captures letter/digit changes, but it also matches when there is only a single change. I would like to match only when there are multiple letter/digit changes, since single changes tend to be purposeful (e.g. the End2EndTest file). If someone has pointers on how to change this regex so that it matches only when the pattern repeats, let's say, three times within a string, I would appreciate it!

Current regex:

```
(:[A-Za-z][A-Za-z\d-_]*\d[A-Za-z\d-_].*?\\|[\d][A-Za-z\d-_]*[A-Za-z][A-Za-z\d-_].*?\\)
```

 

Data set:

\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\123xyz123xyz\42abc43abc\App_global.asax.a1b23cd.dll
\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\123xyz123xyz\ab12cd45\App_global.asax.a2cd123.dll
\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\1b6123f0\ab12cd34\App_global.asax.kkp9w6zm.dll
\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\42abc43abc\539445c9\App_global.asax.-1bnvx3f.dll
\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\ab12cd34\eb88e642\App_global.asax.fswscrcw.dll
\scope\ScopeWorkingDir\script_7D16668D9F697A13\__ScopeCodeGenEngine__.dll
\scope\ScopeWorkingDir\script_7D16668D9F697A13\__ScopeCodeGen__.dll
\scope\ScopeWorkingDir\script_7D16668D9F697A13\__ScopeCodeGenEngine__.dll
\scope\ScopeWorkingDir\script_7D16668D9F697A13\__ScopeCodeGen__.dll
\\bt\\RANDOM\\repo\\out\\retail-amd64\\End2EndTest\\End2EndTest.exe
\\bt\\RANDOM\\repo\\out\\retail-amd64\\HighFive3\\DiskVfy512.exe

Answer:

You can use an alternation | to match either requirement: a run that starts with a digit, or one that starts with a letter A-Za-z.

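If you want to allow more characters in between, you can extend the character class with the allowed characters, for example [A-Za-z\d_-]

Note that the - has to go at the end of the character class or be escaped as \-. In [A-Za-z\d-_] it sits between \d and _, where some regex flavors treat it as a literal hyphen and others reject it as an invalid range.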
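The hyphen placement can be checked quickly. The sketch below uses Python's `re` module purely as an illustration; the question does not say which regex engine is in use, so other flavors may behave differently. The helper name `compiles` is made up for this example.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

hyphen_last = r"[A-Za-z\d_-]"      # hyphen at the end: literal hyphen
hyphen_escaped = r"[A-Za-z\d\-_]"  # escaped hyphen: literal hyphen
hyphen_middle = r"[A-Za-z\d-_]"    # hyphen between \d and _: not a valid range

print(compiles(hyphen_last))     # True
print(compiles(hyphen_escaped))  # True
print(compiles(hyphen_middle))   # False in Python ("bad character range")

# With the hyphen at the end, names such as retail-amd64 are fully covered.
print(bool(re.fullmatch(hyphen_last + "+", "retail-amd64")))  # True
```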

\d[A-Za-z]+\d+[A-Za-z]|[A-Za-z]\d+[A-Za-z]+\d
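To see what the alternation accepts, here is a small sketch in Python (chosen only for illustration) that runs it against a few names taken from the data set. It assumes the pattern is written with `+` quantifiers, i.e. digit, letters, digits, letter, or letter, digits, letters, digit.

```python
import re

# Alternation from the answer, assuming `+` quantifiers between the classes.
SWITCHES = re.compile(r"\d[A-Za-z]+\d+[A-Za-z]|[A-Za-z]\d+[A-Za-z]+\d")

# Path components copied from the data set in the question.
samples = ["123xyz123xyz", "ab12cd34", "End2EndTest", "HighFive3", "DiskVfy512"]

results = {}
for name in samples:
    m = SWITCHES.search(name)
    results[name] = m.group(0) if m else None
    print(f"{name}: {results[name] or 'no match'}")

# 123xyz123xyz: 3xyz123x
# ab12cd34: b12cd3
# End2EndTest: no match
# HighFive3: no match
# DiskVfy512: no match
```

Names with a single letter/digit change, such as End2EndTest, are skipped because each branch needs three changes in a row.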


If you want to match the whole line:

^.*(?:\d[A-Za-z]+\d+[A-Za-z]|[A-Za-z]\d+[A-Za-z]+\d).*
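The whole-line pattern also matches when only the file name alternates, while the question asks to ignore the filename/extension. One possible way to handle that, sketched in Python under the assumption that the pattern uses `+` quantifiers, is to split the path and test only the directory components. The function names and the last sample path are hypothetical, not part of the original answer.

```python
import re

SWITCHES = r"\d[A-Za-z]+\d+[A-Za-z]|[A-Za-z]\d+[A-Za-z]+\d"
WHOLE_LINE = re.compile(rf"^.*(?:{SWITCHES}).*")
COMPONENT = re.compile(SWITCHES)

def matches_whole_line(path: str) -> bool:
    """True if the alternation occurs anywhere in the line."""
    return WHOLE_LINE.match(path) is not None

def matches_directories_only(path: str) -> bool:
    """True if a directory component matches; the file name is skipped."""
    # Split on runs of backslashes so both \a\b and \\a\\b are handled.
    parts = [p for p in re.split(r"\\+", path) if p]
    return any(COMPONENT.search(d) for d in parts[:-1])

generated = r"\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\1b6123f0\ab12cd34\App_global.asax.kkp9w6zm.dll"
purposeful = r"\\bt\\RANDOM\\repo\\out\\retail-amd64\\End2EndTest\\End2EndTest.exe"
# Hypothetical path: only the file name alternates between letters and digits.
name_only = r"\root\plain\App_global.asax.a1b23cd.dll"

for path in (generated, purposeful, name_only):
    print(matches_whole_line(path), matches_directories_only(path))

# True True
# False False
# True False
```

Splitting the path keeps the regex itself unchanged; a lookahead that requires a later backslash would be another option, at the cost of a harder-to-read pattern.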
