Path validation - Trying to modify my RegEx so that it only matches paths that include a filename wi

Time:11-13

This is what I'm working with: https://regex101.com/r/BertHu/3/

^(?:(?:[a-z]:|\\\\[a-z0-9_.$●-]+\\[a-z0-9_.$●-]+)\\|\\?[^\\\/:*?"<>|\r\n]+\\?)*(?:[^\\\/:*?"<>|\r\n]+\\)*[^\\\/:*?"<>|\r\n]*$

The regular expression I'm using is based on this implementation from O'Reilly.

Here's a breakdown (I had to fix some unescaped characters in O'Reilly's expression):

(?:(?:[a-z]:|\\\\[a-z0-9_.$\●-]+\\[a-z0-9_.$\●-]+)\\|  # Drive
\\?[^\\\/:*?"<>|\r\n]+\\?)                             # Relative path
(?:[^\\\/:*?"<>|\r\n]+\\)*                             # Folder
[^\\\/:*?"<>|\r\n]*                                    # File
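A commented breakdown like this can be kept runnable with the .NET `(?x)` inline option, which ignores pattern whitespace and allows `#` comments. The following is only a sketch: it assumes the full expression shown earlier (including the `*` after the first group) with `+` quantifiers after each character class, and the variable name `$pattern` is illustrative.

```powershell
# Single-quoted here-string: no escaping or variable expansion is applied,
# so the regex reaches the engine exactly as written.
$pattern = @'
(?x)                                                   # ignore whitespace, allow comments
^
(?:(?:[a-z]:|\\\\[a-z0-9_.$●-]+\\[a-z0-9_.$●-]+)\\|   # Drive
\\?[^\\\/:*?"<>|\r\n]+\\?)*                            # Relative path
(?:[^\\\/:*?"<>|\r\n]+\\)*                             # Folder
[^\\\/:*?"<>|\r\n]*                                    # File
$
'@

# -match is case-insensitive by default in PowerShell.
'C:\Applications\Dev\File.txt' -match $pattern   # True
'invalid:path' -match $pattern                   # False
```

Whitespace inside a character class stays literal in this mode, so the classes themselves behave as in the one-line form.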

I'm implementing this in PowerShell, and the expression will be case-insensitive.

I want to modify this expression such that it only matches paths that contain a file with an extension. I am aware that it's possible for a file to not contain an extension - I don't want to match this edge case.

Examples of what I'd like to happen:

C:\Applications\Dev\File.txt Match

C:\Applications\Dev\ Does not match

\\192.168.0.1\SHARE\my folder\test.exe Match

..\..\bin\my_executable.exe Match

Etc.

If someone can point me to a solution, that would be of great help!

Thanks much.

User response:

A pragmatic solution is to apply your validating regex first and, if a path matches, call the System.IO.Path.GetExtension() .NET API method on it:[1]

  • Note: I haven't looked at the specifics, but your regex also matches malformed paths such as C:\foo\C:\bar (follow-up question).
# Sample paths: four from the question plus one invalid path.
'C:\Applications\Dev\File.txt',
'C:\Applications\Dev\',
'\\192.168.0.1\SHARE\my folder\test.exe',
'..\..\bin\my_executable.exe',
'invalid:path' |
  ForEach-Object {
    # Step 1: validate the path syntax with the regex from the question.
    $valid = $_ -match '^(?:(?:[a-z]:|\\\\[a-z0-9_.$●-]+\\[a-z0-9_.$●-]+)\\|\\?[^\\\/:*?"<>|\r\n]+\\?)*(?:[^\\\/:*?"<>|\r\n]+\\)*[^\\\/:*?"<>|\r\n]*$'
    [pscustomobject] @{
      Path = $_
      Valid = $valid
      # Step 2: only for valid paths, test for a non-empty extension.
      HasExtension = if ($valid) { '' -ne [IO.Path]::GetExtension($_) }
    }
  }

Output:

Path                                   Valid HasExtension
----                                   ----- ------------
C:\Applications\Dev\File.txt            True         True
C:\Applications\Dev\                    True        False
\\192.168.0.1\SHARE\my folder\test.exe  True         True
..\..\bin\my_executable.exe             True         True
invalid:path                           False             
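If a regex-only check is preferred, the same two-step idea can be sketched by pairing the validating pattern with a second, much simpler pattern that requires a dot-separated extension at the end of the path. Everything below is an assumption for illustration rather than part of the original answer: the helper name `Test-PathWithExtension`, the variable names, and the extension pattern itself. The validating pattern is written with `+` quantifiers after each character class.

```powershell
# The validating pattern from the question.
$validPath = '^(?:(?:[a-z]:|\\\\[a-z0-9_.$●-]+\\[a-z0-9_.$●-]+)\\|\\?[^\\\/:*?"<>|\r\n]+\\?)*(?:[^\\\/:*?"<>|\r\n]+\\)*[^\\\/:*?"<>|\r\n]*$'

# Assumed extension check: a non-separator character, a dot, then one or more
# characters that are neither separators nor dots, up to the end of the string.
$hasExtension = '[^\\/]\.[^\\/.]+$'

# Hypothetical helper: true only if the path is valid AND ends in name.ext.
function Test-PathWithExtension {
  param([Parameter(Mandatory)] [string] $Path)
  ($Path -match $validPath) -and ($Path -match $hasExtension)
}

Test-PathWithExtension 'C:\Applications\Dev\File.txt'            # True
Test-PathWithExtension 'C:\Applications\Dev\'                    # False
Test-PathWithExtension '\\192.168.0.1\SHARE\my folder\test.exe'  # True
Test-PathWithExtension '..\..\bin\my_executable.exe'             # True
```

Unlike GetExtension(), this sketch does not count a leading-dot name such as .gitignore as having an extension, because it requires at least one character of the file name before the dot.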

[1] On Windows, this method itself performs limited validation: paths with illegal characters such as " cause an exception, but not malformed ones. On Unix-like platforms, where the file systems typically allow any character in paths except NUL, no validation appears to be performed at all (even NUL characters don't cause an exception).
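Because GetExtension() may throw on some inputs, a caller that skips the regex step might want to guard the call. A minimal sketch, assuming a hypothetical helper named `Get-ExtensionOrNull` that turns any exception into `$null`:

```powershell
# Hypothetical helper: returns the extension (possibly an empty string),
# or $null if GetExtension() rejects the input with an exception.
function Get-ExtensionOrNull {
  param([string] $Path)
  try {
    [IO.Path]::GetExtension($Path)
  } catch {
    $null
  }
}

Get-ExtensionOrNull 'C:\Applications\Dev\File.txt'   # .txt
```

An empty string then means "no extension", while `$null` means the path could not be examined at all.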
