Azure blob storage regex pattern for user input validation

Time:04-29

I’ve developed a function in PowerShell (.NET Framework) to retrieve data from any given Azure Blob Storage: so far, so good.

When it comes to validating user input, unfortunately, I cannot rely on external modules or libraries such as the NameValidator class of the Azure SDK for .NET.

Nevertheless, the article Naming and Referencing Containers, Blobs, and Metadata goes into the details of naming rules and thus regex patterns might come to the rescue.

For Container Names I’ve come up with this, and it seems to fit:

(?=^.{3,63}$)(?!.*--)[^-][a-z0-9-]*[^-]

Container Names

A container name must be a valid DNS name, conforming to the following naming rules:

  • Container names must start and end with a letter or number, and can contain only letters, numbers, and the dash (-) character.
  • Every dash (-) character must be immediately preceded and followed by a letter or number; consecutive dashes are not permitted in container names.
  • All letters in a container name must be lowercase.
  • Container names must be from 3 through 63 characters long.
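One caveat about the pattern above: `[^-]` accepts any character that is not a dash, so a name such as `_ab` or one starting with an uppercase letter would pass, and the match is not anchored at the end. A minimal sketch of a stricter alternative, shown in Python for brevity, could look like the following. The function name `is_valid_container_name` is hypothetical, and the pattern is one possible reading of the four rules, not an official one. In PowerShell the same pattern would need the case-sensitive `-cmatch` operator, since `-match` ignores case by default.

```python
import re

# Assumed pattern (not taken from the Azure documentation):
# runs of lowercase letters/digits separated by single dashes,
# with a lookahead limiting the total length to 3..63 characters.
CONTAINER_NAME = re.compile(r"^(?=.{3,63}$)[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_valid_container_name(name: str) -> bool:
    """Return True if `name` satisfies the four container naming rules."""
    # fullmatch also rejects a trailing newline that `$` alone would tolerate.
    return CONTAINER_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-container", "My-container", "a--b", "-abc", "ab"]:
        print(candidate, is_valid_container_name(candidate))
```

Because every dash must sit between two alphanumeric runs, the "no consecutive dashes" and "no leading or trailing dash" rules are covered without a separate negative lookahead.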

For Blob Names however I’m not able to get around the counting of path segments:

(?=^.{1,1024}$)(?<=^|\/)(\S*?)[^\.] (?=\/|$)

NB: the Azure Storage emulator has been deprecated and is therefore out of scope.

Blob Names

A blob name must conform to the following naming rules:

  • A blob name can contain any combination of characters.
  • A blob name must be at least one character long and cannot be more than 1,024 characters long, for blobs in Azure Storage.
  • The Azure Storage emulator supports blob names up to 256 characters long. For more information, see Use the Azure storage emulator for development and testing.
  • Blob names are case-sensitive.
  • Reserved URL characters must be properly escaped.
  • The number of path segments comprising the blob name cannot exceed 254. A path segment is the string between consecutive delimiter characters (e.g., the forward slash '/') that corresponds to the name of a virtual directory.
  • Note: Avoid blob names that end with a dot (.), a forward slash (/), or a sequence or combination of the two. No path segments should end with a dot (.).
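If counting path segments inside a single regex proves awkward, the rules can also be checked procedurally. The sketch below, in Python for brevity, splits the name on the delimiter and counts the pieces. The function name `check_blob_name` and the messages are hypothetical, and it assumes that a single leading delimiter (as in `/a/b`) does not start a segment of its own, which the documentation does not spell out. It does not check URL escaping.

```python
MAX_LENGTH = 1024    # maximum blob name length from the rules above
MAX_SEGMENTS = 254   # maximum number of path segments from the rules above


def check_blob_name(name: str, delimiter: str = "/") -> list:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []

    if not 1 <= len(name) <= MAX_LENGTH:
        problems.append("length must be between 1 and 1024 characters")

    # Assumption: one leading delimiter is not counted as an empty segment.
    body = name[len(delimiter):] if name.startswith(delimiter) else name
    segments = body.split(delimiter)

    if len(segments) > MAX_SEGMENTS:
        problems.append("more than 254 path segments")

    # The "Note" rule: no trailing delimiter, no segment ending with a dot.
    if name.endswith(delimiter) or any(s.endswith(".") for s in segments):
        problems.append("name or segment ends with '.' or the delimiter")

    return problems
```

Returning the list of violations rather than a boolean makes it easier to tell the user which rule was broken.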

The Blob service is based on a flat storage scheme, not a hierarchical scheme. However, you may specify a character or string delimiter within a blob name to create a virtual hierarchy. For example, the following list shows valid and unique blob names. Notice that a string can be valid as both a blob name and as a virtual directory name in the same container:

/a
/a.txt
/a/b
/a/b.txt

You can take advantage of the delimiter character when enumerating blobs.

NB: Just before asking this question, I found these questions, which either answer what I’ve already solved on my own or use the aforementioned class:

Azure Container Name RegEx

How to validate Azure storage blob names

By the way, does anybody know which flavor of regex is used by PowerShell?

CodePudding user response:

You can use either of the following patterns:

^(?!.{1025})/?[^/]*[^/.](?:/[^/]*[^/.]){0,253}$
^(?=.{1,1024}$)/?[^/]*[^/.](?:/[^/]*[^/.]){0,253}$

See the regex demo.

Details:

  • ^ - start of string
  • (?=.{1,1024}$) - the string should contain from 1 to 1024 chars
  • (?!.{1025}) - (alternative used in the first pattern) the string cannot contain 1025 or more chars, i.e. at most 1024
  • /? - an optional /
  • [^/]*[^/.] - zero or more chars other than / and then a char other than / and .
  • (?:/[^/]*[^/.]){0,253} - zero to 253 occurrences of / followed by zero or more chars other than / and then a char other than / and .
  • $ - end of string.
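To try the second pattern quickly, here is a minimal sketch in Python. The function name `is_valid_blob_name` is hypothetical. The pattern uses only constructs that the .NET regex engine also supports, so it should carry over to PowerShell, whose regex operators are backed by that engine.

```python
import re

# The second pattern from the answer, unchanged.
BLOB_NAME = re.compile(
    r"^(?=.{1,1024}$)/?[^/]*[^/.](?:/[^/]*[^/.]){0,253}$"
)


def is_valid_blob_name(name: str) -> bool:
    """Return True if `name` matches the answer's blob-name pattern."""
    return BLOB_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # The four sample names from the question, plus two that should fail.
    for candidate in ["/a", "/a.txt", "/a/b", "/a/b.txt", "a/b./c", "a/"]:
        print(repr(candidate), is_valid_blob_name(candidate))
```

Note that this pattern also rejects empty segments such as `a//b`, because every segment must end with a character other than `/` and `.`. That is stricter than the quoted rules require.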