Regex trying to get just package name from `az.accounts.2.10.4.nupkg`

Time:12-16

I am trying to get the package name from the file name using C# and Regex. This is my attempt so far, which works, but I am wondering if there is a more elegant way.

Given, for example, az.accounts.2.10.4.nupkg, I want to get az.accounts.

My attempt:

var filename = Path.GetFileNameWithoutExtension(nupkgPackagePath);
    
var nupkgPackageGetModulePath = Regex.Matches(filename, @"[^\d]+").First().Value.TrimEnd('.');

Test cases:

$ ls *.nupkg
PowerShellGet.nupkg                   az.iothub.2.7.4.nupkg
az.9.2.0.nupkg                        az.keyvault.4.9.1.nupkg
az.accounts.2.10.4.nupkg              az.kusto.2.1.0.nupkg
az.advisor.2.0.0.nupkg                az.logicapp.1.5.0.nupkg
az.aks.5.1.0.nupkg                    az.machinelearning.1.1.3.nupkg
az.analysisservices.1.1.4.nupkg       az.maintenance.1.2.1.nupkg
az.apimanagement.4.0.1.nupkg          az.managedserviceidentity.1.1.0.nupkg
az.appconfiguration.1.2.0.nupkg       az.managedservices.3.0.0.nupkg
az.applicationinsights.2.2.0.nupkg    az.marketplaceordering.2.0.0.nupkg
az.attestation.2.0.0.nupkg            az.media.1.1.1.nupkg
az.automation.1.8.0.nupkg             az.migrate.2.1.0.nupkg
az.batch.3.2.1.nupkg                  az.monitor.4.3.0.nupkg
az.billing.2.0.0.nupkg                az.mysql.1.1.0.nupkg
az.cdn.2.1.0.nupkg                    az.network.5.2.0.nupkg
az.cloudservice.1.1.0.nupkg           az.notificationhubs.1.1.1.nupkg
az.cognitiveservices.1.12.0.nupkg     az.operationalinsights.3.2.0.nupkg
az.compute.5.2.0.nupkg                az.policyinsights.1.5.1.nupkg
az.confidentialledger.1.0.0.nupkg     az.postgresql.1.1.0.nupkg
az.containerinstance.3.1.0.nupkg      az.powerbiembedded.1.2.0.nupkg
az.containerregistry.3.0.0.nupkg      az.privatedns.1.0.3.nupkg
az.cosmosdb.1.9.0.nupkg               az.recoveryservices.6.1.2.nupkg
az.databoxedge.1.1.0.nupkg            az.rediscache.1.6.0.nupkg
az.databricks.1.4.0.nupkg             az.redisenterprisecache.1.1.0.nupkg
az.datafactory.1.16.11.nupkg          az.relay.1.0.3.nupkg
az.datalakeanalytics.1.0.2.nupkg      az.resourcemover.1.1.0.nupkg
az.datalakestore.1.3.0.nupkg          az.resources.6.5.0.nupkg
az.dataprotection.1.0.1.nupkg         az.security.1.3.0.nupkg
az.datashare.1.0.1.nupkg              az.securityinsights.3.0.0.nupkg
az.deploymentmanager.1.1.0.nupkg      az.servicebus.2.1.0.nupkg
az.desktopvirtualization.3.1.1.nupkg  az.servicefabric.3.1.0.nupkg
az.devtestlabs.1.0.2.nupkg            az.signalr.1.5.0.nupkg
az.dns.1.1.2.nupkg                    az.sql.4.1.0.nupkg
az.eventgrid.1.5.0.nupkg              az.sqlvirtualmachine.1.1.0.nupkg
az.eventhub.3.2.0.nupkg               az.stackhci.1.4.0.nupkg
az.frontdoor.1.9.0.nupkg              az.storage.5.2.0.nupkg
az.functions.4.0.6.nupkg              az.storagesync.1.7.0.nupkg
az.hdinsight.5.0.1.nupkg              az.streamanalytics.2.0.0.nupkg
az.healthcareapis.2.0.0.nupkg         az.support.1.0.0.nupkg

CodePudding user response:

You can try something like this:

      string text = "az.streamanalytics.2.0.0.nupkg";

      var result = Regex
        .Match(text, @"(?<name>[a-zA-Z0-9.]+?)(\.[0-9]+)*\.nupkg$")
        .Groups["name"]
        .Value;

Pattern explained:

(?<name>[a-zA-Z0-9.]+?) - letters, digits, dots, as few as possible
                          (so that the version part is not matched)
(\.[0-9]+)*             - zero or more version parts: . followed by digits
\.nupkg                 - .nupkg
$                       - end of string


CodePudding user response:

^[^.]*\.[^.]*

You can test it out at https://regex101.com/

using System.Text.RegularExpressions;

// ...

string filename = "az.accounts.2.10.4.nupkg";

string pattern = @"^[^.]*\.[^.]*";

string nupkgPackageGetModulePath = Regex.Match(filename, pattern).Value;

// nupkgPackageGetModulePath is now "az.accounts"

CodePudding user response:

You've got two different input formats:

<PackageName>.nupkg
<PackageName>.<Major>.<Minor>.<Patch>.nupkg

Your current attempt:

Regex.Matches(fileName, @"[^\d]+").First().Value.TrimEnd('.')

This actually doesn't work for an input of "PowerShellGet.nupkg". Here is how the code works:

  1. Starting at the beginning of the string, find the first non-digit character, and greedily include all other consecutive non-digit characters. This is the "matched text"
  2. If the matched text ends with a period, take off that period.

This works fine if your input has a number in it, but "PowerShellGet.nupkg" doesn't, so nupkgPackageGetModulePath in your code example will be the full file name, not "PowerShellGet".

This will also be a huge problem if the package name itself contains a digit. How about "runtime.opensuse.13.2-x64.runtime.native.System.Security.Cryptography.OpenSsl.4.3.3.nupkg", or (and I can't believe this is actually a package) "2.2.0.0.nupkg"?

It's not a good idea to find the first non-digit. Instead, work with the expected format of NuGet packages.
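To make these failure modes concrete, here is a minimal sketch of the same "first run of non-digits" idea. The class and method names (FirstNonDigitRun, Extract) are made up for illustration, and Regex.Match is used in place of Regex.Matches(...).First() since both yield the first match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstNonDigitRun
{
    // Hypothetical helper mirroring the question's approach:
    // take the first run of non-digit characters, then drop trailing dots.
    public static string Extract(string input) =>
        Regex.Match(input, @"[^\d]+").Value.TrimEnd('.');
}
```

With this sketch, "az.accounts.2.10.4.nupkg" gives "az.accounts", but "PowerShellGet.nupkg" comes back unchanged, the long "runtime.opensuse.13.2-x64..." name is cut down to "runtime.opensuse", and "2.2.0.0.nupkg" gives an empty string because the first non-digit run is a single period.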

Using string.Split:

Split the input by periods. If there are two elements in the resulting array, it's the first format, so return the first element of the array. If there are at least 5 elements in the array, it's the second format, so return everything before the last four elements (major, minor, patch and the nupkg extension). Otherwise, the format is unknown.

private static string GetPackageName(string packageFileName)
{
    var segments = packageFileName.Split('.');

    return segments.Length switch
    {
        2 => segments[0],
        >= 5 => string.Join(".", segments[..^4]),
        _ => throw new Exception("Unknown what you want done here")
    };
}

segments[..^4] is a handy way to get all the element(s) before the major version.

https://dotnetfiddle.net/Ok6jbq
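As a small illustration of the segments[..^4] slice on its own, here is a sketch that only performs the split and the range operation. The names RangeDemo and DropLastFour are hypothetical, and array ranges are assumed to be available (C# 8 or later on a runtime that supports them).

```csharp
using System;

public static class RangeDemo
{
    // Hypothetical helper: keep every segment except the last four
    // (major, minor, patch and the "nupkg" extension).
    public static string[] DropLastFour(string fileName) =>
        fileName.Split('.')[..^4];
}
```

For "az.accounts.2.10.4.nupkg" this returns the two segments "az" and "accounts", which string.Join(".", ...) turns back into "az.accounts". The sketch does no length check, so it is only meant for inputs in the versioned format.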

Using Regex:

Again, because you've got two different formats, you've got to account for both, so this gets a bit more complicated.

([\S]+?)(?:\.\d+\.\d+\.\d+)?\.nupkg

The middle section ((?:\.\d+\.\d+\.\d+)?) is a non-capture group (starts with ?:) which is optional (suffixed with ?).

Capture group 1 will have the package name.

https://regexr.com/74mgf
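For reference, here is one way the pattern might be used from C#. This is only a sketch: the names NupkgName and TryGetName are made up, and the ^ and $ anchors are an addition (an assumption, not part of the pattern above) so that the whole file name has to fit one of the two formats.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NupkgName
{
    // The pattern from above, with ^ and $ added as an assumption.
    private static readonly Regex Pattern =
        new Regex(@"^(\S+?)(?:\.\d+\.\d+\.\d+)?\.nupkg$");

    // Hypothetical helper: returns true and sets name to capture group 1
    // when the file name matches; otherwise returns false with an empty name.
    public static bool TryGetName(string fileName, out string name)
    {
        Match match = Pattern.Match(fileName);
        name = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }
}
```

Because the first group is lazy, it stops as soon as the rest of the file name can be read as an optional three-part version followed by .nupkg, so "az.accounts.2.10.4.nupkg" gives "az.accounts" and "PowerShellGet.nupkg" gives "PowerShellGet".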
