I am trying to get the package name from the file name using C# and Regex. This is my attempt so far, which works, but I am wondering if there is a more elegant way.
Given, for example, az.accounts.2.10.4.nupkg,
I want to get az.accounts.
My attempt:
var filename = Path.GetFileNameWithoutExtension(nupkgPackagePath);
var nupkgPackageGetModulePath = Regex.Matches(filename, @"[^\d]+").First().Value.TrimEnd('.');
Test cases:
$ ls *.nupkg
PowerShellGet.nupkg az.iothub.2.7.4.nupkg
az.9.2.0.nupkg az.keyvault.4.9.1.nupkg
az.accounts.2.10.4.nupkg az.kusto.2.1.0.nupkg
az.advisor.2.0.0.nupkg az.logicapp.1.5.0.nupkg
az.aks.5.1.0.nupkg az.machinelearning.1.1.3.nupkg
az.analysisservices.1.1.4.nupkg az.maintenance.1.2.1.nupkg
az.apimanagement.4.0.1.nupkg az.managedserviceidentity.1.1.0.nupkg
az.appconfiguration.1.2.0.nupkg az.managedservices.3.0.0.nupkg
az.applicationinsights.2.2.0.nupkg az.marketplaceordering.2.0.0.nupkg
az.attestation.2.0.0.nupkg az.media.1.1.1.nupkg
az.automation.1.8.0.nupkg az.migrate.2.1.0.nupkg
az.batch.3.2.1.nupkg az.monitor.4.3.0.nupkg
az.billing.2.0.0.nupkg az.mysql.1.1.0.nupkg
az.cdn.2.1.0.nupkg az.network.5.2.0.nupkg
az.cloudservice.1.1.0.nupkg az.notificationhubs.1.1.1.nupkg
az.cognitiveservices.1.12.0.nupkg az.operationalinsights.3.2.0.nupkg
az.compute.5.2.0.nupkg az.policyinsights.1.5.1.nupkg
az.confidentialledger.1.0.0.nupkg az.postgresql.1.1.0.nupkg
az.containerinstance.3.1.0.nupkg az.powerbiembedded.1.2.0.nupkg
az.containerregistry.3.0.0.nupkg az.privatedns.1.0.3.nupkg
az.cosmosdb.1.9.0.nupkg az.recoveryservices.6.1.2.nupkg
az.databoxedge.1.1.0.nupkg az.rediscache.1.6.0.nupkg
az.databricks.1.4.0.nupkg az.redisenterprisecache.1.1.0.nupkg
az.datafactory.1.16.11.nupkg az.relay.1.0.3.nupkg
az.datalakeanalytics.1.0.2.nupkg az.resourcemover.1.1.0.nupkg
az.datalakestore.1.3.0.nupkg az.resources.6.5.0.nupkg
az.dataprotection.1.0.1.nupkg az.security.1.3.0.nupkg
az.datashare.1.0.1.nupkg az.securityinsights.3.0.0.nupkg
az.deploymentmanager.1.1.0.nupkg az.servicebus.2.1.0.nupkg
az.desktopvirtualization.3.1.1.nupkg az.servicefabric.3.1.0.nupkg
az.devtestlabs.1.0.2.nupkg az.signalr.1.5.0.nupkg
az.dns.1.1.2.nupkg az.sql.4.1.0.nupkg
az.eventgrid.1.5.0.nupkg az.sqlvirtualmachine.1.1.0.nupkg
az.eventhub.3.2.0.nupkg az.stackhci.1.4.0.nupkg
az.frontdoor.1.9.0.nupkg az.storage.5.2.0.nupkg
az.functions.4.0.6.nupkg az.storagesync.1.7.0.nupkg
az.hdinsight.5.0.1.nupkg az.streamanalytics.2.0.0.nupkg
az.healthcareapis.2.0.0.nupkg az.support.1.0.0.nupkg
CodePudding user response:
You can try something like this:
string text = "az.streamanalytics.2.0.0.nupkg";
var result = Regex
.Match(text, @"(?<name>[a-zA-Z0-9.]+?)(\.[0-9]+)*\.nupkg$")
.Groups["name"]
.Value;
Pattern explained:
(?<name>[a-zA-Z0-9.]+?) - letters, digits, dots, as few as possible
(so as not to match the version part)
(\.[0-9]+)* - zero or more version parts: . followed by digits
\.nupkg - .nupkg
$ - end of string
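As a rough check, the pattern can be run against a few of the file names from the question. This is only a sketch: ExtractName is a made-up helper name, and it assumes a recent .NET SDK that supports top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the answer): returns the "name" group,
// or an empty string when the pattern does not match.
string ExtractName(string fileName)
{
    var match = Regex.Match(fileName, @"(?<name>[a-zA-Z0-9.]+?)(\.[0-9]+)*\.nupkg$");
    return match.Success ? match.Groups["name"].Value : string.Empty;
}

Console.WriteLine(ExtractName("az.streamanalytics.2.0.0.nupkg")); // az.streamanalytics
Console.WriteLine(ExtractName("az.9.2.0.nupkg"));                 // az
Console.WriteLine(ExtractName("PowerShellGet.nupkg"));            // PowerShellGet
```

The lazy +? is what keeps the trailing numeric segments out of the name group; with a greedy + the version would be swallowed into the name.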
CodePudding user response:
^[^.]*\.[^.]*
You can test it out at https://regex101.com/
using System.Text.RegularExpressions;
// ...
string filename = "az.accounts.2.10.4.nupkg";
string pattern = @"^[^.]*\.[^.]*";
string nupkgPackageGetModulePath = Regex.Match(filename, pattern).Value;
// nupkgPackageGetModulePath is now "az.accounts"
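It may be worth running this pattern over the other test cases from the question before relying on it. It always captures the first two dot-separated segments, so as far as I can tell it only yields the package name when the name contains exactly one dot. A quick sketch (FirstTwoSegments is a made-up helper name, and top-level statements are assumed):

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper wrapping the pattern above.
string FirstTwoSegments(string fileName) =>
    Regex.Match(fileName, @"^[^.]*\.[^.]*").Value;

Console.WriteLine(FirstTwoSegments("az.accounts.2.10.4.nupkg")); // az.accounts
Console.WriteLine(FirstTwoSegments("az.9.2.0.nupkg"));           // az.9
Console.WriteLine(FirstTwoSegments("PowerShellGet.nupkg"));      // PowerShellGet.nupkg
```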
CodePudding user response:
You've got two different input formats:
<PackageName>.nupkg
<PackageName>.<Major>.<Minor>.<Patch>.nupkg
Your current attempt:
Regex.Matches(fileName, @"[^\d]+").First().Value.TrimEnd('.')
This actually doesn't work for an input of "PowerShellGet.nupkg". Here is how the code works:
- Starting at the beginning of the string, find the first non-digit character, and greedily include all other consecutive non-digit characters. This is the "matched text"
- If the matched text ends with a period, take off that period.
This works fine if your input has a number in it, but "PowerShellGet.nupkg" doesn't, so when the regex is run on the full file name (extension included), nupkgPackageGetModulePath in your code example will be the full file name, not "PowerShellGet".
This will also be a huge problem if the package name itself contains a digit. How about "runtime.opensuse.13.2-x64.runtime.native.System.Security.Cryptography.OpenSsl.4.3.3.nupkg", or (and I can't believe this is actually a package) "2.2.0.0.nupkg"?
It's not a good idea to find the first non-digit. Instead, work with the expected format of nuget packages.
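To make those failure modes concrete, here is a small sketch that applies the first-non-digit approach directly to raw file names (extension included). FirstNonDigitRun is a made-up name, and the code assumes a modern .NET where MatchCollection works with LINQ's First().

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The approach from the question: take the first run of non-digit
// characters and trim a trailing period. (Hypothetical helper name.)
string FirstNonDigitRun(string input) =>
    Regex.Matches(input, @"[^\d]+").First().Value.TrimEnd('.');

Console.WriteLine(FirstNonDigitRun("az.accounts.2.10.4.nupkg")); // az.accounts
Console.WriteLine(FirstNonDigitRun("PowerShellGet.nupkg"));      // PowerShellGet.nupkg
// A digit inside the package name cuts the result short:
Console.WriteLine(FirstNonDigitRun(
    "runtime.opensuse.13.2-x64.runtime.native.System.Security.Cryptography.OpenSsl.4.3.3.nupkg")); // runtime.opensuse
// A name that starts with a digit leaves only "." which is then trimmed away:
Console.WriteLine(FirstNonDigitRun("2.2.0.0.nupkg").Length);     // 0
```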
Using string.Split:
Split the input by periods. If there are two elements in the resulting array, it's the first format, so return the first element of the array. If there are at least five elements in the array, it's the second format, so return everything before the last four (major, minor, patch and the extension). Otherwise, the format is unknown.
private static string GetPackageName(string packageFileName)
{
    var segments = packageFileName.Split('.');
    return segments.Length switch
    {
        2 => segments[0],
        >= 5 => string.Join(".", segments[..^4]),
        _ => throw new Exception("Unknown what you want done here")
    };
}
segments[..^4] is a handy way to get all the elements before the major version.
https://dotnetfiddle.net/Ok6jbq
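For a quick feel of how the split-based method behaves, here is a usage sketch over a few inputs. It repeats the method as a local function so it runs on its own; swapping in FormatException and its message is my own choice for illustration, and C# 9 or later is assumed for the relational pattern and top-level statements.

```csharp
using System;

static string GetPackageName(string packageFileName)
{
    var segments = packageFileName.Split('.');
    return segments.Length switch
    {
        2 => segments[0],                          // <PackageName>.nupkg
        >= 5 => string.Join(".", segments[..^4]),  // drop major, minor, patch, "nupkg"
        _ => throw new FormatException($"Unrecognized package file name: {packageFileName}")
    };
}

Console.WriteLine(GetPackageName("PowerShellGet.nupkg"));      // PowerShellGet
Console.WriteLine(GetPackageName("az.accounts.2.10.4.nupkg")); // az.accounts
Console.WriteLine(GetPackageName("az.9.2.0.nupkg"));           // az

try
{
    // Three segments: a dotted name with no version fits neither format.
    GetPackageName("az.accounts.nupkg");
}
catch (FormatException ex)
{
    Console.WriteLine(ex.Message);
}
```

Note the last case: an unversioned file whose package name itself contains a dot is rejected, so you would need to decide how to handle it if such files can occur.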
Using Regex:
Again, because you've got two different formats you've got to account for both so this gets a bit more complicated.
([\S]+?)(?:\.\d+\.\d+\.\d+)?\.nupkg
The middle section ((?:\.\d+\.\d+\.\d+)?) is a non-capture group (starts with ?:) which is optional (suffixed with ?).
Capture group 1 will have the package name.
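A minimal sketch of using that pattern from C#, assuming top-level statements; CaptureName is a made-up helper name, and returning an empty string on no match is just one possible choice.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns capture group 1, or an empty string
// when the input does not look like a .nupkg file name.
string CaptureName(string fileName)
{
    var match = Regex.Match(fileName, @"([\S]+?)(?:\.\d+\.\d+\.\d+)?\.nupkg");
    return match.Success ? match.Groups[1].Value : string.Empty;
}

Console.WriteLine(CaptureName("PowerShellGet.nupkg"));      // PowerShellGet
Console.WriteLine(CaptureName("az.accounts.2.10.4.nupkg")); // az.accounts
Console.WriteLine(CaptureName("2.2.0.0.nupkg"));            // 2
```

If the input might carry extra text around the file name, anchoring the pattern with ^ and $ would be a reasonable tightening; that is an addition on my part, not something the pattern above does.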