I have a huge directory that I need to retrieve files from, including its subdirectories.
The folders contain various files, but I am only interested in specific proprietary files whose extension is exactly 7 digits long.
For example, I have a folder that contains the following files:
abc.txt
def.txt
GIWFJ1XA.0201000
GIWFJ1UC.0501000
NOOBO0XA.0100100
summary.pdf
someinfo.zip
T7F4JUXA.0300600
vxy98796.csv
YJHLPLBO.0302300
YJHLPLUC.0302800
I have tried the following:
var fileList = Directory.GetFiles(someDir, "*.???????", SearchOption.AllDirectories);
and also
string searchString = string.Empty;
for (int j = 0; j < 9999999; j++)
{
    // Append one pattern per possible 7-digit extension
    searchString += string.Format(", *.{0} ", j.ToString("0000000"));
}
var fileList2 = Directory.GetFiles(someDir, searchString, SearchOption.AllDirectories);
which, unsurprisingly, throws an error because the string is too long.
I only want to return the files whose extension has the specified length, in this case 7 digits, so that I don't have to loop over the thousands of other files myself.
I have considered creating a variable string for the search criteria that would contain all 99,999,999 possible digits but d
How can I accomplish this?
CodePudding user response:
I don't believe there's a way you can do this without looping through the files in the directory and its subfolders. The search pattern for GetFiles doesn't support regular expressions, so we can't really use something like [\d]{7} as a filter. I would suggest using Directory.EnumerateFiles and then returning the files that match your criteria.
You can use this to enumerate the files:
// Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
private static IEnumerable<string> GetProprietaryFiles(string topDirectory)
{
    Func<string, bool> filter = f =>
    {
        string extension = Path.GetExtension(f);
        // is 8 characters long including the .
        // all remaining characters are digits
        return extension.Length == 8 && extension.Skip(1).All(char.IsDigit);
    };

    // EnumerateFiles allows us to step through the files without
    // loading all of the filenames into memory at once.
    IEnumerable<string> matchingFiles =
        Directory.EnumerateFiles(topDirectory, "*", SearchOption.AllDirectories)
            .Where(filter);

    // Return each file as the enumerable is iterated
    foreach (var file in matchingFiles)
    {
        yield return file;
    }
}
Path.GetExtension includes the . so we check that the number of characters including the . is 8, and that all remaining characters are digits.
Usage:
List<string> fileList = GetProprietaryFiles(someDir).ToList();
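Since the search pattern itself cannot take a regular expression, the regex idea can still be applied after enumeration. The following is only a sketch of that variation, not part of the original answer: the class name ProprietaryFileFinder and its members are hypothetical, and the pattern assumes the extension must consist of ASCII digits 0-9 (char.IsDigit also accepts other Unicode decimal digits, so the two filters are not strictly identical).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class ProprietaryFileFinder
{
    // A dot followed by exactly seven ASCII digits at the very end of the file name.
    private static readonly Regex SevenDigitExtension =
        new Regex(@"\.[0-9]{7}\z", RegexOptions.CultureInvariant);

    // True when the file name part of the path ends in a 7-digit extension.
    public static bool HasSevenDigitExtension(string path)
    {
        return SevenDigitExtension.IsMatch(Path.GetFileName(path));
    }

    // Lazily walks the directory tree and keeps only the matching files.
    public static IEnumerable<string> Find(string topDirectory)
    {
        return Directory
            .EnumerateFiles(topDirectory, "*", SearchOption.AllDirectories)
            .Where(HasSevenDigitExtension);
    }
}
```

It would be called the same way as above, for example ProprietaryFileFinder.Find(someDir).ToList(). Whether the regex or the Path.GetExtension check reads better is largely a matter of taste; both still have to visit every file.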
CodePudding user response:
I would just grab the list of files in the directory and then check whether the part after the last '.' is 7 characters long. (This only works as long as you know no other files have an extension of that length.)
EDITED to use Path instead:
Directory.GetFiles(@"C:\temp").Where(
    fileName => Path.GetExtension(fileName).Length == 8
).ToList();
OLD:
Directory.GetFiles(someDir).Where(
    fileName => fileName.Substring(fileName.LastIndexOf('.') + 1).Length == 7
).ToList();
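To make the caveat about other extensions of the same length concrete, here is a small sketch comparing a length-only check with one that also requires digits. The class name ExtensionChecks and the sample file name backup.sqlite3 are made up for illustration and do not come from the question.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class ExtensionChecks
{
    // Accepts any extension that is 8 characters long including the dot.
    public static bool LengthOnly(string fileName)
    {
        return Path.GetExtension(fileName).Length == 8;
    }

    // Additionally requires every character after the dot to be a digit.
    public static bool LengthAndDigits(string fileName)
    {
        string extension = Path.GetExtension(fileName);
        return extension.Length == 8 && extension.Skip(1).All(char.IsDigit);
    }
}
```

With these, ExtensionChecks.LengthOnly("backup.sqlite3") is true because ".sqlite3" is also 8 characters long, while ExtensionChecks.LengthAndDigits("backup.sqlite3") is false. Note also that Directory.GetFiles(path) without a SearchOption argument only looks at the top directory, so SearchOption.AllDirectories would be needed to include subdirectories.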