LINQ Query Not Selecting Files

Time:03-23

I am trying to use a LINQ query over a set of files to find the ones whose names contain a specific string.

I was using:

var docs = Directory.EnumerateFiles(searchFolder, "*" + strNumber + "*", SearchOption.AllDirectories);

That worked, but some of my file searches took 30 minutes because one of the directories has 1 million files. I was hoping to speed up the search with a PLINQ query. However, although my syntax is good, I'm not getting the results I expect. It looks like my problem may be in the Where clause.

foreach (var strNumber in strNumbers)
{
    DirectoryInfo searchDirectory = new DirectoryInfo(searchFolder);
    // Lazily enumerate every file under searchFolder, including subdirectories
    IEnumerable<System.IO.FileInfo> allDocs = searchDirectory.EnumerateFiles("*", SearchOption.AllDirectories);

    // For each number, keep the files whose name contains it
    IEnumerable<System.IO.FileInfo> docsToProcess = strNumbers
        .SelectMany(number => allDocs
        .Where(file => file.Name.Contains(number)))
        .Distinct();
}

Any help would be much appreciated.

CodePudding user response:

I would reorder the work:

  1. Load the list of all files into memory
  2. Perform the search over the in-memory list

You can then run Parallel.ForEach over the in-memory array, and disk access is limited to the initial enumeration.

var searchDirectory = new DirectoryInfo(searchFolder);
// Enumerate the disk once and materialize the results in memory
var allDocs = searchDirectory.EnumerateFiles("*", SearchOption.AllDirectories).ToArray();

// For extra points, use a Parallel.ForEach here for multi-threaded work
Parallel.ForEach(strNumbers, strNumber =>
{
   // Work on allDocs here, it should be in memory
});
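
To make the idea concrete, here is a minimal sketch of how the body of that loop could be filled in. The class name FileSearchSketch, the method FindMatches, and the choice of a ConcurrentDictionary to collect results are my own assumptions for illustration, not something from the question; adapt them to whatever you actually do with the matching files.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: the names here are illustrative, not from the question.
public static class FileSearchSketch
{
    // Returns, for each search string, the files whose name contains it.
    public static ConcurrentDictionary<string, FileInfo[]> FindMatches(
        string searchFolder, string[] strNumbers)
    {
        // Walk the directory tree once and keep the results in memory.
        FileInfo[] allDocs = new DirectoryInfo(searchFolder)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .ToArray();

        // ConcurrentDictionary is safe to write to from several threads.
        var matches = new ConcurrentDictionary<string, FileInfo[]>();

        // Each search string is matched against the in-memory array only.
        Parallel.ForEach(strNumbers, strNumber =>
        {
            matches[strNumber] = allDocs
                .Where(file => file.Name.Contains(strNumber))
                .ToArray();
        });

        return matches;
    }

    // Usage: pass the folder as the first argument and the search strings after it.
    public static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.WriteLine("usage: <searchFolder> <strNumber> [<strNumber> ...]");
            return;
        }

        var matches = FindMatches(args[0], args.Skip(1).ToArray());
        foreach (var pair in matches)
        {
            Console.WriteLine(pair.Key + ": " + pair.Value.Length + " file(s)");
        }
    }
}
```

Note that string.Contains does an ordinal, case-sensitive comparison, so this may behave slightly differently from the wildcard pattern if your search strings contain letters. Holding a million FileInfo objects in memory is also a trade-off; if that is too heavy, projecting to file.FullName before calling ToArray should reduce the footprint.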