I am trying to collect a list of all files and directories recursively, and I don't want the UI thread to lock up, but it does.
This makes no sense to me. If I am running the task on a new thread and nothing from the old thread is needed, why is the UI locking up?
private async void whatfolder(object sender, RoutedEventArgs e)
{
    IEnumerable<string> myFiles = Enumerable.Empty<string>();
    myFiles = await Task.Run(() =>
    {
        return SafeWalk
            .EnumerateFiles(path, "*.*", so)
            .Where(s => ext.Contains(System.IO.Path.GetExtension(s).TrimStart('.').ToLowerInvariant()));
    });
}
After that I do some stuff with myFiles. I set breakpoints, and it locks up at the await.
public static class SafeWalk
{
    public static IEnumerable<string> EnumerateFiles(string path, string searchPattern,
        SearchOption searchOpt)
    {
        try
        {
            var dirFiles = Enumerable.Empty<string>();
            if (searchOpt == SearchOption.AllDirectories)
            {
                dirFiles = System.IO.Directory.EnumerateDirectories(path)
                    .SelectMany(x => EnumerateFiles(x, searchPattern, searchOpt));
            }
            return dirFiles.Concat(System.IO.Directory.EnumerateFiles(path, searchPattern));
        }
        catch (UnauthorizedAccessException ex)
        {
            return Enumerable.Empty<string>();
        }
    }
}
CodePudding user response:
First of all, if you need to frequently search a large number of files, you should seriously consider using the Windows Search Service in an indexed location, or a search engine like Elastic. Searching a prebuilt index is generally much faster than crawling a folder from scratch.
There's nothing that blocks or even writes to the UI in the question's code. There are serious inefficiencies, though, that could end up wasting a lot of CPU time in garbage collection. For example, strings are immutable, so any string modification returns a new temporary string. .GetExtension(s).TrimStart('.').ToLowerInvariant() can create up to three temporary strings for each file.
The freeze may be caused by some other asynchronous operation that is also being waited on, for example one that blocks the UI thread with .Result or .Wait() while an awaited continuation needs that same thread, in which case both may wait for each other indefinitely. One way to resolve this is to use ConfigureAwait(false) in one of them. ConfigureAwait(false) tells await not to capture the UI synchronization context, so execution typically resumes on a thread pool thread instead of the UI thread.
await Task.Run(()=>....).ConfigureAwait(false);
//Still in the background, can't modify the UI here
Nested search and Ignore Inaccessible
It's possible to search for nested files without writing the recursion by hand, both in .NET Framework and .NET Core:
var files = Directory.EnumerateFiles(folderPath, "*.png", SearchOption.AllDirectories)
    .Where(...);
In .NET Core 2.1 and later (that includes .NET 5 and later) it's possible to avoid inaccessible files using an EnumerationOptions parameter:
var options = new EnumerationOptions {
    IgnoreInaccessible = true,
    RecurseSubdirectories = true,
    BufferSize = 4096
};
var files = Directory.EnumerateFiles(folderPath, "*.png", options)
    .Where(...);
To avoid .GetExtension(s).TrimStart('.').ToLowerInvariant(), the DirectoryInfo class and case-insensitive comparisons can be used. The list of extensions should include the leading dot to avoid the need for trimming:
var options = new EnumerationOptions {
    IgnoreInaccessible = true,
    RecurseSubdirectories = true,
    BufferSize = 4096
};
var extensions = new[] { ".docx", ".xlsx" };
// Case-insensitive set, so ".DOCX" matches ".docx" without lowercasing
var lookup = new HashSet<string>(extensions, StringComparer.OrdinalIgnoreCase);
var directory = new DirectoryInfo(folderPath);
// The DirectoryInfo overload takes only the pattern and the options
var files = directory.EnumerateFiles("*.*", options)
    .Where(fi => lookup.Contains(fi.Extension));
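As a small, hedged illustration of the case-insensitive lookup idea above, the sketch below checks file names against a HashSet built with StringComparer.OrdinalIgnoreCase. The ExtensionFilter class and IsWanted method are hypothetical names used only for this example; it works on plain file-name strings, so it doesn't touch the disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ExtensionFilter
{
    // Extensions keep their leading dot, so no TrimStart('.') is needed.
    // OrdinalIgnoreCase removes the need for ToLowerInvariant().
    private static readonly HashSet<string> Lookup =
        new HashSet<string>(new[] { ".docx", ".xlsx" }, StringComparer.OrdinalIgnoreCase);

    // Hypothetical helper: true when the file name ends in a wanted extension.
    public static bool IsWanted(string fileName)
    {
        // Path.GetExtension returns the extension including the dot,
        // or an empty string when the name has no extension.
        return Lookup.Contains(Path.GetExtension(fileName));
    }
}
```

With this set, ExtensionFilter.IsWanted("Report.DOCX") is true, while "notes.txt" and an extensionless "README" are rejected.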
The SafeWalk class has become a single call, but it won't be executed until it gets enumerated. One way to do this, on a single background thread, is to use Task.Run and force the enumeration with ToList():
var files = await Task.Run(() => directory.EnumerateFiles("*.*", options)
    .Where(fi => lookup.Contains(fi.Extension))
    .ToList());
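Deferred execution may also explain the freeze in the question: a lazy query returned from Task.Run does no work inside the task, so the actual walk happens later on whichever thread enumerates the result, which in an event handler would be the UI thread. The sketch below is only an illustration under that assumption. DeferredDemo and Source are hypothetical stand-ins for the directory walk, and the log lists exist just to show when items are produced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class DeferredDemo
{
    // Hypothetical stand-in for a directory walk: logs each item as it is produced.
    public static IEnumerable<int> Source(List<int> produced)
    {
        for (int i = 0; i < 3; i++)
        {
            produced.Add(i);
            yield return i;
        }
    }

    public static async Task<(int lazyCount, int eagerCount)> RunAsync()
    {
        var lazyLog = new List<int>();
        var eagerLog = new List<int>();

        // Lazy: the task only builds the query object and completes at once.
        IEnumerable<int> lazy = await Task.Run(() => Source(lazyLog).Where(n => n >= 0));
        int lazyCount = lazyLog.Count;   // no item has been produced yet

        // Eager: ToList() runs the whole enumeration inside the task.
        List<int> eager = await Task.Run(() => Source(eagerLog).Where(n => n >= 0).ToList());
        int eagerCount = eagerLog.Count; // every item was produced in the task

        return (lazyCount, eagerCount);
    }
}
```

Here lazyCount is 0 and eagerCount is 3: in the lazy case nothing has run by the time the await completes, so any later foreach over the result does the work on the caller's thread.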
Parallel filtering
You could speed up processing a large number of files by using, e.g., PLINQ with a limited degree of parallelism, to use some, but not all, available cores to filter extensions. Task.Run is still needed because PLINQ also uses the current thread to process data:
var files = await Task.Run(() => directory.EnumerateFiles("*.*", options)
    .AsParallel()
    .WithDegreeOfParallelism(2)
    .Where(fi => lookup.Contains(fi.Extension))
    .ToList());
CodePudding user response:
You know, I ended up just rewriting the whole thing in a different manner and for whatever reason it works now. I don't know what was happening. Essentially I am doing the same thing.
private async void whatfolder(object sender, RoutedEventArgs e)
{
    List<string> myFiles = new List<string>();
    myFiles = await Task.Run(() =>
    {
        List<string> mf = new List<string>();
        try
        {
            IEnumerable<string> files = SafeWalk.EnumerateFiles(path, "*.*", so, ct);
            foreach (string f in files)
            {
                if (ct.IsCancellationRequested) { return mf; }
                if (ext.Contains(System.IO.Path.GetExtension(f).TrimStart('.').ToLowerInvariant()))
                {
                    mf.Add(f);
                }
            }
        }
        catch (Exception ex)
        {
        }
        return mf;
    });
}
public static class SafeWalk
{
    public static IEnumerable<string> EnumerateFiles(string path, string searchPattern, SearchOption searchOpt, CancellationToken ct)
    {
        try
        {
            if (ct.IsCancellationRequested) { return Enumerable.Empty<string>(); }
            var dirFiles = Enumerable.Empty<string>();
            if (searchOpt == SearchOption.AllDirectories)
            {
                dirFiles = System.IO.Directory.EnumerateDirectories(path)
                    .SelectMany(x => EnumerateFiles(x, searchPattern, searchOpt, ct));
            }
            return dirFiles.Concat(System.IO.Directory.EnumerateFiles(path, searchPattern));
        }
        catch (UnauthorizedAccessException ex)
        {
            return Enumerable.Empty<string>();
        }
    }
}