The application creates a collection of tasks to obtain folder info from a specific folder path.
The code works, but the UI freezes while tasks are added to the list, especially in the While loop, until all tasks have completed.
Here's the code:
Async Function ProcessFolders(ct As CancellationToken, FolderList As IEnumerable(Of String)) As Task
' Create a collection of tasks
Dim processFoldersQuery As IEnumerable(Of Task(Of String())) =
From folder In FolderList Select GetFolderInfo(folder, ct)
Dim processQuery As List(Of Task(Of String())) = processFoldersQuery.ToList()
' Run through tasks
While processQuery.Count > 0
Dim finishedTask As Task(Of String()) = Await Task.WhenAny(processQuery)
processQuery.Remove(finishedTask)
Dim result = Await finishedTask
folder_dt.Rows.Add(Nothing, result(0), result(1), result(2))
End While
End Function
Async Function GetFolderInfo(folder As String, ct As CancellationToken) As Task(Of String())
Return Await Task.Run(Async Function()
Dim folder_info = New DirectoryInfo(folder)
Dim result() As String = {folder_info.Name, Await GetFileMd5(folder, False), Await GetDirectorySize(folder)}
Return result
End Function)
End Function
How can I achieve this without freezing the UI? I have been looking into parallel and async loops and various other async techniques, but I am not sure how to implement them in this situation.
GetFileMd5() and GetDirectorySize() functions below:
Shared Async Function GetDirectorySize(path As String) As Task(Of Long)
Return Await Task.Run(Function()
Return Directory.GetFiles(path, "*", SearchOption.AllDirectories).Sum(Function(t) (New FileInfo(t).Length))
End Function)
End Function
Private Async Function GetFileMd5(file_name As String, convert_file As Boolean) As Task(Of String)
Return Await Task.Run(Function()
Dim byteHash() As Byte
Dim ByteArrayToString = Function(arrInput() As Byte)
Dim sb As New Text.StringBuilder(arrInput.Length * 2)
For i As Integer = 0 To arrInput.Length - 1
sb.Append(arrInput(i).ToString("X2"))
Next
Return sb.ToString().ToLower
End Function
If convert_file Then
byteHash = Md5CSP.ComputeHash(File.OpenRead(file_name))
Return ByteArrayToString(byteHash)
End If
If convert_file = False Then byteHash = Md5CSP.ComputeHash(System.Text.Encoding.Unicode.GetBytes(file_name)) : Return ByteArrayToString(byteHash)
Return Nothing
End Function)
End Function
CodePudding user response:
Here are a couple of examples that may mitigate the lag you're experiencing as the number of folders to process increases.
In the current code, these two lines:
processQuery.Remove(finishedTask)
' [...]
folder_dt.Rows.Add(Nothing, result(0), result(1), result(2))
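are executed on the UI Thread. In my view, the former is the troublesome one: each iteration calls Task.WhenAny() on the whole remaining list and then List.Remove(), which is a linear search, so the work done on the UI Thread grows roughly quadratically with the number of folders.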
It's also not clear how the DataTable is used. It appears that when ProcessFolders() (which doesn't return that object) returns, a previously existing DataTable is blindly assigned to something.
ProcessFolders() shouldn't need to know about this already existing DataTable.
In the first example, the entry method (renamed ProcessFoldersAsync()) creates a DataTable meant to contain the data it generates, and returns this DataTable.
You should also pass the CancellationToken to the GetFileMd5() and GetDirectorySize() methods.
Private cts As CancellationTokenSource = Nothing
Private folder_dt As DataTable = Nothing
Private Async Sub SomeButton_Click(sender As Object, e As EventArgs) Handles SomeButton.Click
Dim listOfFolders As List(Of String) = [Some source of URIs]
cts = New CancellationTokenSource
Using cts
' Await resumes in the UI Thread
folder_dt = Await ProcessFoldersAsync(listOfFolders, cts.Token)
[Some Control].DataSource = folder_dt
End Using
End Sub
Async Function ProcessFoldersAsync(folderList As IEnumerable(Of String), token As CancellationToken) As Task(Of DataTable)
Dim processQuery As New List(Of Task(Of Object()))
For Each f In folderList
processQuery.Add(GetFolderInfoAsync(f, token))
Next
If token.IsCancellationRequested Then Return Nothing
Try
' Resumes on a ThreadPool Thread
Await Task.WhenAll(processQuery).ConfigureAwait(False)
' Generate a new DataTable and fills it with the results of all Tasks
' This code executes in a ThreadPool Thread
Dim dt = CreateDataTable()
For Each obj In processQuery
If obj.Result IsNot Nothing Then dt.Rows.Add(obj.Result)
Next
Return dt
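' ThrowIfCancellationRequested() raises OperationCanceledException,
' the base class of TaskCanceledException, so this catches both
Catch ex As OperationCanceledException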
Return Nothing
End Try
End Function
Async Function GetFolderInfoAsync(folder As String, token As CancellationToken) As Task(Of Object())
token.ThrowIfCancellationRequested()
Dim folderName = Path.GetFileName(folder)
Return {Nothing, folderName, Await GetFileMd5(folder, False), Await GetDirectorySize(folder)}
End Function
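The advice above about passing the CancellationToken down to GetDirectorySize() could look roughly like the following. This is only a sketch under assumptions: the FolderHelpers module, the GetDirectorySizeAsync() name and its token parameter are hypothetical and not part of the original code. It enumerates the files lazily instead of calling Directory.GetFiles(), so that cancellation can be observed between one file and the next.

```vb
Imports System.IO
Imports System.Threading
Imports System.Threading.Tasks

Module FolderHelpers
    ' Hypothetical token-aware variant of GetDirectorySize()
    Function GetDirectorySizeAsync(path As String, token As CancellationToken) As Task(Of Long)
        Return Task.Run(
            Function()
                Dim total As Long = 0
                ' EnumerateFiles() yields entries one at a time, so the token
                ' can be checked before each file is measured
                For Each f In Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories)
                    token.ThrowIfCancellationRequested()
                    total += New FileInfo(f).Length
                Next
                Return total
            End Function, token)
    End Function
End Module
```

Passing the same token to Task.Run() means the returned Task ends in the Canceled state, rather than Faulted, when the loop throws. GetFileMd5() could receive the token in the same way.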
In the second example, an IProgress<T> delegate is used to perform the updates it receives from the GetFolderInfoAsync() method.
In this case, the DataTable already exists and could already be assigned to / used by Controls. The updates are performed in real time, as the async methods return their results, which may happen at different times.
This method may slow down the process overall.
ProcessFoldersAsync() passes the Progress object delegate to GetFolderInfoAsync(), which calls its Report() method with the data object it has generated.
The UpdatedDataTable() delegate method adds the newly arrived data to the existing DataTable.
Note that, in this case, if the Tasks are canceled, the updates already stored are retained (unless you decide otherwise: you can always set the DataTable to Nothing when you catch the cancellation exception).
' [...]
Private objLock As New Object()
Private updater As IProgress(Of IEnumerable(Of Object))
Private Async Sub SomeButton_Click(sender As Object, e As EventArgs) Handles SomeButton.Click
Dim listOfFolders As List(Of String) = [Some source of URIs]
folder_dt = CreateDataTable()
[Some Control].DataSource = folder_dt
cts = New CancellationTokenSource
Using cts
updater = New Progress(Of IEnumerable(Of Object))(Sub(data) UpdatedDataTable(data))
Try
Await ProcessFoldersAsync(listOfFolders, updater, cts.Token)
' OperationCanceledException is the base class of TaskCanceledException
Catch ex As OperationCanceledException
Debug.WriteLine("Tasks were canceled")
End Try
End Using
End Sub
Async Function ProcessFoldersAsync(folderList As IEnumerable(Of String), updater As IProgress(Of IEnumerable(Of Object)), token As CancellationToken) As Task
Dim processQuery As New List(Of Task)
For Each f In folderList
processQuery.Add(GetFolderInfoAsync(f, updater, token))
Next
token.ThrowIfCancellationRequested()
Await Task.WhenAll(processQuery).ConfigureAwait(False)
End Function
Async Function GetFolderInfoAsync(folder As String, progress As IProgress(Of IEnumerable(Of Object)), token As CancellationToken) As Task
token.ThrowIfCancellationRequested()
Dim folderName = Path.GetFileName(folder)
Dim result As Object() = {Nothing, folderName, Await GetFileMd5(folder, False), Await GetDirectorySize(folder)}
progress.Report(result)
End Function
Private Sub UpdatedDataTable(data As IEnumerable(Of Object))
SyncLock objLock
folder_dt.Rows.Add(data.ToArray())
End SyncLock
End Sub
Private Function CreateDataTable() As DataTable
Dim dt As New DataTable()
Dim col As New DataColumn("IDX", GetType(Long)) With {
.AutoIncrement = True,
.AutoIncrementSeed = 1,
.AutoIncrementStep = 1
}
dt.Columns.Add(col)
dt.Columns.Add("FolderName", GetType(String))
dt.Columns.Add("MD5", GetType(String))
dt.Columns.Add("Size", GetType(Long))
Return dt
End Function