I have been working on this for a week. I have done a lot of searching and a lot of tests for different methods.
When I use HttpClient to download a file, no errors are generated but the files do not show up in the folder until after the program exits. I have a synchronous method (before someone asks - no I cannot change it to asynchronous) that calls an asynchronous method to download language files for my Tesseract OCR (I test with language == "eng"):
if (!Directory.Exists(folderName))
    Directory.CreateDirectory(folderName);
Task.Run(async () => await HelperMethods.LoadLanguage(folderName, language));
Task.Run(async () => await HelperMethods.LoadLanguage(folderName, "osd"));
And here is the method that is being awaited:
public static async Task LoadLanguage(string folderName, string language)
{
    string dest = Path.GetFullPath(Path.Combine(folderName, $"{language}.traineddata"));
    if (!File.Exists(dest))
    {
        // Now we know that we need network - start it up if it isn't already.
        if (httpClient == null)
            httpClient = new HttpClient();
        Uri uri = new Uri($"https://github.com/tesseract-ocr/tessdata/raw/main/{language}.traineddata");
        HttpResponseMessage response = await httpClient.GetAsync(uri);
        using (FileStream fs = new FileStream(dest, FileMode.Create, FileAccess.Write))
        {
            await response.Content.CopyToAsync(fs);
            await fs.FlushAsync();
            fs.Close();
        }
    }
}
I added the Flush and Close as part of the testing, but it did not make a difference.
This is supposed to download the language files so that the next lines can perform OCR using those languages. Instead, the files only show up in the folder after the program exits.
If I run the program a second time, it works - because it does not need to download new files.
How do I get this to download the files and save them immediately so the files can be used in subsequent operations?
I also tried this method:
var request = new HttpRequestMessage(HttpMethod.Get, uri);
var sendTask = httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
var response = sendTask.Result.EnsureSuccessStatusCode();
var httpStream = await response.Content.ReadAsStreamAsync();
using (var fileStream = File.Create(dest))
{
    using (var reader = new StreamReader(httpStream))
    {
        httpStream.CopyTo(fileStream);
        fileStream.Flush();
    }
}
No difference. I also tried quite a few other ways of working with the stream. This is .NET Framework 4.8 (I am not able to update to .NET 6).
CodePudding user response:
Your post states:
"I have a synchronous method [...] that calls an asynchronous method to download language files"
If the caller is synchronous anyway, why not make the downloader synchronous, too?
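public void LoadLanguage(string folderName, string language)
{
    // Disable the UI while the download blocks (assumes this method lives on a Form or Control).
    Enabled = false;
    try
    {
        Uri uri = new Uri($"https://github.com/tesseract-ocr/tessdata/raw/main/{language}.traineddata");
        using (var client = new HttpClient())
        {
            // Block the calling thread until the response arrives.
            using (var response =
                client
                .GetAsync(uri)
                .GetAwaiter()
                .GetResult())
            {
                // Throw on a non-success status so an error page is never saved as the data file.
                response.EnsureSuccessStatusCode();
                var bytes =
                    response
                    .Content
                    .ReadAsByteArrayAsync()
                    .GetAwaiter()
                    .GetResult();
                // The file is fully written and closed by the time this call returns.
                File.WriteAllBytes(
                    Path.Combine(
                        folderName,
                        $"{language}.traineddata"),
                    bytes
                );
            }
        }
    }
    finally
    {
        Enabled = true;
    }
}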
I tested this and it seems to work [clone]. Does this get you any closer?
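If you would rather keep LoadLanguage asynchronous, note that the two Task.Run calls in the question are started but never waited on, so the synchronous caller can move on to the OCR step while the downloads are still in flight. Below is a minimal sketch of blocking until both tasks finish. FakeLoadLanguage, EnsureLanguages, and the temp folder name are hypothetical stand-ins for illustration (a short delay replaces the network call); they are not part of the original code.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class BlockingDownloadSketch
{
    // Hypothetical stand-in for HelperMethods.LoadLanguage:
    // simulates a slow download with a delay, then writes the file.
    public static async Task FakeLoadLanguage(string folderName, string language)
    {
        string dest = Path.Combine(folderName, $"{language}.traineddata");
        await Task.Delay(200);
        File.WriteAllText(dest, "placeholder");
    }

    // Synchronous caller that does not return until both files exist.
    public static void EnsureLanguages(string folderName, string language)
    {
        Directory.CreateDirectory(folderName); // no-op if the folder already exists

        // Task.Run starts each download on the thread pool;
        // Task.WaitAll blocks the calling thread until both have completed.
        Task.WaitAll(
            Task.Run(() => FakeLoadLanguage(folderName, language)),
            Task.Run(() => FakeLoadLanguage(folderName, "osd")));
    }

    public static void Main()
    {
        string folder = Path.Combine(Path.GetTempPath(), "tessdata-sketch");
        EnsureLanguages(folder, "eng");
        // By this point both files are on disk.
        Console.WriteLine(File.Exists(Path.Combine(folder, "eng.traineddata"))); // prints True
    }
}
```

In the real code, the two lambdas would call HelperMethods.LoadLanguage instead. Because the work starts inside Task.Run, its awaits resume on thread-pool threads rather than the caller's thread, which is why blocking here should not deadlock a UI thread. Any download failure surfaces from Task.WaitAll as an AggregateException.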