I am working on a "heartbeat" application that pings hundreds of IP addresses every minute via a loop. The IP addresses are stored in a list of a class Machines
. I have a loop that creates a Task<MachinePingResults>
(where MachinePingResults
is basically a Tuple of an IP and online status) for each IP and calls a ping function using System.Net.NetworkInformation
.
The issue I'm having is that after hours (or days) of running, one of the loops of the main program fails to finish the Tasks
which is leading to a memory leak. I cannot determine why my Tasks are not finishing (if I look in the Task list during runtime after a few days of running, there are hundreds of tasks that appear as "awaiting"). Most of the time all the tasks finish and are disposed; it is just randomly that they don't finish. For example, the past 24 hours had one issue at about 12 hours in with 148 awaiting tasks that never finished. Due to the nature of not being able to see why the Ping
is hanging (since it's internal to .NET), I haven't been able to replicate the issue to debug.
(It appears that the Ping
call in .NET can hang and the built-in timeout fail if there is a DNS issue, which is why I built an additional timeout in)
I have a way to cancel the main loop if the pings don't return within 15 seconds using Task.Delay
and a CancellationToken
. Then in each Ping function I have a Delay
in case the Ping call itself hangs that forces the function to complete. Also note I am only pinging IPv4; there is no IPv6 or URL.
Main Loop
pingcancel = new CancellationTokenSource();
List<Task<MachinePingResults>> results = new List<Task<MachinePingResults>>();
try
{
foreach (var m in localMachines.FindAll(m => !m.Online))
results.Add(Task.Run(() =>
PingMachine(m.ipAddress, 8000), pingcancel.Token
));
await Task.WhenAny(Task.WhenAll(results.ToArray()), Task.Delay(15000));
pingcancel.Cancel();
}
catch (Exception ex) { Console.WriteLine(ex); }
finally
{
results.Where(r => r.IsCompleted).ToList()
.ForEach(r =>
//modify the online machines);
results.Where(r => r.IsCompleted).ToList().ForEach(r => r.Dispose());
results.Clear();
}
The Ping Function
static async Task<MachinePingResults> PingMachine(string ipaddress, int timeout)
{
try
{
using (Ping ping = new Ping())
{
var reply = ping.SendPingAsync(ipaddress, timeout);
await Task.WhenAny(Task.Delay(timeout), reply);
if (reply.IsCompleted && reply.Result.Status == IPStatus.Success)
{
return new MachinePingResults(ipaddress, true);
}
}
}
catch (Exception ex)
{
Debug.WriteLine("Error: " ex.Message);
}
return new MachinePingResults(ipaddress, false);
}
With every Task
having a Delay to let it continue if the Ping hangs, I don't know what would be the issue that is causing some of the Task<MachinePingResults>
to never finish.
How can I ensure a Task
using .NET Ping
ends?
Using .NET 5.0 and the issues occurs on machines running windows 10 and windows server 2012
CodePudding user response:
There are quite a few gaps in the code posted, but I attempted to replicate and in doing so ended up refactoring a bit.
This version seems pretty robust, with the actual call to SendAsync
wrapped in an adapter class.
I accept this doesn't necessarily answer the question directly, but in the absence of being able to replicate your problem exactly, offers an alternative way of structuring the code that may eliminate the problem.
async Task Main()
{
var masterCts = new CancellationTokenSource(TimeSpan.FromSeconds(15)); // 15s overall timeout
var localMachines = new List<LocalMachine>
{
new LocalMachine("192.0.0.1", false), // Should be not known - TimedOut
new LocalMachine("192.168.86.88", false), // Should be not known - DestinationHostUnreachable (when timeout is 8000)
new LocalMachine("www.dfdfsdfdfdsgrdf.cdcc", false), // Should be not known - status Unknown because of PingException
new LocalMachine("192.168.86.87", false) // Known - my local IP
};
var results = new List<PingerResult>();
try
{
// Create the "hot" tasks
var tasks = localMachines.Where(m => !m.Online)
.Select(m => new Pinger().SendPingAsync(m.HostOrAddress, 8000, masterCts.Token))
.ToArray();
await Task.WhenAll(tasks);
results.AddRange(tasks.Select(t => t.Result));
}
finally
{
results.ForEach(r => localMachines.Single(m => m.HostOrAddress.Equals(r.HostOrAddress)).Online = r.Status == IPStatus.Success);
results.Dump(); // For LINQPad
localMachines.Dump(); // For LINQPad
results.Clear();
}
}
public class LocalMachine
{
public LocalMachine(string hostOrAddress, bool online)
{
HostOrAddress = hostOrAddress;
Online = online;
}
public string HostOrAddress { get; }
public bool Online { get; set; }
}
public class PingerResult
{
public string HostOrAddress {get;set;}
public IPStatus Status {get;set;}
}
public class Pinger
{
public async Task<PingerResult> SendPingAsync(string hostOrAddress, int timeout, CancellationToken token)
{
// Check if timeout has occurred
token.ThrowIfCancellationRequested();
IPStatus status = default;
try
{
var reply = await SendPingInternal(hostOrAddress, timeout, token);
status = reply.Status;
}
catch (PingException)
{
status = IPStatus.Unknown;
}
return new PingerResult
{
HostOrAddress = hostOrAddress,
Status = status
};
}
// Wrap the legacy EAP pattern offered by Ping.
private Task<PingReply> SendPingInternal(string hostOrAddress, int timeout, CancellationToken cancelToken)
{
var tcs = new TaskCompletionSource<PingReply>();
if (cancelToken.IsCancellationRequested)
{
tcs.TrySetCanceled();
}
else
{
using (var ping = new Ping())
{
ping.PingCompleted = (object sender, PingCompletedEventArgs e) =>
{
if (!cancelToken.IsCancellationRequested)
{
if (e.Cancelled)
{
tcs.TrySetCanceled();
}
else if (e.Error != null)
{
tcs.TrySetException(e.Error);
}
else
{
tcs.TrySetResult(e.Reply);
}
}
};
cancelToken.Register(() => { tcs.TrySetCanceled(); });
ping.SendAsync(hostOrAddress, timeout, new object());
}
};
return tcs.Task;
}
}
EDIT:
I've just noticed in the comments that you mention pinging "all 1391". At this point I would look to throttle the number of pings that are sent concurrently using a SemaphoreSlim
. See this blog post (from a long time ago!) that outlines the approach: https://devblogs.microsoft.com/pfxteam/implementing-a-simple-foreachasync/