In the past I used WaitAndRetryForeverAsync, which was wrong, because I believe the Retry pattern is supposed to handle only transient faults, such as rate limiting (429 status code), etc. When the API I was subscribing to went offline for service maintenance, which took about 25 minutes, WaitAndRetryForeverAsync kept retrying forever at a constant interval (not exponential, which doesn't really matter in this case). That triggered some firewall rules on the API side, and my IP was blocked for a while.
I'm trying to do what Nick Chapsas says in his Circuit Breaker video, i.e. if 5 retries fail, we assume that the service is in maintenance. The retries are then re-enabled after 30 minutes, and so on until it reconnects, even if that takes hours (depending on how long the service maintenance is).
The question is: how do I apply that circuit breaker policy after WaitAndRetry's failure?
using System;
using System.Threading.Tasks;
using Polly;

/// <summary>
/// This class provides Transient Fault Handling extension methods.
/// </summary>
internal static class Retry
{
    public static void Do(Action action, TimeSpan retryInterval, int retryCount = 3)
    {
        _ = Do<object?>(() =>
        {
            action();
            return null;
        }, retryInterval, retryCount);
    }

    public static async Task DoAsync(Func<Task> action, TimeSpan retryInterval, int retryCount = 3)
    {
        _ = await DoAsync<object?>(async () =>
        {
            await action();
            return null;
        }, retryInterval, retryCount);
    }

    public static T Do<T>(Func<T> action, TimeSpan retryWait, int retryCount = 3)
    {
        // Retry on any exception, waiting a constant interval between attempts.
        var policyResult = Policy
            .Handle<Exception>()
            .WaitAndRetry(retryCount, retryAttempt => retryWait)
            .ExecuteAndCapture(action);

        if (policyResult.Outcome == OutcomeType.Failure)
        {
            throw policyResult.FinalException;
        }

        return policyResult.Result;
    }

    public static async Task<T> DoAsync<T>(Func<Task<T>> action, TimeSpan retryWait, int retryCount = 3)
    {
        // Async counterpart of Do<T>.
        var policyResult = await Policy
            .Handle<Exception>()
            .WaitAndRetryAsync(retryCount, retryAttempt => retryWait)
            .ExecuteAndCaptureAsync(action);

        if (policyResult.Outcome == OutcomeType.Failure)
        {
            throw policyResult.FinalException;
        }

        return policyResult.Result;
    }
}
Answer:
Simplest solution
If you want to have the following delay sequences:
- 15sec, 15sec, 15sec, 15sec, 15sec, 30min
- 15sec, 15sec, 15sec, 15sec, 15sec, 30min
- etc.
Then you can achieve this by introducing the following helper method:
static IEnumerable<TimeSpan> GetSleepDuration()
{
    while (true)
    {
        for (int i = 0; i < 5; i++)
        {
            yield return TimeSpan.FromSeconds(15);
        }
        yield return TimeSpan.FromMinutes(30);
    }
}
The usage is pretty simple:
var sleepDurationProvider = GetSleepDuration().GetEnumerator();
var retry = Policy
    ...
    .WaitAndRetryForever(_ =>
    {
        sleepDurationProvider.MoveNext();
        return sleepDurationProvider.Current;
    });
Under this SO question I have shown a somewhat similar solution.
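To make the resulting delay sequence visible, here is a minimal sketch that repeats the helper and prints its first 12 items. It needs no Polly reference; the Program class and the cut-off of 12 are only there for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
    // Same generator as above: five short delays, then one long pause, repeated forever.
    static IEnumerable<TimeSpan> GetSleepDuration()
    {
        while (true)
        {
            for (int i = 0; i < 5; i++)
            {
                yield return TimeSpan.FromSeconds(15);
            }
            yield return TimeSpan.FromMinutes(30);
        }
    }

    static void Main()
    {
        // Take only the first 12 items (two full cycles); the sequence itself never ends.
        foreach (var delay in GetSleepDuration().Take(12))
        {
            Console.WriteLine(delay);
        }
    }
}
```

This prints 00:00:15 five times followed by 00:30:00, and then the same six lines again. Because the iterator is lazy, the infinite while loop is harmless as long as the consumer pulls one item at a time, which is what the MoveNext call inside the sleep duration provider does.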
Sophisticated solution
By default the policies are unaware of each other, even if they are chained together via PolicyWrap. One way to exchange information between policies is the usage of Polly's Context.
First, let's define the Circuit Breaker:
const string SleepDurationKey = "Broken";

IAsyncPolicy<HttpResponseMessage> GetCircuitBreakerPolicy()
{
    return Policy<HttpResponseMessage>
        .HandleResult(res => res.StatusCode == HttpStatusCode.TooManyRequests)
        .CircuitBreakerAsync(6, TimeSpan.FromMinutes(30),
            onBreak: (dr, ts, ctx) => ctx[SleepDurationKey] = ts,
            onReset: (ctx) => ctx.Remove(SleepDurationKey));
}
- It breaks after 6 failed attempts and remains broken for 30 minutes
- When the CB transitions to Open, the Context will contain the sleep duration
- When the CB transitions to Closed, the Context will no longer contain the sleep duration
Please be aware that with this setup the delay sequence will look like this:
- 15sec, 15sec, 15sec, 15sec, 15sec, 30min, 30min, 30min, etc.
And finally, let's define the retry policy:
IAsyncPolicy<HttpResponseMessage> GetRetryPolicy()
{
    return Policy<HttpResponseMessage>
        .HandleResult(res => res.StatusCode == HttpStatusCode.TooManyRequests)
        .Or<BrokenCircuitException>()
        .WaitAndRetryForeverAsync((_, ctx) =>
            ctx.ContainsKey(SleepDurationKey) ? (TimeSpan)ctx[SleepDurationKey] : TimeSpan.FromSeconds(15));
}
Under this SO question I have shown a somewhat similar solution.
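The two policies only share the Context when they run inside the same wrapped execution. The following is a rough sketch of that wiring, assuming Polly v7. Everything that differs from the definitions above is an assumption made for the demo: the break threshold (2 instead of 6), the millisecond durations, and the fakeCall delegate that stands in for the real HTTP request and returns 429 three times before returning 200.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;

static class Program
{
    const string SleepDurationKey = "Broken";

    // Assumption: durations are shortened so the demo finishes quickly.
    static readonly TimeSpan BreakDuration = TimeSpan.FromMilliseconds(300);
    static readonly TimeSpan RetryDelay = TimeSpan.FromMilliseconds(50);

    static IAsyncPolicy<HttpResponseMessage> GetCircuitBreakerPolicy() =>
        Policy<HttpResponseMessage>
            .HandleResult(res => res.StatusCode == HttpStatusCode.TooManyRequests)
            .CircuitBreakerAsync(2, BreakDuration,
                onBreak: (dr, ts, ctx) => ctx[SleepDurationKey] = ts,
                onReset: ctx => ctx.Remove(SleepDurationKey));

    static IAsyncPolicy<HttpResponseMessage> GetRetryPolicy() =>
        Policy<HttpResponseMessage>
            .HandleResult(res => res.StatusCode == HttpStatusCode.TooManyRequests)
            .Or<BrokenCircuitException>()
            .WaitAndRetryForeverAsync((_, ctx) =>
                ctx.ContainsKey(SleepDurationKey) ? (TimeSpan)ctx[SleepDurationKey] : RetryDelay);

    static async Task Main()
    {
        // Retry is the outer policy, the circuit breaker the inner one.
        var combinedPolicy = Policy.WrapAsync(GetRetryPolicy(), GetCircuitBreakerPolicy());

        // Hypothetical stand-in for the real HTTP call: 429 three times, then 200.
        int calls = 0;
        Func<Task<HttpResponseMessage>> fakeCall = () =>
        {
            calls++;
            var status = calls <= 3 ? HttpStatusCode.TooManyRequests : HttpStatusCode.OK;
            Console.WriteLine($"call #{calls} -> {(int)status}");
            return Task.FromResult(new HttpResponseMessage(status));
        };

        // One ExecuteAsync call means one Context instance shared by both policies.
        var response = await combinedPolicy.ExecuteAsync(fakeCall);
        Console.WriteLine($"final status: {(int)response.StatusCode}");
    }
}
```

The stub is invoked four times in total and the last line reports status 200. While the circuit is open, the breaker throws BrokenCircuitException without calling the delegate, and the retry policy simply waits again, so the number of real calls stays the same even if a retry fires slightly too early.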
UPDATE #1
If you want the same delay sequences as in the simplest solution section, but you want to use a Circuit Breaker for that, then you have to use a workaround.
The workaround is necessary because there is a HalfOpen state as well, not just Closed and Open. The above delay sequences would work out of the box if we had only the Closed and Open states. But after the break duration the Circuit Breaker transitions into HalfOpen (to allow a probe) rather than to Closed.
At first glance the Advanced Circuit Breaker could provide this "auto-reset feature" because of its extra samplingDuration parameter. But unfortunately the ACB also has a HalfOpen state.
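The state transitions can be observed directly through the CircuitState property. The following is only a sketch, assuming Polly v7 and the synchronous CircuitBreaker overload. The threshold of 2, the 500 ms break duration, and the failingCall and succeedingCall delegates are made up for the demo.

```csharp
using System;
using System.Threading;
using Polly;
using Polly.CircuitBreaker;

static class Program
{
    static void Main()
    {
        // Assumption: a tiny threshold and break duration keep the demo short.
        CircuitBreakerPolicy breaker = Policy
            .Handle<InvalidOperationException>()
            .CircuitBreaker(2, TimeSpan.FromMilliseconds(500));

        // Hypothetical calls standing in for the remote service.
        Action failingCall = () => throw new InvalidOperationException("service down");
        Action succeedingCall = () => { };

        // Two consecutive handled failures trip the breaker.
        for (int i = 0; i < 2; i++)
        {
            try { breaker.Execute(failingCall); }
            catch (InvalidOperationException) { /* expected while the service is down */ }
        }
        Console.WriteLine($"after failures: {breaker.CircuitState}");

        // Wait until the break duration has clearly elapsed.
        Thread.Sleep(TimeSpan.FromSeconds(1));
        Console.WriteLine($"after break duration: {breaker.CircuitState}");

        // A successful probe call is what closes the circuit again.
        breaker.Execute(succeedingCall);
        Console.WriteLine($"after successful probe: {breaker.CircuitState}");
    }
}
```

With Polly v7 I would expect the three lines to report Open, HalfOpen and Closed, in that order. If the probe fails instead, the breaker goes straight back to Open for another full break duration, which is where the 30min, 30min, 30min sequence comes from.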
The workaround is that we force the Circuit Breaker to transition back to Closed by explicitly calling the Reset function on it.
So, the solution is the following:
var circuitBreaker = Policy<HttpResponseMessage>
    .HandleResult(res => res.StatusCode == HttpStatusCode.TooManyRequests)
    .CircuitBreakerAsync(6, TimeSpan.FromSeconds(10),
        onBreak: (dr, ts, ctx) => ctx[SleepDurationKey] = ts,
        onReset: (ctx) => { });
- I have removed the context clearing logic from onReset to avoid a race condition
var retry = Policy<HttpResponseMessage>
    .HandleResult(res => res.StatusCode == HttpStatusCode.TooManyRequests)
    .Or<BrokenCircuitException>()
    .WaitAndRetryForeverAsync((_, ctx) =>
    {
        if (ctx.ContainsKey(SleepDurationKey))
        {
            var sleepDuration = (TimeSpan)ctx[SleepDurationKey];
            var resetSignal = new CancellationTokenSource(sleepDuration.Add(TimeSpan.FromSeconds(-1)));
            resetSignal.Token.Register(() => { ctx.Remove(SleepDurationKey); circuitBreaker.Reset(); });
            return sleepDuration;
        }
        return TimeSpan.FromSeconds(15);
    });
- If the key is present, I retrieve the sleep duration from the context
- I create a timer which should be triggered just before the CB automatically transitions to HalfOpen
- When it triggers, it clears the context and calls Reset on the circuit breaker
With this "trick" we could skip the HalfOpen state. Please note that even though the retry has an explicit reference to the circuitBreaker, you still have to use the PolicyWrap to make it work.
var combinedPolicy = Policy.WrapAsync(retry, circuitBreaker);
var result = await combinedPolicy.ExecuteAsync(...);