Thursday, March 16, 2017

Transient fault handling and retry strategy



When working with web services in the middleware layer (say, a subscriber to a service bus queue), it is wise to use a retry strategy to handle transient faults such as connectivity issues and timeouts. Retrying immediately does not help much, because the resource is likely to still be unavailable. The code below demonstrates the ExponentialBackoff strategy. Exponential backoff means the wait between retries grows exponentially, for example roughly 2 seconds, then 4 seconds, then 8 seconds.

Explanation of the ExponentialBackoff parameters:

var retryStrategy = new ExponentialBackoff(3, TimeSpan.FromSeconds(2),
                    TimeSpan.FromSeconds(20), TimeSpan.FromSeconds(1));
a)      first param - number of retries (3 in this case).
b)      second param - minimum backoff time (2 seconds in this case), i.e. the shortest time to wait before a retry.
c)      third param - maximum backoff time (20 seconds in this case). It caps the wait before any single retry, however large the exponential term grows. With a nominal 2, 4, 8 second sequence, the third retry would start about 2 + 4 + 8 = 14 seconds after the first failure.
d)      fourth param - delta used to add some randomness to the wait; otherwise all clients would be retrying simultaneously. In this case the delta is 1 second, so the actual waits will not be exactly 2, 4, 8 seconds and will vary slightly from run to run. The exact figures depend on the library's internal formula.
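
To make the role of each parameter concrete, here is a minimal sketch of one simple way such a wait could be computed. It is an idealized illustration and not the library's actual implementation, whose formula may differ in detail. The BackoffSketch class, its Delay and RandomJitter methods, and the jitter parameter are hypothetical names introduced only for this example.

```csharp
using System;

// Hypothetical helper for illustration only; not part of EnterpriseLibrary.
public static class BackoffSketch
{
    // attempt is 0-based: 0 for the first retry, 1 for the second, and so on.
    // jitter is the random offset; pass TimeSpan.Zero to see the nominal wait.
    public static TimeSpan Delay(int attempt, TimeSpan minBackoff,
                                 TimeSpan maxBackoff, TimeSpan jitter)
    {
        // Double the minimum backoff on every attempt.
        double nominalMs = minBackoff.TotalMilliseconds * Math.Pow(2, attempt);
        double withJitterMs = nominalMs + jitter.TotalMilliseconds;

        // Keep the result between the minimum and maximum backoff.
        double clampedMs = Math.Max(minBackoff.TotalMilliseconds,
            Math.Min(withJitterMs, maxBackoff.TotalMilliseconds));
        return TimeSpan.FromMilliseconds(clampedMs);
    }

    // Picks a random jitter in the range [-delta, +delta].
    public static TimeSpan RandomJitter(Random random, TimeSpan delta)
    {
        double ms = (random.NextDouble() * 2.0 - 1.0) * delta.TotalMilliseconds;
        return TimeSpan.FromMilliseconds(ms);
    }
}
```

With the values used in this post and no jitter, this sketch gives waits of 2, 4 and 8 seconds for the three retries. A fifth retry would nominally be 32 seconds, so it would be capped at the 20 second maximum.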


Please find the code files and detailed steps below:


1)      Add the EnterpriseLibrary.TransientFaultHandling NuGet package to the project.
2)      Create the following class (to specify which errors/exceptions should be retried):
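using System;
using System.Net;
using Microsoft.Practices.EnterpriseLibrary.TransientFaultHandling; // namespace in version 6.x of the package

public class TransientErrorDetectionStrategy : ITransientErrorDetectionStrategy
{
    // Add the list of exceptions that you consider transient.
    public bool IsTransient(Exception ex)
    {
        if (ex is WebException)
            return true;
        if (ex is FakeTimeoutException)
            return true;
        return false;
    }
}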
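
Treating every WebException as transient also retries failures that are unlikely to go away on their own, such as an HTTP error response. As a hedged sketch of a narrower check, the helper below looks at WebException.Status and only accepts a few connectivity-related values. The TransientChecks class and the IsTransientWebError method are hypothetical names, and the choice of status values is an assumption to adapt to your own service.

```csharp
using System;
using System.Net;

// Hypothetical helper for illustration only.
public static class TransientChecks
{
    // Returns true only for WebExceptions whose status suggests a
    // temporary connectivity problem. The list of statuses is an assumption.
    public static bool IsTransientWebError(Exception ex)
    {
        var webEx = ex as WebException;
        if (webEx == null)
            return false;

        switch (webEx.Status)
        {
            case WebExceptionStatus.Timeout:
            case WebExceptionStatus.ConnectFailure:
            case WebExceptionStatus.NameResolutionFailure:
            case WebExceptionStatus.ConnectionClosed:
                return true;
            default:
                return false;
        }
    }
}
```

A check like this could be called from IsTransient in place of the plain "ex is WebException" test.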

3)      Create a FakeTimeoutException (for demo/testing purposes):
using System;

public class FakeTimeoutException : Exception
{
    public FakeTimeoutException(string msg) : base(msg)
    {
    }
}

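4)      Create the main class and method as shown below:

using System;
using System.Net;
using Microsoft.Practices.EnterpriseLibrary.TransientFaultHandling; // namespace in version 6.x of the package

class Program
{
    static void Main(string[] args)
    {
        // 3 retries, 2 s minimum backoff, 20 s maximum backoff, 1 s delta.
        var retryStrategy = new ExponentialBackoff(3, TimeSpan.FromSeconds(2),
            TimeSpan.FromSeconds(20), TimeSpan.FromSeconds(1));
        var errorDetectionStrategy = new TransientErrorDetectionStrategy();

        var retryPolicy = new RetryPolicy(errorDetectionStrategy, retryStrategy);

        // The action is re-run whenever it throws an exception that
        // TransientErrorDetectionStrategy classifies as transient.
        retryPolicy.ExecuteAction(() =>
            ExecuteHTTPGET("https://microsoft.sharepoint.com"));
    }

    private static void ExecuteHTTPGET(string requestUri)
    {
        // Print the time of each attempt so the backoff is visible.
        Console.WriteLine(DateTime.Now);

        // Demo only: always fail so that every attempt is retried.
        // Remove this line to issue the real request below.
        throw new FakeTimeoutException("fake timeout");

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(requestUri);
        request.KeepAlive = false;
        request.Method = "GET";

        // Dispose the response even if reading the status code fails.
        using (HttpWebResponse webResponse = (HttpWebResponse)request.GetResponse())
        {
            int requestStatus = (int)webResponse.StatusCode;
        }
    }
}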

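5)     Run the program and watch the timestamps in the console: one is printed per attempt, and the exception is rethrown once the retries are exhausted. Then try replacing FakeTimeoutException with some other exception type that the detection strategy does not treat as transient, and compare the behavior. The call should then fail on the first attempt, with no retries.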
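
The difference between the two runs in step 5 can be illustrated with a small hand-rolled retry loop. This is only a sketch of the general idea, not how RetryPolicy is implemented. The RetrySketch class, its Execute method, and the isTransient and delayFor parameters are hypothetical names introduced for this example.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration only; not part of EnterpriseLibrary.
public static class RetrySketch
{
    // Runs the action, retrying up to maxRetries times while isTransient
    // returns true for the thrown exception. Any other exception, or a
    // transient one after the last retry, propagates to the caller.
    // Returns the number of retries that were performed.
    public static int Execute(Action action, Func<Exception, bool> isTransient,
                              int maxRetries, Func<int, TimeSpan> delayFor)
    {
        int retries = 0;
        while (true)
        {
            try
            {
                action();
                return retries;
            }
            catch (Exception ex) when (isTransient(ex) && retries < maxRetries)
            {
                // Wait before the next attempt; delayFor receives the 0-based retry index.
                Thread.Sleep(delayFor(retries));
                retries++;
            }
        }
    }
}
```

An action that keeps throwing a transient exception is attempted maxRetries + 1 times before the exception reaches the caller, while a non-transient exception reaches the caller after a single attempt.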