Orleans: Grain does not deactivate on unhandled exception

Created on 3 Oct 2017 · 4 Comments · Source: dotnet/orleans

Consider the following grain:

using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Orleans;

namespace OrleansTests
{
    public class LifecicleTesterGrain : Grain, ILifecicleTesterGrain
    {
        private int _inMemoryCounter = 0;

        /// <inheritdoc />
        public override async Task OnActivateAsync()
        {
            Trace.WriteLine($"{nameof(LifecicleTesterGrain)} Activated :: {this.GetPrimaryKeyLong()}");
            await base.OnActivateAsync();
        }

        /// <inheritdoc />
        public override Task OnDeactivateAsync()
        {
            Trace.WriteLine($"{nameof(LifecicleTesterGrain)} Deactivated :: {this.GetPrimaryKeyLong()}");
            return base.OnDeactivateAsync();
        }

        /// <summary>
        /// This method is designed to test grain deactivation when an exception is unhandled. We expect that calling this method multiple times
        /// on the same grain virtual instance would result in the counter being reset to 0 every time the exception is thrown.
        /// </summary>
        public Task<int> TestDeactivationOnThrow()
        {
            Trace.WriteLine($"{nameof(LifecicleTesterGrain)} TestDeactivationOnThrowing.Start :: {this.GetPrimaryKeyLong()}");

            // Increase the in memory counter.
            _inMemoryCounter++;

            // Maybe throw an exception (roughly one call in three).
            if (new Random().Next(99) > 66)
            {
                Trace.WriteLine($"{nameof(LifecicleTesterGrain)} TestDeactivationOnThrowing.Throwing :: {this.GetPrimaryKeyLong()}");

                throw new Exception("This is a random exception designed to test grain deactivation.");
            }

            Trace.WriteLine($"{nameof(LifecicleTesterGrain)} TestDeactivationOnThrowing.End :: {this.GetPrimaryKeyLong()}");

            // Return the current value of the in memory counter.
            return Task.FromResult(_inMemoryCounter);
        }
    }

    public interface ILifecicleTesterGrain : IGrainWithIntegerKey
    {
        Task<int> TestDeactivationOnThrow();
    }

}

Given our current understanding of how Orleans behaves, we would expect that if an exception escapes the Grain the Orleans Runtime would automatically deactivate the Grain to prevent it from running in an inconsistent state.

However, running the example above, we observed that the counter is never reset to zero, which means the Grain is not deactivated when the exception is thrown. This is also verifiable by reading the trace logs.

Questions:

  1. Should the Grain be deactivated when an exception escapes the Grain, that is, when it is not caught and handled within the Grain context?
  2. If not, what behavior should we expect from a Grain in the case of an unhandled exception?

A bit of context: One of our grains went into an inconsistent state after failing to write to storage. It acquired a wrong ETag and continued to run for a few hours without persisting its state. From the front end perspective the grain was healthy, since the in-memory state was consistent all the time. After a few hours it was finally deactivated and all the state changes were lost.

documentation

All 4 comments

In 1.4.0:

  1. No
  2. The exception is propagated back to the caller.

Essentially, this is what an object in .NET would do: continue living even if some method throws an exception. Consider what would happen if you had a grain method that accepted user input and threw an exception when the input was invalid. If every exception caused deactivation, the grain would reset every time the user entered invalid input. Exceptions aren't always caused by system faults, and the cause of an exception is often not resolved by restarting a grain.
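To make the comparison with a plain .NET object concrete, here is a minimal sketch that does not use Orleans at all. The `Counter` class and its `Add` method are hypothetical names invented for this illustration. It only shows that an instance keeps its in-memory state after one of its methods throws, which is the behavior described above for grains in 1.4.0.

```csharp
using System;

// Hypothetical plain .NET class (not an Orleans grain), used only for illustration.
public class Counter
{
    private int _count;

    // Increments the counter first, then rejects invalid input by throwing.
    public int Add(string input)
    {
        _count++;
        if (string.IsNullOrWhiteSpace(input))
        {
            throw new ArgumentException("Input must not be empty.", nameof(input));
        }
        return _count;
    }
}

public static class Program
{
    public static void Main()
    {
        var counter = new Counter();
        counter.Add("first");   // count is now 1

        try
        {
            counter.Add("");    // count becomes 2, then the call throws
        }
        catch (ArgumentException ex)
        {
            // The exception reaches the caller; the object is not destroyed.
            Console.WriteLine($"Caller saw: {ex.Message}");
        }

        // The same instance is still alive and kept its state: prints 3.
        Console.WriteLine(counter.Add("third"));
    }
}
```

The counter reaches 3 rather than restarting at 1, which mirrors what the issue author observed with `_inMemoryCounter` in the grain.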

It would be a violation of the principle of least surprise if an activation was deactivated on every exception.

In 1.5.0, a change was introduced to deactivate an activation when InconsistentStateException or a subclass of that exception is thrown from a grain method (i.e., not handled by the grain or any IGrainCallFilter).
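For exceptions that the runtime does not treat this way, a grain can ask for deactivation itself. The following is a sketch only, not an official recommendation from this thread: `CounterState`, `ICounterGrain`, and `CounterGrain` are hypothetical names, and the code assumes a project that references the Orleans packages and has a storage provider configured for the grain. It relies on `Grain<TState>.WriteStateAsync()` and `Grain.DeactivateOnIdle()`.

```csharp
using System;
using System.Threading.Tasks;
using Orleans;

// Hypothetical state type, for illustration only.
public class CounterState
{
    public int Value { get; set; }
}

// Hypothetical grain interface, for illustration only.
public interface ICounterGrain : IGrainWithIntegerKey
{
    Task<int> Increment();
}

// Assumes a storage provider is configured for this grain.
public class CounterGrain : Grain<CounterState>, ICounterGrain
{
    public async Task<int> Increment()
    {
        State.Value++;
        try
        {
            await WriteStateAsync();
        }
        catch (Exception)
        {
            // The in-memory state may now be ahead of what is in storage.
            // Request deactivation once the current call completes, so the next
            // call creates a fresh activation that reloads state from storage.
            DeactivateOnIdle();
            throw; // still propagate the failure to the caller
        }
        return State.Value;
    }
}
```

The design choice here is to deactivate only on a failed write, where in-memory and persisted state can diverge, rather than on every exception, so that ordinary validation errors do not reset the grain.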

EDIT: Perhaps this should be highlighted in a doc page about error handling?

Thanks Reuben,

For some reason we assumed the grains would deactivate in case of an exception. The documentation does not explicitly say that this would happen, nor that it wouldn't. So an update to the documentation would be nice. =)

Personally I don't have a preference for one way or the other, and I can find arguments to defend either approach.

We mention InconsistentStateException in http://dotnet.github.io/orleans/Documentation/Core-Features/Grain-Persistence.html#storage-provider-semantics but never describe handling of it. http://dotnet.github.io/orleans/Documentation/Benefits.html has "Automatic propagation of errors" as one of the key features of the Orleans model, but we don't have a dedicated doc page on error handling. Maybe we need to add one to the Core Features section?

The notion of grains failing due to an exception is something that people probably get from process crashes in Erlang, and maybe from similar behavior in Akka (I'm not familiar with Akka).

I added a failure handling page to our tutorials a while back, at https://dotnet.github.io/orleans/Tutorials/Failure-Handling.html

But it is not good, and it is written in the style of the tutorial docs, built around an example instead of describing the concept.

Maybe we need a page describing how and when a grain can fail, or maybe the life-cycle part of the grain doc (here: https://dotnet.github.io/orleans/Documentation/Getting-Started-With-Orleans/Grains.html) should mention that grains don't fail if an exception is thrown in them.
