Runtime: Add async support to System.Lazy

Created on 30 Sep 2018 · 32 Comments · Source: dotnet/runtime

Lazy is a very useful class. With async code becoming more and more common we should make Lazy async aware. Doing that requires the following changes:

  1. Make the constructor accept Task-returning factory delegates.
  2. Add an async accessor method: Task<T> GetValueAsync()

If the synchronous T Value { get; } property is used then simply block on the task.

Is ValueTask appropriate to use here?

api-needs-work area-System.Runtime

All 32 comments

I agree lazy async support is useful, but I'm very hesitant to see it added into the same type. Doing so would allow existing code to use the same type without, for example, changing the type of the field storing the lazy object, but it would still require code changes, e.g. to use the new constructor and the new method or property, at which point that benefit decreases. And adding it into the same type necessarily adds both sync-over-async (i.e. if you construct the Lazy with an async delegate but then use the synchronous Value property) and async-over-sync (i.e. if you construct the Lazy with a sync delegate but then use the task-returning property, as presumably you'd want to queue the invocation and/or wait for the value to be available in case it was long-running). The existing thread safety modes aren't necessarily as applicable, or at the very least would need to be rationalized. Cancellation likely also becomes relevant.

If we want to add async lazy support, I think it should be added as a dedicated type, and design thought put into what exactly it should look like, potentially with inspiration drawn from Roslyn's AsyncLazy and the scenarios it's supporting, issues faced, etc.

I agree that it shouldn't be added to the existing Lazy implementation, as it's already too complicated and does several things.

It's fairly easy to make a sort of AsyncLazy using an extension method, like so:

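```c#
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

static class Program
{
    static async Task Main(string[] args)
    {
        // The factory runs, and its task is cached, on the first access to Value.
        var lazyGreeting = new Lazy<Task<string>>(createGreeting);

        // Awaiting the Lazy itself works because of the GetAwaiter extension below.
        string greeting = await lazyGreeting;
        Console.WriteLine($"{greeting} world!");
    }

    static async Task<string> createGreeting()
    {
        await Task.Delay(400);
        return "hello";
    }

    // Makes any Lazy<Task<T>> awaitable by forwarding to the cached task's awaiter.
    public static TaskAwaiter<T> GetAwaiter<T>(this Lazy<Task<T>> asyncTask)
    {
        return asyncTask.Value.GetAwaiter();
    }
}
```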

The only limitation here is that the constructor call is rather ugly (double nested generic) and it really only supports the ExecutionAndPublication mode, since it will cache the Task right away rather than wait for the result of the task. IMO Lazy doesn't do this well either right now, as mentioned in the referenced issue, so work is needed in this area anyways.

The reason I agree we should make a separate class is because a Task already caches the result (value or exception), just like Lazy does, but it is eager rather than lazy. Maybe it would make more sense to create a LazyTask rather than an AsyncLazy? Something that waits until it is awaited before running the code it is waiting for? That is, instead of making Lazy understand async and tasks. I guess with the new support for ValueTask and other custom awaitable classes this should be possible.

+1 for LazyTask, I think

@GSPP can you turn this into a formal API proposal based on the feedback.

Er ... I think LazyTask is a nice design part, but I think it would be polluting the base library with an idiom. That seems like a pretty long stretch to justify including in a BCL.

Heh also ... Wow! Then every class in the library can have a Task version!

What is a Lazy that consumes a Task?? (Seems to be in fact just a Task.) What is a Task that produces a Lazy?? Bizarre!

It's pretty tough to wrap around the contract in a base Api --- it seems to pivot out into many unknowns. It also becomes something that might spawn threads right in the Api.

2.00 US cents.

I'm not sure what a formal API proposal would entail. But here is my attempt:

```c#
// API shape only; member bodies are omitted.
class AsyncLazy<T>
{
    AsyncLazy(Func<Task<T>> valueFactory, LazyThreadSafetyMode mode);

    bool IsValueCreated { get; }
    Task<T> GetValueAsync();
    bool TryGetValue(out T value); // non-blocking

    string ToString(); // the same as with Lazy<T>
}
```

I dropped the (legacy?) bool isThreadSafe constructor. I find that API design bad because it's not clear what that bool does. Just pass in an enum.

I dropped the constructors that do not take a factory delegate. I assume these just use the default constructor. There are no async constructors so this does not apply. (In my opinion having these constructors in the first place was a mistake as well but it does not matter for this issue.)

This means that there's just a single constructor. I think that's a good thing. In my opinion the order of arguments is bad because the lambda could be a fairly long string and should be last. But I kept it this way to keep it consistent with Lazy<T>.

My case for having TryGetValue: It enables a high-performance fast path in code that really needs it. It's very easy to implement, doesn't clutter the class much and it seems useful to have.

What to do with LazyThreadSafetyMode? Anything less than ExecutionAndPublication allows for creating multiple tasks. This was reasonable with synchronous code because it enables a performance optimization. (Was there another reason for it? I have never had the need to use any of these less protected modes. If performance is so important then I'd not use the allocating Lazy but use LazyInitializer.) But async code can be assumed to be fairly heavy weight. Likely, it contains IO. It seems the best course of action is to just drop LazyThreadSafetyMode entirely. (In my proposal I left it in until this issue is decided.)

The factory delegate will be called under a lock as before to ensure ExecutionAndPublication. Likely, the delegate returns very quickly and the lock is left very quickly.

There is no async-over-sync or sync-over-async here which was a point mentioned in this thread.

I decided to not make GetValueAsync cancellable. I do not see a need for that.

I decided to not add anything specific to make initialization cancellable. User code can see to it that the initialization task is aborted somehow and completes quickly in that case.

The task that GetValueAsync returns should be cached.

What should happen if the factory delegate throws (in contrast to the task becoming faulted)? I propose we treat it the same way as if the initialization task had faulted. What should happen if the initialization task faults or becomes cancelled? We could just forward the exception into the task returned by GetValueAsync. That's consistent with how Lazy does it. Lazy uses ExceptionDispatchInfo to rethrow the exception. Alternatively, we could wrap exceptions in the style of TargetInvocationException.
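
As a rough illustration of the proposal above, here is a minimal sketch layered on `Lazy<Task<T>>`. It is not an existing BCL type. It assumes `LazyThreadSafetyMode` is dropped in favour of ExecutionAndPublication only, that a factory which throws (or returns null) is reported as a faulted task, and that `IsValueCreated` means "the initialization task completed successfully"; all three are assumptions made for the example, since the thread leaves them open.

```c#
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of the proposed shape; not part of the BCL.
public sealed class AsyncLazy<T>
{
    private readonly Lazy<Task<T>> _task;

    public AsyncLazy(Func<Task<T>> valueFactory)
    {
        if (valueFactory is null) throw new ArgumentNullException(nameof(valueFactory));

        // Lazy<T> invokes this wrapper at most once, under its own lock.
        _task = new Lazy<Task<T>>(() =>
        {
            try
            {
                return valueFactory()
                    ?? throw new InvalidOperationException("The value factory returned null.");
            }
            catch (Exception ex)
            {
                // Assumption: a synchronous throw is surfaced like a faulted task.
                return Task.FromException<T>(ex);
            }
        }, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // Assumption: "created" means the initialization task ran to completion.
    public bool IsValueCreated =>
        _task.IsValueCreated && _task.Value.Status == TaskStatus.RanToCompletion;

    // Every caller receives the same cached task.
    public Task<T> GetValueAsync() => _task.Value;

    // Non-blocking fast path: never starts the factory and never waits.
    public bool TryGetValue(out T value)
    {
        if (IsValueCreated)
        {
            value = _task.Value.Result; // already completed, so this does not block
            return true;
        }

        value = default!;
        return false;
    }
}
```

Because the wrapper never lets an exception escape into `Lazy<T>`, the faulted task is cached like any other result; whether that is desirable is exactly the exception-caching question raised above.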

An open question is whether this should use ValueTask. Please comment on that since I don't have the experience to advocate for or against.

design thought put into what exactly it should look like, potentially with inspiration drawn from Roslyn's AsyncLazy and the scenarios it's supporting, issues faced, etc.

That's a good idea. Somebody needs to go do that now :)

I should say that I will not be able to take on this work. But I hope I made a useful contribution by initiating the discussion and writing up this proposal.

Feedback is very welcome.

To echo @sharwell's concern over on dotnet/corefx#36078, an AsyncLazy<T> type is super-hazardous in an application with a single-threaded SynchronizationContext such as a typical GUI app (or even ASP.NET). We have an implementation in microsoft/vs-threading that mitigates deadlocks that otherwise are super-easy to get into. Unless corefx can pick up JoinableTaskFactory and support that inside the AsyncLazy<T>, or find some way using inversion of dependency to allow for AsyncLazy<T> in corefx to be used safely by injecting that dependency to avoid the problems, I'd be very concerned if corefx should add support for async lazy values.

@AArnott So you're saying Roslyn implementation is incorrect? The problem with vs-threading is that it's VS specific. We need implementation that works anywhere (VS Code, VS4Mac, arbitrary app hosting Roslyn). Perhaps it needs to be parameterized to some extent. That's fine.

So you're saying Roslyn implementation is incorrect?

No. I'm just saying that Roslyn's implementation is not sufficiently generalized for corefx. It requires that Roslyn-esque threading rules be followed (e.g. absolutely no UI thread dependency).

The problem with vs-threading is that it's VS specific.

Why do you say that? The entire vs-threading library is completely independent of VS and it will always be so. It targets .NET Standard as well, and we run tests on all operating systems on both .NET Framework and .NET Core.

In fact it's already running in VS for Mac.

I guess I was confused by vs in vs-threading and VisualStudio in Microsoft.VisualStudio.Threading. :-|

@sharwell @jasonmalinowski Any reason why do we not use AsyncLazy from VS threading in Roslyn?

I guess I was confused by vs in vs-threading

Ya, that's a common ailment of the name, which of course changing at this point would be rather costly.

Any reason why do we not use AsyncLazy from VS threading in Roslyn?

The only feature Roslyn's implementation has that vs-threading's doesn't AFAIK is in multi-cancellation support. That is, yours will actually cancel the value factory if all the clients that requested the value cancel their requests. That's as I understand it anyway. And I'm not opposed to filling that gap either. I can't remember why we haven't done so already aside from lack of anyone asking for it.

No. I'm just saying that Roslyn's implementation is not sufficiently generalized for corefx. It requires that Roslyn-esque threading rules be followed (e.g. absolutely no UI thread dependency).

I'm not sure if Roslyn's AsyncLazy _specifically_ really states any threading policy: you could use it however you want, but synchronously waiting on tasks on a UI thread will deadlock just as badly as waiting on a regular TPL Task will. :smile: The only special case really is ours also supports a GetValue which will do a synchronous wait if there's already a true async path running. (But that's something we need in specialized cases.)

doesn't AFAIK is in multi-cancellation support. That is, yours will actually cancel the value factory if all the clients that requested the value cancel their requests.

Bingo: this is the one big reason we have ours, and why it gets so tricky!

multi-cancellation support

this is the one big reason we have ours, and why it gets so tricky!

Would you switch to the one in vs-threading if we added that feature?

you could use it however you want, but synchronously waiting on tasks on a UI thread will deadlock

Well, I can't use the Roslyn implementation and follow JTF rules, thereby avoiding those deadlocks. Even in Roslyn, synchronously blocking the UI thread is sometimes desirable. VS and other GUI apps have their reasons too. And while Roslyn has very meticulously held any UI thread dependencies from async tasks in such cases at bay, in general that's a very hard problem to solve (perhaps impossible with backward compat if you don't control everything). That's where a JTF-aware AsyncLazy implementation becomes crucial.

@AArnott taking into account the synchronization context and current task scheduler is a good point. Could we not just call the factory delegate under a null context? That's deterministic and likely what people want. It does not make sense to access the UI in a lazy initialization setting. Normally, initialization is about calling some service or computing something expensive.

I'm a bit confused about what hazards you are seeing. Are you merely concerned about the initialization tasks accessing the UI or is it something else?

Deadlock prevention in asynchronous code historically required global reasoning, which is all but impossible for large applications. Then when applications start refactoring code to use asynchronous operations where synchronous operations were used previously, the prior global reasoning effort is invalidated. @AArnott helped develop a library and set of rules that allow users to migrate applications from synchronous to asynchronous while only relying on local reasoning. The async lazy implementation from Roslyn does not adhere to the rules, so it cannot be used without falling back to global reasoning. If we want to create a generally usable AsyncLazy<T> type, we need to do one of the following:

  1. Define the operations on the type such that they can fit within the "threading rules" (this may be possible by simply disallowing the use of the synchronous GetValue() operation)
  2. Allow a library to somehow inject behavior into the global implementation of this type

I believe the second would establish a very bad precedent.

@sharwell I like your way of describing "local reasoning" vs. "global reasoning".

(this may be possible by simply disallowing the use of the synchronous GetValue() operation)

This would not be sufficient. Because even if an async method called await lazy.GetValueAsync(), that async context (perhaps several layers further down in the callstack) might decide/require to block the main thread till the work is complete. So ultimately whether the AsyncLazy exposes a sync code path or not, the requirement that it may be synchronous still exists.

Could we not just call the factory delegate under a null context?

No, for two reasons:

  1. First off, the value factory itself might need to switch to the UI thread. This may happen directly in the value factory or as an implementation detail of some code that it calls. Some values simply require the UI thread to compute. In other cases, AsyncLazy is used as a way to ensure lazy initialization of an entire class (counting on the side-effect of its evaluation rather than any resulting value) without deadlocking or initializing twice.
  2. AsyncLazy<T> is used in highly perf critical places, where an extra context switch is prohibitively expensive. Our AsyncLazy is written very carefully to invoke the value factory directly on the first caller's callstack to avoid unnecessary context switches. Although our ability to safely do this in the face of some very narrow race conditions has recently been called into question.

I'm a bit confused about what hazards you are seeing.

Consider a value factory that needs the UI thread to complete. It is invoked through AsyncLazy<T>.GetValueAsync (on an arbitrary thread). Now before the value factory is able to reach the UI thread (even if it started on the UI thread, but yielded before needing it), the UI thread decides to block until some async operation is completed. If that async operation will only complete after the value factory completes, then the value factory _must_ be allowed onto the UI thread to avoid a deadlock. This requires some careful dependency tracking for whatever is blocking the UI thread to know which work it needs to allow in to avoid a deadlock, but also avoid letting unrelated work in which can cause crashes, hangs, data corruption, etc. This is what the JoinableTaskFactory is all about, and why we need an AsyncLazy<T> that is JTF-aware in order to play by the rules and avoid such problems.

Would you switch to the one in vs-threading if we added that feature?

Hard to say: there's certain policy decisions Roslyn makes in our implementation (namely what to do in certain async + sync cases) which may or may not be appropriate in all cases. Also for example our AsyncLazy supports a non-caching mode because we have higher caching layers which is a crazy advanced use case.

@AArnott @sharwell I don't understand all the details, but it seems to me there are essentially two scenarios:

1) The code that needs to be executed to retrieve the value is not affinitized to any thread.
There is no need for JTF. Roslyn's AsyncLazy does not have issues when used in this scenario.

2) Part of the code must be executed on a specific (UI) thread.
Something like JTF needs to be used to coordinate the scheduling process-wide.

Is this a correct way of thinking about this?

I think that's close, but a bit too simple. It's not always clear whether a value factory does or does not have thread affinity.

By _default_, any async method becomes affinitized to the main thread if that's where it started (because of the captured context).
To truly have no thread affinity, you must add .ConfigureAwait(false) for all awaits in the value factory, and in the transitive closure of code that the value factory calls, and know that no code will try to explicitly switch to the UI thread. That can be fine for value factories where you own and are intimately familiar with all the code involved. But _in general_, having such confidence is impossible.

So I'd draw the line more like:

  1. Use any AsyncLazy<T> implementation when you're certain that the value factory never calls, awaits, or blocks on code that may require the main thread to complete.
  2. Use a JTF-aware AsyncLazy<T> in all other cases, or if you're uncertain.
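
To make the first rule concrete, a value factory with no thread affinity might look like the sketch below. `GreetingFactory`, `CreateGreetingAsync`, and `LookUpNameAsync` are hypothetical names invented for the illustration; the point is only that every await in the factory, and in everything it calls, uses `.ConfigureAwait(false)`.

```c#
using System;
using System.Threading.Tasks;

// Hypothetical example; all names are invented for illustration.
public static class GreetingFactory
{
    // Plain Lazy<Task<T>>: the factory starts on the first access to Value.
    public static readonly Lazy<Task<string>> Greeting =
        new Lazy<Task<string>>(CreateGreetingAsync);

    private static async Task<string> CreateGreetingAsync()
    {
        // ConfigureAwait(false): do not resume on the captured SynchronizationContext.
        await Task.Delay(50).ConfigureAwait(false);

        // The rule applies transitively, so the callee must follow it as well.
        string name = await LookUpNameAsync().ConfigureAwait(false);
        return $"hello {name}";
    }

    private static async Task<string> LookUpNameAsync()
    {
        await Task.Delay(50).ConfigureAwait(false);
        return "world";
    }
}
```

Even here, the code before the first await still runs on whichever thread first touches `Greeting.Value`, and a single missed `ConfigureAwait(false)` anywhere in the call graph reintroduces the affinity, which is why the second rule exists.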

Yes, these rules are basically what I had in mind. These rules are easily followed in pure server code, perhaps less easy in a UI app. It helps if you layer your code in a way that separates any UI logic to a higher layer and purely computational/IO logic to lower layer. Then you only need to worry about callbacks passed in from the higher layer. This is how Roslyn is layered, but I think it's generally good practice rather than Roslyn specific design.

I think it would make sense to provide an AsyncLazy in CoreFX that requires following these rules, while another AsyncLazy implementation lives in a UI oriented library like MS.VS.Threading. Various analyzers accompanying these implementations would help enforce the rules as much as possible.

@AArnott I believe I found a better set of rules for a type that could be used in many scenarios:

  1. Do not provide synchronous GetValue(). Users can either use Lazy<T> for this or join the result of GetValueAsync in an appropriate manner. This ensures that a sync-to-async transition does not occur inside of AsyncLazy<T>.
  2. Do not provide single-execute guarantees for the asynchronous initializer. It would instead have semantics similar to LazyInitializer.EnsureInitialized<T>(ref T, Func<T>), where the asynchronous value factory can be called more than once but ultimately only one value will be stored. This ensures an asynchronous operation started in one context is not joined in a different context, thus allowing simple async/await calls to be used without breaking JTF rules.

These rules are not without limitations, but they force the type into a category where it can be used with or without JTF.
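
The second rule can be sketched roughly as follows. This is a hypothetical type, not an existing API, and it assumes `T` is a reference type so that null can stand for "not yet initialized", mirroring `LazyInitializer.EnsureInitialized`. Each caller awaits its own factory invocation, and only the first published result is kept.

```c#
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: the factory may run more than once, but one value wins.
public sealed class PublishOnlyAsyncLazy<T> where T : class
{
    private readonly Func<Task<T>> _valueFactory;
    private T _value; // null until some caller publishes a result

    public PublishOnlyAsyncLazy(Func<Task<T>> valueFactory) =>
        _valueFactory = valueFactory ?? throw new ArgumentNullException(nameof(valueFactory));

    public async Task<T> GetValueAsync()
    {
        T existing = Volatile.Read(ref _value);
        if (existing != null) return existing;

        // Each caller runs and awaits the factory in its own context, so no
        // asynchronous operation started in one context is joined from another.
        T candidate = await _valueFactory();
        if (candidate == null)
            throw new InvalidOperationException("The value factory returned null.");

        // The first caller to publish wins; later candidates are discarded.
        return Interlocked.CompareExchange(ref _value, candidate, null) ?? candidate;
    }
}
```

Nothing is cached on failure in this sketch: an exception reaches only the caller whose factory invocation threw, and the next caller simply tries again.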

To truly have no thread affinity, you must add .ConfigureAwait(false) for all awaits in the value factory, and in the transitive closure of code that the value factory calls, and know that no code will try to explicitly switch to the UI thread. That can be fine for value factories where you own and are intimately familiar with all the code involved. But in general, having such confidence is impossible.

This helped me understand the issue.

In what way is the problem unique to async lazy? Should the normal Lazy<T> not be affected just as much?

Is it ever good design to have a lazy initializer body (async or not) access the UI? I have never seen that. Maybe I just haven't come across such code or such a pattern.

Do not provide single-execute guarantees for the asynchronous initializer

This does not work for many scenarios. AsyncLazy must be able to provide at-most-once execution. One example is side-effecting operations. Another example is avoiding the cache stampeding effect where many threads suddenly try to create the same expensive cache value.

This does not work for many scenarios. AsyncLazy must be able to provide at-most-once execution.

This is not strictly true, since the asynchronous initializer method can implement this protection itself when necessary. The big advantage to this is the asynchronous initializer can be written in a manner that correctly considers environmental requirements, including but not limited to JTF.

since the asynchronous initializer method can implement this protection itself when necessary. The big advantage to this

This is the bulk of an "async lazy" implementation. If all of that is left to the delegate to implement, I don't know what the point of the type is.

@sharwell I agree that if the 2 restrictions you prescribed were followed, a corefx AsyncLazy that wasn't JTF aware wouldn't be problematic (aside from the confusion it would introduce as the 3rd copy of AsyncLazy in some customer circles). But I agree with @stephentoub. Doing it right is very tricky. And discovering your mistake is unlikely before you've shipped your (possibly numerous) bugs. We need a type that customers will tend to pick that tends to do things right by default. Right now, the most public type folks can choose from is vs-threading's AsyncLazy<T> (I haven't seen a more publicized alternate), so we're in a good position there. Having a similarly capable type in corefx would be great. Having a less capable type in corefx would risk misleading customers down a path of pain, IMO. Especially considering .NET Core 3.0 is a release focused on native GUI apps.

Is it ever good design to have a lazy initializer body (async or not) access the UI?

@GSPP it's not typically that the value factory accesses UI elements. It's that it may access data structures that are either not thread-safe or are thread affinitized, and thus can only be accessed from the UI thread. And whether it's reading/mutating such data structures, or raising events to such data structures (e.g. the model is updating the view-model), it tends to be a real requirement fairly regularly in my experience.

This is how Roslyn is layered, but I think it's generally good practice rather than Roslyn specific design.

@tmat No doubt Roslyn's design has some good "best practices" implemented in this area, but IMO it's a very high bar, and an unforgiving one. For folks in large apps and who want to progressively migrate their synchronous behaviors to async ones, I daresay it's nearly impossible. Roslyn had the benefit of being able to start from scratch for an enormous feature and could get it "right" from day 1. And the feature has enough weight behind it to influence partner teams to adapt to its threading requirements. When any of those conditions aren't true, folks need an option that allows them to progressively and flexibly migrate them from a sync to an async world, and that's where JTF and JTF-aware primitives become invaluable.

When any of those conditions aren't true, folks need an option that allows them to progressively and flexibly migrate them from a sync to an async world, and that's where JTF and JTF-aware primitives become invaluable.

I certainly agree when we're talking about large UI apps. However, although .NET Core is now adding support for UI apps, server side code (e.g. microservices) has always been .NET Core's main domain and there is no need to worry about JTF there.

Providing two types that are used for these different types of apps would be imo reasonable. If we can make it a single type that can be parameterized then even better.

aside from the confusion it would introduce as the 3rd copy of AsyncLazy in some customer circles

I don't see what the 3rd copy is. I'm proposing a new UI unaware implementation in CoreFX and the existing UI aware implementation in VS Threading. Unless we can have a single one in CoreFX parameterized by UI awareness, that is.

If we can make it a single type that can be parameterized then even better.

Can you elaborate on what you have in mind? The vs-threading AsyncLazy<T> class can be both JTF aware and not, based on the arguments passed to its constructor. Is that what you had in mind, or were you thinking of a dependency injection approach where the implementation isn't JTF aware at all, but can be made so by injecting something?

I don't see what the 3rd copy is.

I was counting the existing implementations from Roslyn and vs-threading, and the potential corefx one.

I don't know what the best design would be for a CoreFX primitive. Probably the fewer dependencies the better.

The Roslyn implementation is internal, we would remove it once we could use a CoreFX alternative.

Or:

    public class LazyAsync<T> : Lazy<T>
    {
        public LazyAsync(Func<Task<T>> constructor) : base(() => constructor().ConfigureAwait(false).GetAwaiter().GetResult())
        {
        }
    }

@TitaniumIT Using .ConfigureAwait(false) when you're going to immediately call GetAwaiter().GetResult() is completely pointless, just FYI. ConfigureAwait is only meaningful if you'll actually await the thing.

What you're proposing isn't really async either. And doesn't enable something beyond what I could do with just new Lazy<int>(() => GetIntAsync().GetAwaiter().GetResult()).

As the maintainer of a very commonly-used AsyncLazy<T> type, I'd like to just chime in with some observations.

  1. AsyncLazy<T> is essentially just Lazy<Task<T>> plus a GetAwaiter, and that seems simple at first, but it turns out there's a lot more semantic considerations you do have to think about.
  2. The first additional consideration is how the delegate is executed.

    1. Stephen Toub's original AsyncLazy always ran delegates on a thread pool thread. There are several advantages to this approach:

      1. The delegate runs without an async "context" (i.e., it's like having ConfigureAwait(false) everywhere).
      2. The delegate is forced to be asynchronous, preventing a potential problem where the first accessor would run the delegate synchronously when it may be expecting asynchronous behavior.
      3. The delegate is always run in the same way. So it doesn't matter if the lazy is accessed from the UI or a background thread (or ... etc.); the delegate would always run on a thread pool thread.

    2. I do not allow synchronous delegates in my AsyncLazy<T>, so the second potential problem is largely avoided.

    3. However, I do run the delegate on the thread pool by default to ensure it has no context and to ensure a consistent location (the thread pool).

    4. It turns out there are some async-lazy-initialization scenarios where the caller needs the context. So at some point I added some async lazy flags (bleh) to opt-out of the thread pool wrapping.

    5. For my type, this is only for backwards-compatibility reasons. If I were writing it today, I think I would default to running directly in the current context, and the provider of the delegate can use the thread pool or remove the context inside the delegate.

    6. Regarding the final advantage of the thread pool execution (that the delegate is always run in the same way), I have not found this to be an issue in most consuming code. Async lazy values that are accessed from the UI are usually only accessed from the UI. And again, if the delegate needs to run on a thread pool thread, then the provider of the delegate can ensure that themselves.

  3. The existing LazyThreadSafetyMode values don't really translate well. If there are flags/modes for AsyncLazy<T>, they should be a completely separate type.
  4. The semantics around "exception-caching" vs "reset" are especially important in the async world (generally speaking, asynchronous code is more susceptible to exogenous exceptions).

    1. For Lazy<T>, the default LazyThreadSafetyMode.ExecutionAndPublication changes its exception-caching/reset semantics based on whether a delegate is provided, which I've found surprising. So Lazy<T> can be used in a way that does exception-caching, and it can be used in a way that allows it to reset to an uninitialized state on exceptions.

    2. The initial design for my AsyncLazy<T> always cached exceptions. I'm pretty sure I would keep this decision the same if I were to redesign it today.

  5. Resetting on exceptions is also an important use case. I've addressed this by adding another async lazy flag that will reset the AsyncLazy<T> to an uninitialized state if an exception is thrown from the delegate. All existing accessors see the exception, but the next accessor will retry the delegate.
  6. Now that my AsyncLazy<T> has the ability to reset on exceptions, I've been asked to provide an additional method to allow resetting the AsyncLazy<T> after it has completed. This would not be hard to implement, but it seems wrong to add this to a "lazy" type.

    1. What my users are actually asking for is essentially an AsyncCacheEntry<T>. I am planning on adding a true async cache that will have more cache-like default behaviors, instead of trying to force that behavior into AsyncLazy<T>.

    2. It is interesting that AsyncLazy<T> over the last several years has gradually transformed into something that is almost a single-item async cache. I'm not sure if an async cache could completely replace the need for an AsyncLazy<T> or not.
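
Two of the behaviours described in the list above, the thread pool hop with an opt-out and the reset on failure, can be sketched as below. This is not the commenter's implementation: `AsyncLazyOptions`, `FlaggedAsyncLazy<T>`, and both flag names are hypothetical, chosen only to show a flags type kept separate from `LazyThreadSafetyMode`.

```c#
using System;
using System.Threading.Tasks;

// Hypothetical options type, deliberately separate from LazyThreadSafetyMode.
[Flags]
public enum AsyncLazyOptions
{
    None = 0,
    ExecuteOnCallingThread = 1, // skip the default thread pool hop
    RetryOnFailure = 2,         // do not keep a faulted or canceled task
}

public sealed class FlaggedAsyncLazy<T>
{
    private readonly object _gate = new object();
    private readonly Func<Task<T>> _factory;
    private readonly bool _retryOnFailure;
    private Task<T> _task; // null until first access, and again after a reset

    public FlaggedAsyncLazy(Func<Task<T>> factory, AsyncLazyOptions options = AsyncLazyOptions.None)
    {
        if (factory is null) throw new ArgumentNullException(nameof(factory));
        _retryOnFailure = (options & AsyncLazyOptions.RetryOnFailure) != 0;

        // Default: run the delegate on the thread pool, so it has no context.
        _factory = (options & AsyncLazyOptions.ExecuteOnCallingThread) != 0
            ? factory
            : () => Task.Run<T>(factory);
    }

    public Task<T> GetValueAsync()
    {
        lock (_gate)
        {
            if (_task != null) return _task;

            Task<T> task;
            try
            {
                task = _factory()
                    ?? throw new InvalidOperationException("The factory returned null.");
            }
            catch (Exception ex)
            {
                task = Task.FromException<T>(ex);
            }
            _task = task;

            if (_retryOnFailure)
            {
                // Callers already holding this task see the failure;
                // the next caller starts a fresh attempt.
                task.ContinueWith(t =>
                {
                    if (t.Status == TaskStatus.RanToCompletion) return;
                    lock (_gate)
                    {
                        if (ReferenceEquals(_task, t)) _task = null;
                    }
                }, TaskContinuationOptions.ExecuteSynchronously);
            }

            return task;
        }
    }
}
```

With `ExecuteOnCallingThread`, the synchronous part of the delegate runs while the lock is held, which is one of the trade-offs a real implementation would have to weigh; the sketch makes no attempt to address the JTF concerns raised earlier in the thread.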
