@cocowalla surfaces an alternative approach to dealing with multiple change events. The current approach in the topic+sample uses file hashes (duplicate handling) and an exponential back-off (delay handling); see the sketch after this comment. Let's consider revising the current approach or adding the alternate approach. If we add a new approach, we'll add it to the topic (and possibly the sample app) rather than cross-linking to it. Keep an eye on the issue, as engineering may take it on in a future release.
Let's look at this early next year. We need to get past 2.2 RTM and a few other high priority issues.
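For readers without the topic open, the current approach looks roughly like the following. This is a sketch in the spirit of the sample, not its exact code: hash the file's contents on each change event, retry with an exponential back-off when the file is locked by the writer, and only treat the event as a real change when the hash differs from the last one seen.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading;

public static class Utilities
{
    // Hashes the file's contents so duplicate change events for
    // identical content can be ignored. Retries with an exponential
    // back-off because the file may still be locked by the writer.
    public static byte[] ComputeHash(string filePath)
    {
        var runCount = 1;

        while (runCount < 4)
        {
            try
            {
                if (!File.Exists(filePath))
                {
                    throw new FileNotFoundException(filePath);
                }

                using (var fs = File.OpenRead(filePath))
                using (var sha1 = SHA1.Create())
                {
                    return sha1.ComputeHash(fs);
                }
            }
            catch (IOException)
            {
                if (runCount == 3)
                {
                    throw;
                }

                // Back off: wait 2 seconds, then 4, before retrying.
                Thread.Sleep(TimeSpan.FromSeconds(Math.Pow(2, runCount)));
                runCount++;
            }
        }

        return new byte[20]; // SHA-1 digest length.
    }
}
```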
This is now tracked in #10032
ok on the close, but fyi tho ... #10032 is a different issue and unrelated to this one.
@cocowalla, I'll monitor the engineering topic where the discussion took place. If the engineers make a move, then I'll make a move on the sample. If they call for changes to the topic+sample directly, of course, I'll react to that over here. For now tho, we'll go with what we have (checking file hashes with an exponential backoff).
> ok on the close, but fyi tho ... #10032 is a different issue and unrelated to this one.
True, but we don't need two issues for low traffic docs.
@Rick-Anderson I'm not sure if you meant to be so disengaging, but I'm reading this as a rather contemptuous response, especially given you closed this issue saying it was tracked in #10032 - which it isn't.
You then say:
> we don't need two issues for low traffic docs
I don't understand this - had a single issue been opened for 2 completely different things, I imagine you would (rightly!) not be happy with that. Different issues should be... well, different GitHub Issues.
@guardrex nobody is likely to 'make a move' on issues that are closed :roll_eyes:
@guardrex @Rick-Anderson given this has essentially been closed as a dupe, and it's not, can we look at re-opening this please?
@cocowalla I wasn't clear. When I wrote, "make a move," I meant that I would _open a new issue._
@cocowalla We're not dropping the issue. I just added it to the top of #10032
Thanks for contacting us.
We don’t have the resources to invest in this area, so we are closing the issue. Should your request generate enough 👍 responses, we’ll reconsider.
Both this and #10032 are closed - but what happened specifically with this issue in the end - was there any progress?
@dazinator Nothing on the docs side. We're still using the same approaches: file hashes (duplicate handling) and an exponential back-off (delay handling). There was discussion at https://github.com/dotnet/aspnetcore/issues/2542, but it was closed as won't-fix. My approach was deemed "reasonable" (that's actually pretty good for my RexHacks:tm: around here :smile:). There's no additional info from engineering AFAIK, but you can search their issues to see if someone asked elsewhere. I have no idea at this point what you might find on SO or in chat ... it's been a while since the scenarios were addressed, and I haven't looked again.
@guardrex ok thanks. The reason I ask is that I'm still seeing OptionsMonitor.OnChange() fire twice in quick succession when saving a change to the appsettings.json file at runtime. I wonder if engineering have any tests in place for this scenario - my understanding is that with the new file hash checking logic in place, it should only fire once?
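For anyone wanting to reproduce the symptom, a minimal sketch (MyOptions and ChangeLogger are hypothetical names, not from the topic or sample):

```csharp
using System;
using Microsoft.Extensions.Options;

public class MyOptions
{
    public string Message { get; set; }
}

public class ChangeLogger
{
    public ChangeLogger(IOptionsMonitor<MyOptions> monitor)
    {
        // With reloadOnChange enabled for appsettings.json, a single
        // save in many editors raises more than one file system event,
        // so this callback can fire twice in quick succession.
        monitor.OnChange(options =>
            Console.WriteLine($"Changed at {DateTime.Now:O}: {options.Message}"));
    }
}
```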
> the new file hash checking logic
If you're referring to something new that they did in the framework, perhaps so. You'd have to ask them about that. I'm not aware of any changes on the framework side that pertain to the multiple firing. If I learn more about that, then this topic+sample can receive updates that should shed its workarounds.
If you're referring to what's in the topic ... the file hashes and back-off that I put in there, no, it's a workaround for multiple firings. Multiple firing was not _directly_ addressed by my hacks.
@guardrex
I see. So the workaround is this one yeah?
I think, from glancing at it, there might be an issue with that hack! If the InvokeChanged() method is fired in parallel, your if check can be entered by multiple threads at once (as no locking is used), so your logic could run more than once:
```csharp
if (!_appsettingsHash.SequenceEqual(appsettingsHash) ||
    !_appsettingsEnvHash.SequenceEqual(appsettingsEnvHash))
{
    _appsettingsHash = appsettingsHash;
    _appsettingsEnvHash = appsettingsEnvHash;
    WriteConsole("Configuration changed (Simple Startup Change Token)");
}
```
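For illustration, one way the check could be serialized. This is a sketch rather than the sample's code: the _hashLock field is an addition, and the other names (ComputeHash, _env, WriteConsole, the hash fields) are assumed to match the surrounding sample.

```csharp
private static readonly object _hashLock = new object();

private void InvokeChanged()
{
    var appsettingsHash = ComputeHash("appsettings.json");
    var appsettingsEnvHash = ComputeHash($"appsettings.{_env.EnvironmentName}.json");

    // Serialize the compare-and-update so that two concurrent change
    // events can't both observe the old hashes and both fire.
    lock (_hashLock)
    {
        if (!_appsettingsHash.SequenceEqual(appsettingsHash) ||
            !_appsettingsEnvHash.SequenceEqual(appsettingsEnvHash))
        {
            _appsettingsHash = appsettingsHash;
            _appsettingsEnvHash = appsettingsEnvHash;
            WriteConsole("Configuration changed (Simple Startup Change Token)");
        }
    }
}
```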
That aside (I appreciate this is just a hack to give the developer an idea - but that leads into my next point), may I formally lodge an expression of my discontent over the way this issue has been chucked over the fence from engineering. Without any tests in place, and only a link to a suggested hack (sorry - just using your terminology here, it's not a dig at your work!), it appears almost as if this issue has not really been taken seriously by engineering. Let me qualify that: it's a known issue that they are happy to leave in the framework, to be worked around by everyone who hits it - and people will hit it. When they do, even though official workarounds with test coverage on the engineering side could be provided, it's left instead to your hack, which... may have issues, and so the developer has to write their own implementation in the spirit of your solution, rather than adopting something that is already robust for solving this issue.
I also think that this example of debouncing change tokens by re-reading the file and computing a hash may be OK in some situations, but likely not all. It has undesirable characteristics, especially when scaling. Could it be worth noting these issues in the docs?
In a web farm scenario (say, scaling on Azure App Service), if you're monitoring a central config file for changes, then when that file changes, all nodes in the cluster (which could be very large) will attempt to re-open the changed file at the same time and read it to compute a new hash. This places a sudden load on the file system (and across the network if using network-based storage) that could be avoided with other approaches.
If the file is quite large, this magnifies the problem even more, and I'm also guessing that it will take longer to compute the hash. Perhaps too long? https://stackoverflow.com/questions/3837737/improve-performance-of-sha-1-computehash
There is an alternative method that avoids these characteristics, i.e. the configurable delay method as described here (sketched at the end of this comment), but it has its own negative:
However, I can see that both approaches could be used as viable workarounds for this source of errors. Would engineering be prepared to add some tests for a few of these workarounds, and have the docs link to those tests instead of your sample code, to prove that these workarounds are "supported" and work?
Other than that, it's essentially relying on the community to release a NuGet package to offer up the patches for this hole.
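For reference, a minimal sketch of the configurable-delay (debounce) approach mentioned above. DebouncedChangeHandler is a hypothetical name, not a framework API:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class DebouncedChangeHandler : IDisposable
{
    private readonly TimeSpan _delay;
    private readonly Action _onChange;
    private CancellationTokenSource _cts;

    public DebouncedChangeHandler(TimeSpan delay, Action onChange)
    {
        _delay = delay;
        _onChange = onChange;
    }

    // Call from the change-token callback. Each new event cancels the
    // pending timer, so a burst of events produces a single invocation
    // after the quiet period - no file reads or hashing required.
    public void Signal()
    {
        var cts = new CancellationTokenSource();
        var previous = Interlocked.Exchange(ref _cts, cts);
        previous?.Cancel();

        Task.Delay(_delay, cts.Token).ContinueWith(t =>
        {
            if (!t.IsCanceled)
            {
                _onChange();
            }
        }, TaskScheduler.Default);
    }

    public void Dispose() => _cts?.Cancel();
}
```

The trade-off is the one noted above: every change is surfaced only after the delay elapses, and the delay itself is a guess that has to cover the slowest writer.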
It's a hack/workaround, and I agree that this one is best in the simplest cases. I felt that I had to do something lol. The multiple firings were a pain point. When we reviewed this, the concerns that you've raised (and other approaches) weren't discussed. They just said that it's reasonable, and we moved on. 🏃😅
If you open a new engineering issue (https://github.com/dotnet/aspnetcore/issues) to discuss further with them, add "cc: @guardrex" to the bottom of your opening comment so that I can follow the discussion.

@guardrex done: https://github.com/dotnet/aspnetcore/issues/19401