Which Version of MSAL are you using ?
4.17.1
What authentication flow has the issue?
Repro
See the repro with Ramjot (offline)
See Product Backlog Item 1103111: Performance investigation of token cache in confidential client applications when singleton which has all the details and a repro to a solution
Expected behavior
Actual behavior
o(users)
@bgavrilMS @jennyf19 @henrik-me
Investigation for the token cache accessors:
The 4 access in the case of a new token are:
1) Try to look into the cache based on the token from the Web API (from its hash really).
2) it's a new token so does not find it.
-- gets the token using On behalf of flow
-- lock the cache for modification
3) read the cache the cache key is still the hash, but now we know the identity of the user as we got it with the token for the downstream API. We re-read the cache as it could have been modified (by another app) since 1)
4) write the cache with the new token
-- unlock the cache for modification
Could it be possible, in case we know a service does not share the cache, to avoid 3) re-reading the cache @bgavrilMS ?
It should possible to have a cache strategy where an app has exclusive access over the cache. I believe MSAL.py used to have this strategy by default. It
Problem is, if you scale your web app to use multiple servers, how can you guarantee one server has exclusive control over a set of users? You cannot.
I think a better approach is to use a cache with 2 layers, smth fast and small (in memory) and smth slower but with bigger capacity (Redis).
@RamjotSingh prioritizing this one.
cc: @bgavrilMS @henrik-me @trwalke @neha-bhargava
cc: @aiwangmicrosoft FYI
@bgavrilMS ADAL has that sweet spot though right? IIRC It assumes it has exclusive access to cache and that cache overwrites are fine since 2 tokens from AzureAD in the same duration are going to largely be identical. Hence my recommendation, remove the cache locking and unlocking. The steps become
Can we have similar behavior in MSAL?
@bgavrilMS @jmprieur : not really sure what we should do here. Are we only thinking about locking? or ware we also thinking about L1/L2 caching? Additionally are we considering options that prevents writing the entire cache (imo: that seems to be a better option especially for services)? Perhaps it should be an opt. in.
Also marking this as an enhancement rather than a bug. It's by current design, even though it's not a great design.
Let's explore a better design for the cache.
Don't we want to try to honor the "don't reload" flag?
Needs more detail on how we want to handle this.
I synced w/ @jmprieur on this, and I tried on the desktop app, making sure just one PCA was used, I signed in w/three users and was not able to repro it. I also tried w/the customer provided code and could not repro it there either.

so, i would say this has either been fixed or is a result of not using the cache correctly.
The issue talks about CCA and not PCA, not sure if that makes a real difference? Also only code changes we have had around cache afaik would be the latest performance improvement.
@henrik-me. I synced with Jenny and she tried with CCA too (on the customer code).
Looking more at it, what happened when I created this issue is:
The number of users was not growing (5). This behavior is the expected one (but I didn't know at that time) as the key used by MSAL for OBO is the token hash, so new incoming tokens produce new OBO tokens, and at that time we did not have the SuggestedCacheKey property in the TokenCacheNotificationArgs, nor guidance on what to do and even MIcrosoft.Identity.Web was not doing the right thing.
Now that the SuggestedCacheKey is right for each case (that is is the token hash for OBO), the size of each token cache does not grow any longer. It's one user+AT+RT. Now of course the number caches grows.
Now there might be something that we could do: help deleting the caches (one cache per incoming token in the case of OBO), when the OBO token expires (as we cannot know what's the expiry of the incoming token to the Web API). I'm not sure we have the logic for this in MSAL today, we rely on the serialization to manage the expiry, but it does not know the optimum value. Therefore if we passed-in in the TokenCacheNotificationArgs a cache expiry, this could help the serialization layers removing the caches.
proposing to close this long due issue as non-repro (any longer), and to raise a different issue to investigate what we could do to have a better synergy between MSAL expiry and the serialization. BTW this is a topic which interest our customers a lot: we had several issues in Microsoft.Identity.Web about that. See for instance Question - Recommended expiration setting #786
Given the issue originally reported was a usability issue, I don't think we should do anything immediately to address this. My larger question is though if we should invest in making a better caching api for confidential client applications. the current one has a lot of overhead and complexities.
Most helpful comment
Given the issue originally reported was a usability issue, I don't think we should do anything immediately to address this. My larger question is though if we should invest in making a better caching api for confidential client applications. the current one has a lot of overhead and complexities.