_Original issue created by richarddowsett on 2011-10-27 at 12:14 PM_
Pre-populating a cache is a common use case for loading static data from a database. This would also be specifically useful where lazy loading is not ideal. Please see http://stackoverflow.com/questions/7915016/pre-load-values-for-a-guava-cache for a specific case where this would be useful.
It would be ideal to create an interface that can pre-load all static data from the database and add it to the Cache at start-up, such as:
CacheBuilder.newBuilder().populate(new CachePopulator<String, Element>(){
@Override
public Map<String, Element> populate() throws Exception {
return getAllElements();
}
}).build(new CacheLoader<String, Element>(){
@Override
public Element load(String key) throws Exception {
return getElement(key);
}
});
_Original comment posted by [email protected] on 2011-10-27 at 09:17 PM_
When we add CacheLoader.loadAll and Cache.getAll in 11.0, that would seem like a cleaner way to do this. First create the empty cache, then just call getAll to fill it up.
_Original comment posted by raymond.rishty on 2011-10-28 at 11:47 AM_
I suppose the objection to that would be that if I have, say, 100 keys, I would prefer to get the 100 corresponding values via a single DAO call that via 100 DAO calls.
_Original comment posted by richarddowsett on 2011-10-28 at 12:07 PM_
I'm assuming this would be the interface for the new feature (or something similar): Cache.getAll(List<K> keys).
As Raymond has pointed out, this would result in keys.size() calls to the database where my solution would allow you to get all values in a single DAO call. This would surely be a better solution where the cache needs to be pre-populated with a large number of elements.
_Original comment posted by raymond.rishty on 2011-10-28 at 01:26 PM_
CacheLoader has loadAll
public Map<K, V> loadAll(Iterable extends K> keys) throws Exception
While Cache has getAll
ImmutableMap<K, V> getAll(Iterable extends K> keys)
According to the Javadoc for CacheLoader, that default implementation of loadAll delegates to individual calls to get. However, the method is not final, and as the Javadoc offers, you are free to override with a bulk lookup in your own CacheLoader.
You might familiarize yourself with the interface: https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheLoader.java
_Original comment posted by richarddowsett on 2011-10-28 at 02:57 PM_
This interface would allow the bulk loading of keys so it would seem to satisfy the requirement.
My only comment would be that if the full set of keys is not known at runtime, you can still override the method and bulk load all data (with one DAO call) but you are breaking the interface because you are not using the parameter keys to load the data. But it would still work the way I expect it to, I'm just trying to get a precise interface out of an API used by many other people :)
_Original comment posted by [email protected] on 2011-10-29 at 04:41 PM_
It's not a _perfect_ fit for this request, but I think that it will do. Note that loadAll() is permitted to return extra entries even whose keys were not asked for, so you could do some hack like just asking for a "magic key" if you don't know what the keys to load are. That said, it's probably cleaner to just create the cache, do your initial population load externally and use asMap().putAll().
_Original comment posted by [email protected] on 2011-11-15 at 10:06 PM_
If the keys are known, loadAll may be the answer; if they aren't, you can still just get the data externally to the cache and put it all in. I don't think there's anything we need to do here, but if I'm missing something big let me know.
Status: WontFix
Labels: Type-Enhancement
_Original comment posted by [email protected] on 2012-01-04 at 11:02 AM_
Well, Kevin, of course both of your use cases meet the functional requirements BUT neither is clean. With the asMap() solution you make the _client_ control the cache. Therefore the cache isn't self contained and responsibility bleeds out to outer classes.
And to answer your question. What you're missing is that the solution to this common use case (of populating the whole cache in one go) is still not present in guava-cache.
And now, my own problem:
In my work place I have an enum-like dictionaries in remote systems I need to cache at startup. That's because fine-grained web service operations return their IDs instead of names, symbols or descriptions. Of course I don't have any control over the services provided to me; and, unfortunately, these provide only one operation to get the whole dictionary at once. So what am I supposed to do?
_Original comment posted by wasserman.louis on 2012-01-04 at 03:58 PM_
Why does the putAll force you to bleed responsibility to outer classes? Do the putAll immediately upon cache creation, right there, so no other classes have to worry about it.
_Original comment posted by [email protected] on 2012-01-04 at 04:16 PM_
May I ask for the example code? Keeping in mind only CacheLoader instance holds a reference to RemoteRepository.
_Original comment posted by wasserman.louis on 2012-01-04 at 04:47 PM_
Rearrange that a bit, is the point. I'm suggesting something like
Cache<Key, Value> constructCache(RemoteRepository repo) {
Cache<Key, Value> cache = CacheBuilder.newBuilder()
.......
.build(new CacheLoader<Key, Value>(){
....
});
cache.asMap().putAll(getInitialEntriesFromRepo(repo));
return cache;
}
_Original comment posted by [email protected] on 2012-01-04 at 08:35 PM_
But that's exactly the case - the Cache LOADER isn't loading the entries now. Outer class does, doesn't it? And it puts the entires into the cache. From outside, via public interface. That's this little minor detail which makes guava cache shine when compared to the standard bunch (ehcache, treecache, etc).
And what if the remote repo changes? I mean, what if remote system adds an entry after the cache is populated? Then the call to get(key) should retrive this new value (or simply refresh the whole set again). Now, you can't do that because you have initialized it from outside, so you need to duplicate the loading parts inside the cache. Ugly.
I know it's hard to implement this with current CacheLoader interface and Cache loading internals. But maybe it could be split into two interfaces? And then people could decide which one to use? Like CacheLoader for single items and FullCacheLoader (or any other fitting name) for the whole set? Just thinking aloud. I'm sure you, guys, are capable of coming up with something even better. Just trying to help.
Anyways, this is a great library. Thank you!
_Original comment posted by [email protected] on 2012-04-06 at 03:10 PM_
Let's keep this open for now. It's also come up at:
http://stackoverflow.com/questions/7915016/pre-load-values-for-a-guava-cache
and
https://groups.google.com/d/msg/guava-discuss/-DHg4Hi4sQE/HmlhQd8nkowJ
Status: Acknowledged
Labels: Package-Cache
_Original comment posted by wasserman.louis on 2012-04-25 at 06:31 PM_
I guess my thinking is as follows:
I sort of assume that in most cases, the cache loader is defined inline, as an anonymous class. In this case, you really don't lose anything by moving the initial population code outside the loader, because it's exactly as exposed as it always was.
But let's assume that's not the case -- that the cache loader is a top-level class. In most cases, a cache loader is used with only one cache configuration. So add a (possibly static) method to your CacheLoader, public LoadingCache<K, V> buildCache(). If you add the population code in that method, then you're set: it looks like
public class MyLoader extends CacheLoader<K, V> {
public V load(K key) {
...
}
private static void populate(LoadingCache<K, V> cache) {
...
}
public static LoadingCache<K, V> buildCache() {
LoadingCache<K, V> cache = CacheBuilder.newBuilder()
...configuration options...
.build(new MyLoader());
populate(cache); // uses putAll
return cache;
}
}
If you use your cache loader with multiple different configurations, then you can either make your method take a pre-configured CacheLoader, or you can even pass in a CacheBuilderSpec (with release 12) that takes care of all the other details.
All of this manages to keep the encapsulation we want without messing with an already excellent abstraction.
_Original comment posted by [email protected] on 2012-05-30 at 07:41 PM_
_(No comment entered for this change.)_
Labels: -Type-Enhancement, Type-Enhancement-Temp
_Original comment posted by [email protected] on 2012-05-30 at 07:45 PM_
_(No comment entered for this change.)_
Labels: -Type-Enhancement-Temp, Type-Enhancement
_Original comment posted by [email protected] on 2012-06-22 at 06:16 PM_
_(No comment entered for this change.)_
Status: Research
_Original comment posted by wasserman.louis on 2012-08-18 at 04:22 PM_
_Issue #1099 has been merged into this issue._
_Original comment posted by [email protected] on 2012-11-09 at 10:15 PM_
It's very unclear what if anything needs to be done here.
Martel, Have you tried the idea outlined on the stack overflow post linked in #13?
If you have a strong need that really isn't being addressed by anything we offer now, please make a new attempt to describe the whole problem (not intended solution) in clear terms. (ie, file a new issue)
Status: WontFix
_Original comment posted by [email protected] on 2012-11-09 at 11:42 PM_
First of all, Kevin, thanks. Secondly, if you mean the "feature request" from actual Stack Overflow question then it's exactly what I had in mind.
Actually I haven't looked into this issue since I wrote my last message so I'm not sure where we stand now. I simply wanted to encapsulate whole loading logic inside the CacheLoader interface which just seems more in line with what is already there. I don't have any strong feelings for or against it, though.
I can make another issue but I don't see the point if you're not sharing my point of view. It's still a decent API even if I can't pre-populate the cache.
Michał
but what if I need the data be populated repeatedly on eviction time is over? PutAll doesn't handle that. Isn't it?
I am currently in the same situation as above.
I don't know all of the possible keys at runtime, I have a DB table that I do lookups on frequently (>10k requests per minute) the entire DB table could be stored in memory and refreshed on a set interval.
For my use case I had started looking at Suppliers#memoizeWithExpiration however since the cache is so frequently used it would be preferable to have a non blocking refresh at a given interval.
I have raised https://github.com/google/guava/pull/3777 for discussion on an implementation using suppliers but I still wish there was a way to do this in a LoadingCache.
I also have demands for this and like to add another point why I think none of the workarounds really satisfies me:
So what would be need is some kind of "preloadKeys(T ...)" that behaves the following way:
1) if it is passed an empty array, and the cache is empty while accessed by a get operation, CacheLoader.loadAll(Iterable<? extends K>) is called with a null argument, or if that is not desireable, a special (empty) Iterable instance that could be checked e.g. LOAD_ALL = Collections.EMPTY_LIST
2) if it is passes a non empty array and any of the kexs is accessed, CacheLoader.loadAll(Iterable<? extends K>) is called instead of simple load
Most helpful comment
but what if I need the data be populated repeatedly on eviction time is over? PutAll doesn't handle that. Isn't it?