Quarkus: provide NoCache for PanacheEntity queries

Created on 11 Dec 2019 · 24 Comments · Source: quarkusio/quarkus

Description
Using PanacheEntity to retrieve entities from the database requires at least the JPA first-level cache. We can:

circumvent the first-level cache in JPA by using @NamedQuery (https://github.com/quarkusio/quarkus/issues/3483)
use @NamedQuery with fetchGraph to avoid proxies and JPA LazyInitializationExceptions
improve on that by using setHint with the org.hibernate.readOnly property or typedQuery.setHint(QueryHints.HINT_READONLY, true)
go further by setting javax.persistence.cache.retrieveMode or javax.persistence.cache.storeMode
possibly apply other improvements and tricks

All those improvements and tricks aim at entities with the following characteristics:

do not require an active Hibernate Session
are detached by default
are always up to date (refreshed objects) on retrieval if data has changed in the database
are read-only
stay only in Java memory space (no proxy)
..

It is something like a no-level-cache or zero-level-cache that is usable for a vast amount of data. Imagine millions of entities opening a new _Hibernate Session_ only to synchronize related objects with the database at once; something like a no-level-cache could improve that.

Implementation Ideas
Maybe the no-level-cache could be added to each panache query method, for example:

public static List<Person> findAlive(){
     // returns the entities as `zero-level-cache` with the performance benefits described above
     return list("status", Status.Alive).noCache();
}

To attach the (detached, read-only) entities from a no-level-cache to a JPA 1st-level-cache, PanacheEntity could introduce an additional method attach() on query methods:

public static List<Person> attachAlivePersons(){
     // will attach the detached entities
     return findAlive().attach();
}

area/panache kind/enhancement triage/wontfix

nimo23

All 24 comments

cc @FroMage

geoand on 12 Dec 2019

I deflect all caching topics to @Sanne ;)

FroMage on 12 Dec 2019

😄1

ok, I think this is a reasonable request, but this is not something for Hibernate ORM to solve as we already offer various APIs to do this.

@FroMage I believe the request is to use unmanaged entities, this would likely need a different API in Panache.

Especially if relations can't use proxies, then the load operations will need to be able to define a load graph; that's a big change.

A simpler solution for users could be to inject the Session, set it to read-only and disable automatic flush. That will still have proxies but none of the associated performance costs.
It would not guarantee "always freshly loaded" data, but when users control the Session explicitly they can close it or clear it as necessary.

Sanne on 12 Dec 2019

@Sanne @FroMage we are already discussing _read only_ transactions that would do the same optimization as part of https://github.com/quarkusio/quarkus/issues/2997

So we can add a readOnly attribute to the @TransactionConfiguration annotation that will set the underlying Hibernate session to readOnly and the flush mode to FlushMode.MANUAL.

In our current implementation, this annotation is only used if the user also annotates their method with @Transactional, and the interceptor that manages it is on the Narayana side, so it is not capable of using any Hibernate API.

One solution could be to add an interceptor on the Hibernate extension side to deal with the readOnly attribute.
Another would be to use JPA hints.

loicmathieu on 3 Jan 2020

@loicmathieu good points, I like where your proposal is going, but these are good optimisations for a different thing than what is being asked here, if I understand correctly?

This issue seems to be around asking for a simpler, "stateless" access pattern which is more similar to ROR. I believe it's a reasonable request, but the pattern is so very different than when using Hibernate / JPA that I don't think the two "ways" should be enabled from the same API or at least not beyond the Stateless Session API we already provide.
@maxandersen maybe would like to comment?

In short, I believe that while such an approach has merits, if someone wants to introduce it, it should not be based on Hibernate ORM. It's not a good fit: for example, you just can't materialize complex graphs without implicitly introducing some form of "first level cache". Personally I prefer the Hibernate ORM / JPA approach, but we understand it's not a silver bullet, and there are good reasons to allow a variety of approaches.

In @nimo23's proposal there's a notion of re-attaching detached entities; I'd have to think more about that, but my instinct tells me it would be very tricky to allow inter-mixing of some APIs that behave in detached mode and others that don't. Especially if they are "managing" (or not!) the same types, that risks introducing both a lot of confusion and some tricky semantics that would need to be defined.
N.B.: the Hibernate Stateless Session doesn't allow loading relations; however, it allows the user to _explicitly_ customize such things via various other APIs (projections, transformations). I believe it's good that the user is forced to explicitly spell out what to do, how to load a relation, etc.

Incidentally as I mentioned, to support "no cache mode" you'd be limited to very simple objects with no relations: you can't re-attach multiple representations (multiple instances) of the same data, and you'd have multiple of those if you have any relation - unless you re-introduce some form of state to perform de-duplication on load, or just prevent all forms of -to-many relations from being loaded altogether.
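
The de-duplication point can be illustrated with a minimal plain-Java sketch. Everything in it is hypothetical (`Author`, `Book`, `JoinedRow`, `GraphLoadingSketch`); none of these are Hibernate or Panache types, and the flat "book joined with author" rows are an assumption made only for illustration. It shows that loading a relation without any shared state yields several instances for the same database row, while sharing one instance per primary key requires a map that behaves like a small first-level cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration; not Hibernate or Panache classes.
final class Author {
    final long id;
    final String name;
    Author(long id, String name) { this.id = id; this.name = name; }
}

final class Book {
    final String title;
    final Author author;
    Book(String title, Author author) { this.title = title; this.author = author; }
}

// One row of an assumed "book JOIN author" result.
final class JoinedRow {
    final String title;
    final long authorId;
    final String authorName;
    JoinedRow(String title, long authorId, String authorName) {
        this.title = title;
        this.authorId = authorId;
        this.authorName = authorName;
    }
}

final class GraphLoadingSketch {

    // No shared state: every row creates a fresh Author instance,
    // so one author row ends up represented by several objects.
    static List<Book> loadWithoutIdentityMap(List<JoinedRow> rows) {
        List<Book> books = new ArrayList<>();
        for (JoinedRow row : rows) {
            books.add(new Book(row.title, new Author(row.authorId, row.authorName)));
        }
        return books;
    }

    // De-duplication on load: the map keyed by primary key plays the role
    // of a tiny, load-scoped "first level cache".
    static List<Book> loadWithIdentityMap(List<JoinedRow> rows) {
        Map<Long, Author> authorsById = new HashMap<>();
        List<Book> books = new ArrayList<>();
        for (JoinedRow row : rows) {
            Author author = authorsById.computeIfAbsent(
                    row.authorId, id -> new Author(id, row.authorName));
            books.add(new Book(row.title, author));
        }
        return books;
    }
}
```

With two books by the same author, the first method returns two distinct `Author` objects and the second returns one shared object, which is the state a truly "no cache" mode would have to give up.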

Would be best to think such things through - personally it's a no-go from me to mix those approaches on the same model / same API, so I'm inclined to close this one but we can discuss a bit further first :)

@nimo23 WDYT? If you can think of how to alleviate my concerns, or perhaps think of a clear-cut different API than the current stateful one, happy to consider that.

Sanne on 3 Jan 2020

This issue seems to be around asking for a simpler, "stateless" access pattern

Yes, the "zero level cache" is more than just read-only queries. It is not exactly the counterpart of the ORM's "first level cache" (which is enabled by default and cannot normally be disabled) but rather a performant way (in memory and speed) to run queries from PanacheEntity by bypassing the most ORM-heavy things (like those described at the beginning).

If you can think of how to alleviate my concerns, or perhaps think of a clear-cut different API than the current stateful one, happy to consider that.

@Sanne At the moment I don't know. I thought the switch from cached to zero-cached query results could be provided within PanacheEntity. I must think about it.

nimo23 on 3 Jan 2020

@Sanne I will open an issue with my design proposal for "read-only transaction".

@Sanne @nimo23 there is a proposal to create a MyBatis extension: #1958. MyBatis is lighter than Hibernate as it just maps a request to a Pojo so maybe it is more suitable for what is asked here ...

loicmathieu on 6 Jan 2020

this sounds like a job for hibernate stateless session... that is more or less one-for-one the same as how mybatis works (afaik).

maxandersen on 6 Jan 2020

👍1

@nimo23 what is your expectation of what happens when you navigate relationships that were not fetched in this "no-cache-world"? Do you expect them to attempt a load or to throw an error?

maxandersen on 6 Jan 2020

this sounds like a job for hibernate stateless session

Thanks. Yes, I did not know about the existence of a StatelessSession in Hibernate. This would match most of the requirements of the "zero-level-cache". I like the approach to use StatelessSession instead of myBatis because with this we can stay in jpa/hibernate world.

when you navigate relationships that were not fetched in this "no-cache-world"

Good question, with StatelessSession collections are ignored. Maybe, PanacheEntity can provide a comfortable way to retrieve entities by a StatelessSession and collections could be prefetched by default (like @NamedEntityGraph).

For example:

public static List<Person> findAlive(){
     // returns the entities by stateless session
     return list("status", Status.Alive).noCache();
}

The user can already execute a @NamedEntityGraph and get something close to a StatelessSession, but providing the StatelessSession pattern within PanacheEntity would be a good alternative.

@Sanne do you think that StatelessSession access can be provided by PanacheEntity? Currently, queries from PanacheEntity cannot be run through a StatelessSession.

nimo23 on 6 Jan 2020

do you think that StatelessSession access can be provided by PanacheEntity?

I guess we could introduce a simple method like ".loadStateless( id )" ?
We could consider having this helper only on simple types which don't do relations; or have it on all types but throw an exception when a relation is accessed.

Regarding the return list("status", Status.Alive).noCache(); : I'm not sold on calling this "noCache" as it might be unclear - in many situations the Session isn't experienced as a cache but rather as the "operation's context". The name "cache" often refers to the 2nd level cache, as it delegates to an actual caching library.

But the idea of allowing some modifiers on the list seems doable; to be implemented though we should be able to apply the "no session" option before the list operation is performed, so I suspect the API would need to look different. Not least because list("status", Status.Alive) already returns a List, so we'd need a different intermediate type, such as PanacheQuery.

Something like:

public static List<Person> findAlive(){
    // returns the entities by stateless session
    return find("status", Status.Alive).stateless().list();
}

An issue I see is that it then becomes tempting to duplicate all existing helpers to provide a stateless version of each, such as findById -> findByIdStateless, and this risks becoming a bit messy.

Perhaps best to scope first?

Person.stateless().findById()
Person.stateless().list()
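
Why the modifier has to sit on an intermediate query object rather than on the returned List can be sketched in plain Java. This is a minimal illustration under stated assumptions: `SketchQuery` and `SessionKind` are hypothetical names, not the real PanacheQuery API, and an in-memory list stands in for the database table. The point is only that `stateless()` is recorded before the terminal `list()` call, which is where an implementation could choose which kind of session to open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical marker for how a query was executed.
enum SessionKind { STATEFUL, STATELESS }

// Hypothetical stand-in for a query object; NOT the real PanacheQuery API.
final class SketchQuery<T> {
    private final List<T> table;        // stands in for the database table
    private final Predicate<T> filter;  // stands in for the query string and params
    private SessionKind sessionKind = SessionKind.STATEFUL;
    private SessionKind executedWith;   // null until list() has run

    SketchQuery(List<T> table, Predicate<T> filter) {
        this.table = table;
        this.filter = filter;
    }

    // Modifier: only records the choice and returns this, so it can be
    // chained before the query is executed.
    SketchQuery<T> stateless() {
        this.sessionKind = SessionKind.STATELESS;
        return this;
    }

    // Terminal operation: this is the only place where a real implementation
    // could decide between a regular Session and a StatelessSession.
    List<T> list() {
        executedWith = sessionKind;
        List<T> result = new ArrayList<>();
        for (T row : table) {
            if (filter.test(row)) {
                result.add(row);
            }
        }
        return result;
    }

    SessionKind executedWith() {
        return executedWith;
    }
}
```

A call such as `new SketchQuery<String>(rows, r -> r.equals("alive")).stateless().list()` mirrors the shape of `find("status", Status.Alive).stateless().list()`; once `list()` has returned a plain List, there is nothing left to apply the option to.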

Sanne on 6 Jan 2020

👍1

Yes, Entity.stateless().list(). With such a stateless pattern, there is no need for additional ORM alternatives such as myBatis, Jooq, reQuery, etc. when it comes to performance and overhead. The only problem is that StatelessSession cannot access relationships. This makes it less usable compared to a custom @NamedEntityGraph or @NamedQuery.

Thus, I don't know if this issue is worth implementing.

Once https://github.com/quarkusio/quarkus/issues/3483 supports @NamedEntityGraph or @NamedQuery within find(query), users can also get the benefits of a stateless pattern within PanacheEntity. I am unsure, but I guess the result of such queries has the same effect as querying with a StatelessSession. If you think this issue is not worth implementing, I will close it. I like the approach of defining stateless queries with Entity.stateless().list() or Entity.list(String query, boolean stateless), but I don't know whether this is a better alternative than using @NamedEntityGraph or @NamedQuery directly. WDYT?

nimo23 on 6 Jan 2020

The only problem is that StatelessSession cannot access relationships. This makes it less usable compared to a custom @NamedEntityGraph or @NamedQuery.

A stateless session can query relationships, so I'm pretty sure @NamedEntityGraph and @NamedQuery are feasible, as long as the relationships can be fetched in one query/session. It is just not possible to navigate/query the object graph afterwards, which is why I asked what your expectations were when accessing relationships: if you don't want proxies at all, the answer will basically be to return null or empty collections; a middle ground is to have proxies but have them fail/error when accessed.

maxandersen on 7 Jan 2020

if you don't want proxies at all, the answer will basically be to return null or empty collections;

If the user defines a @NamedEntityGraph and omits the relationship x, then accessing x afterwards should lead to a failure/error. That's the right approach. Returning null or an empty collection implies that no relationship exists, which is misleading in such cases.
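
The "fail on access" behaviour can be sketched in plain Java as follows. `FetchedRelation` is a hypothetical holder invented for this illustration; it is not a Hibernate proxy and not part of Panache. It only shows how an unfetched relationship can be told apart from an empty one: reading it throws instead of returning null or an empty collection.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for a relationship; not a Hibernate or Panache type.
final class FetchedRelation<T> {
    private final String name;
    private final T value;
    private final boolean fetched;

    private FetchedRelation(String name, T value, boolean fetched) {
        this.name = name;
        this.value = value;
        this.fetched = fetched;
    }

    // The relationship was part of the fetch graph and carries its data.
    static <T> FetchedRelation<T> fetched(String name, T value) {
        return new FetchedRelation<>(name, value, true);
    }

    // The relationship was omitted from the fetch graph.
    static <T> FetchedRelation<T> notFetched(String name) {
        return new FetchedRelation<>(name, null, false);
    }

    boolean isFetched() {
        return fetched;
    }

    // Fails loudly for an unfetched relationship, so "not loaded" can never
    // be mistaken for "no related rows exist".
    T get() {
        if (!fetched) {
            throw new IllegalStateException(
                    "Relationship '" + name + "' was not fetched by the query");
        }
        return value;
    }

    // Small usage example.
    public static void main(String[] args) {
        FetchedRelation<List<String>> pets = fetched("pets", Arrays.asList("Rex"));
        System.out.println(pets.get());
        FetchedRelation<List<String>> cars = notFetched("cars");
        try {
            cars.get();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```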

@Sanne I am unsure if this is worth implementing, because with https://github.com/quarkusio/quarkus/issues/3483 I also get such a stateless pattern in PanacheEntity. Are there any benefits in having Entity.stateless().list() compared to Entity.list(myNamedQuery)? WDYT?

nimo23 on 7 Jan 2020

Are there any benefits in having Entity.stateless().list() compared to Entity.list(myNamedQuery)?

A _named_ query isn't loaded via a stateless session, so any loaded entities would still be _managed_. Maybe I misunderstood but I thought you wanted the super-light experience from raw records to bypass the Session's statefulness?

Sanne on 7 Jan 2020

A named query isn't loaded via a stateless session, so any loaded entities would still be managed.

@Sanne yes, then it makes sense to have something like Entity.stateless().list() to have the super-light experience :)

nimo23 on 7 Jan 2020

👍1

I think an overloaded Entity.list() method with an optional isStateless or isStatelessQuery parameter would be better:

Entity.list(boolean isStateless) better than Entity.stateless().list() ?

nimo23 on 7 Jan 2020

That might work, but I'm not sure how far it scales if we're to add several more flags and hints.

@FroMage WDYT ? you're the master of usability and it's your API :)

BTW to be clear I won't be able to work on implementing this any time soon unfortunately, I have a lot on my plate. Any volunteer?

Sanne on 7 Jan 2020

Well, it really depends on how often this feature will be useful. listAll(Sort?) and list(String, Sort?, Params?) already have optional params and overloads.

If this is not so common I'd stick it on the query object like we do for lock type and hints.

FroMage on 8 Jan 2020

👍2

Well, it really depends on how often this feature will be useful.

The use of Hibernate StatelessSession is a good alternative to things like myBatis, Jooq, reQuery, etc when it comes to performance.

nimo23 on 12 Jan 2020

@nimo23 support for @NamedQuery is on its way via #8071 and _read only transactions_ are being investigated via #7455

Do these two issues cover your needs?
Can we close this one?

loicmathieu on 2 Apr 2020

Hi @loicmathieu thank you. To sum up,

PanacheEntity is able to:

have read only transactions
have @NamedQuery

This issue also asks about:

no proxy (but I think such a "no-proxy" feature can be ignored, or are there any big performance benefits in having no entity proxies?)
detached entities with StatelessSession (with https://github.com/quarkusio/quarkus/issues/6095#issuecomment-571289001)

It would be good if Panache could query entities with a StatelessSession (loading lots of data or doing bulk operations with a StatelessSession is better in terms of performance and memory consumption than using a regular session):

Having something like Entity.list(boolean isStateless) or what @FroMage suggested:

listAll(Sort?) and list(String, Sort?, Params?) already have optional params and overloads.

If this is not so common I'd stick it on the query object like we do for lock type and hints.

nimo23 on 2 Apr 2020

If this is not so common I'd stick it on the query object like we do for lock type and hints.

Yes, I'll stick it inside PanacheQuery so it will be something like

PanacheEntity.find(query, params).stateless().list()

Can you close this issue and open a new one that asks for support of stateless sessions inside Panache?
I don't know if it's an easy task, but at least we will track it and, by closing this one, avoid overly broad issues ...

loicmathieu on 2 Apr 2020

👍1

@loicmathieu see https://github.com/quarkusio/quarkus/issues/8348

nimo23 on 2 Apr 2020
