Quarkus: spring-data-jpa: repository calls merge() when it should call persist() instead

Created on 22 Jun 2020  Â·  32Comments  Â·  Source: quarkusio/quarkus

Describe the bug
Although we do _not_ work with detached entities, we ran into https://hibernate.atlassian.net/browse/HHH-9362 and/or https://hibernate.atlassian.net/browse/HHH-11414 for a rather simple persist scenario.
In the call stack we can see that the generated repository is calling merge() instead of persist().
This is most likely due to this code in StockMethodsAdder:

// call EntityManager.persist of the entity does not have it's ID set, otherwise call EntityManager.merge

BytecodeCreator idValueUnset;
BytecodeCreator idValueSet;
if (idType instanceof PrimitiveType) {
    BranchResult idValueNonZeroBranch = save.ifNonZero(idValue);
    idValueSet = idValueNonZeroBranch.trueBranch();
    idValueUnset = idValueNonZeroBranch.falseBranch();
} else {
    BranchResult idValueNullBranch = save.ifNull(idValue);
    idValueSet = idValueNullBranch.falseBranch();
    idValueUnset = idValueNullBranch.trueBranch();
}

idValueUnset.invokeStaticMethod(
        MethodDescriptor.ofMethod(JpaOperations.class, "persist", void.class, Object.class),
        entity);
idValueUnset.returnValue(entity);

ResultHandle entityManager = idValueSet.invokeStaticMethod(
        ofMethod(JpaOperations.class, "getEntityManager", EntityManager.class));
entity = idValueSet.invokeInterfaceMethod(
        MethodDescriptor.ofMethod(EntityManager.class, "merge", Object.class, Object.class),
        entityManager, entity);
idValueSet.returnValue(entity);

We _do_ pre-populate our IDs (custom strongly typed wrappers around UUID, e.g. UserId) so the generated code seems to think we have an update case of a detached entity, which is normally the only case when you really _need_ merge().
JavaDoc of EnityManager.merge():

Merge the state of the given entity into the current persistence context.

Expected behavior
Quarkus spring-data-jpa should only call merge() when really _required_, that is from my POV: merging a detached entity.
With a pre-populated id, the initial persist should be performed with persist(), not merge().

Actual behavior
Quarkus spring-data-jpa calls merge():

  • in a persist case when Id is pre-populated
  • in all update cases (Id is present)

To Reproduce
Steps to reproduce the behavior:

  1. Will create a reproducer if required but I am pretty swamped right now.

Configuration
n/a

Screenshots
n/a

Environment (please complete the following information):

  • Output of uname -a or ver:
    MINGW64_NT-10.0-18363 XXX 3.0.7-338.x86_64 2019-11-21 23:07 UTC x86_64 Msys
  • Output of java -version:
    openjdk version "11.0.7" 2020-04-14 OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.7+10) OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.7+10, mixed mode)
  • GraalVM version (if different from Java): n/a
  • Quarkus version or git rev: 1.4.2.Final
  • Build tool (ie. output of mvnw --version or gradlew --version): Apache Maven 3.6.3

Additional context
The original spring-data-jpa is more or less doing the same, but it is also checking the value of the @Version column:
https://github.com/spring-projects/spring-data-jpa/blob/2.3.1.RELEASE/src/main/asciidoc/jpa.adoc#entity-state-detection-strategies
See also:

On a side note: I don't know why the docs say

Option 1 is not an option for entities that use manually assigned identifiers

because the code clearly shows that the Id-check is skipped when a version column is present.

This would help in our case for the inital persist, but it would still use merge() for updates which means we might run into HHH-9362 and it would be "incorrect" since our entities are already attached when they are updated.

Alternatively, org.springframework.data.domain.Persistable.isNew() would also help, but again only for the intial persist. The interface is present in Quarkus spring-data-japa, but the method is not called.

What I am proposing here for the update case actually goes futher than what Spring is doing which might justify a configuration property to control that behaviour:
Only use merge() if EntityManager.contains() returns false.

arespring kinbug

Most helpful comment

Then we should turn it around: Call merge() unless contains() returns true.

I'm still confused :) That's going to break both merge and persist use cases:

It's a typical monday for me so I am probably confused too. ;-)

Later today I'll try to write down the rules I am thinking of. So maybe I'll come to the same conclusion eventually.

All 32 comments

/cc @geoand

/cc @Sanne (mainly for the Hibernate bugs and a broader view on the general topic)

/cc @tkalmar & @harthorst (team members)

Thanks for the analysis.

When we added this implementation, we indeed were following what regular spring-data-jpa was doing.

If @Sanne agrees that what you propose

What I am proposing here for the update case actually goes futher than what Spring is doing which might justify a configuration property to control that behaviour:
Only use merge() if EntityManager.contains() returns false.

is the proper thing to do, then yeah, I'd be willing to add a configuration property to change that would allow changing the behavior.

@geoand

What I am proposing here for the update case actually goes futher than what Spring is doing which might justify a configuration property to control that behaviour:
Only use merge() if EntityManager.contains() returns false.

is the proper thing to do, then yeah, I'd be willing to add a configuration property to change that would allow changing the behavior.

:+1: but please note that this is only working for updates! We'd still need that @Version handling (preferred) or Persistable.isNew().

I admit I don't know what the version thing does, but I don't think it would be a problem to add support for whatever it is :P

I agree with most of what @famod expects - except I'm confused by the last suggestion:

Only use merge() if EntityManager.contains() returns false

Should it not use persist for newly created objects which are to be persisted? They will not be managed yet, so they will not be contained by the current state of the EM.

I don't know what Spring does, nor what our spring-data-jpa modules do, but would recommend to clearly expose these as separate methods.

There's a reason we don't try to be smart in Hibernate: people should know what they expect to be used here.

@geoand

I admit I don't know what the version thing does

AFAICS the logic is reather simple: If the @Version field is present and its value is not yet set: use persist().

To clarify:

  • use persist() for new objects which need to be persisted
  • use merge() to update objects which are expected to exist (see its javadoc for more precision)

There's no safe way to be able to automatically disinguish the two use cases. Especially, it's not safe to check for the ID being set.

I don't know what Spring does, nor what our spring-data-jpa modules do, but would recommend to clearly expose these as separate methods.

spring-data-jpa just exposes save which is expected to work for both new and existing entities

@Sanne

Should it not use persist for newly created objects which are to be persisted? They will not be managed yet, so they will not be contained by the current state of the EM.

Yeah ok, it is not just about contains():

  • if version and/or id is set
  • and if contains() returns false: call merge()

PS: Persistable.isNew() has to be thrown into the mix as well, _if_ it is used at all.

Ah sorry, missed that part:

so they will not be contained by the current state of the EM.

That's of course true. Then we should turn it around: Call merge() _unless_ contains() returns true.

@famod do you want to right down the final rules so there can be full clarity on what we are talking about?

Then we should turn it around: Call merge() unless contains() returns true.

I'm still confused :) That's going to break both merge and persist use cases:

  • it's legal to _merge_ both detached and managed objects
  • _persist_ is always invoked on a non-managed object, it makes it managed (and assigns an Id if necessary)

Then we should turn it around: Call merge() unless contains() returns true.

I'm still confused :) That's going to break both merge and persist use cases:

It's a typical monday for me so I am probably confused too. ;-)

Later today I'll try to write down the rules I am thinking of. So maybe I'll come to the same conclusion eventually.

First, the rather clear intial persist case (just as the original spring-data does it):

Call persist() (in that order):

  • if a @Version field is present _and_ is not yet set
  • or the id field is not yet set
  • otherwise call merge()

That would cover the case of custom id types + @Version field.

Persistable.isNew() could (or maybe should?) be added later for the folks using custom id types but _not_ @Version. For me and my team this is only the second best solution since this means more work on the user side than just relying on @Version (which we have anyway).

Secondly, I had a small debugging and testing session with a Quarkus test (1.4.2.Final) and this confirmed what I have seen in older projects with older Hibernate versions:
You _can_ call persist() with already existing entities, as Hibernate will render an UPDATE for that.

So in case we extend the list from my previous comment as follows:

Call persist() (in that order):

  • if a @Version field is present _and_ is not yet set
  • or the id field is not yet set
  • or the entity is _not_ contained in the session <- new condition!
  • otherwise call merge()

then the only "side-effect" (AFAICS) is that an entity that is to be updated which has _not_ been loaded into the session is updated via persist().
The other way around (call persist() if contains() returns true) doesn't work as Hibernate will complain with an exception that the entity is already bound to the context.

This approach would lead to merge() only if the entity is already present in the session which is maybe also a bit strange.

IDK, I'll leave it to you guys for now.

So I assume that as a first step we should just add the @Version support that Spring Data proper has.

So I assume that as a first step we should just add the @Version support that Spring Data proper has.

Is this @Version a custom semantics or is it the JPA javax.persistence.Version ? If it's just the JPA one then we obviously support it in Hibernate.

I'm not too fond of this plan overall though. The semantics of this are complex and confusing, and there's still plenty of cases in which the operetion will not do what the intention might have been.

It seems like there _might_ be a reasonable solution in the case of versioned entities, or for entities for which the Id is NOT assigned by the end user; could we leverage the build-time analysis features of Quarkus to expose this feature exclusively for such entities which make this feature a safe choice?

@Sanne

So I assume that as a first step we should just add the @Version support that Spring Data proper has.

Is this @Version a custom semantics or is it the JPA javax.persistence.Version ? If it's just the JPA one then we obviously support it in Hibernate.

Default JPA, it seems:
https://github.com/spring-projects/spring-data-jpa/blob/2.3.1.RELEASE/src/main/java/org/springframework/data/jpa/repository/support/JpaMetamodelEntityInformation.java#L122

I'm not too fond of this plan overall though. The semantics of this are complex and confusing, and there's still plenty of cases in which the operetion will not do what the intention might have been.

I see your point. It depends on how similar Quarkus' spring-data-jpa should be to the original one.
I do see good reasons to strife for equality but maybe without inheriting too many questionable approaches.

Ok, it had forgot that the goal of this module is compatibility with the spring-data API.

Maybe a good middleground is to implement the reasonable conditions that @famod suggested, but:

  • check for the model compatibility at build time
  • warn explicitly about this being a rather bad idea, when the model is not compatible with such a strategy

I'll let @geoand decide :)

Yeah that sounds pretty reasonable to me - follow what Spring Data JPA does but also add proper warnings when we know things might not work that well.

Btw, as long as Persistable.isNew() is _not_ evaluated there should be at least a hint here: https://quarkus.io/guides/spring-data-jpa#what-is-currently-unsupported

I am not sure whether a warning could be implemented because Persistable.getId() is already in use.

Whoever is taking a stab at this: Would this be a candidate for 1.6.1/.2 or is it 1.7.0 only?

PS: I suppose 1.6.0 is a no-go.

I doubt I'll have time for 1.6, but you never know

On Tue, Jun 23, 2020, 19:36 Falko Modler notifications@github.com wrote:

Whoever is taking a stab at this: Would this be a candidate for 1.6.1/.2
or is it 1.7.0 only?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/quarkusio/quarkus/issues/10156#issuecomment-648277938,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/ABBMDP7JKJHPQAO74I4IFPLRYDKY3ANCNFSM4OEQNJSA
.

Actually, I'll take a look at adding support for @Version as this seems to be used quite widely and as already mentioned it is what regular spring-data-jpa does.

But I don't think I'll be doing anything more for this ticket unless we find a really good reason to

@geoand Are you going to extend https://quarkus.io/guides/spring-data-jpa#what-is-currently-unsupported as well?

There could be a link to the spring docs ("Entity State-detection Strategies") and some notes what is not (yet) covered or is covered differently.

10211 adds support for checking @Version and if it exists, using it to decide which EntityManager method to call.

@geoand Are you going to extend https://quarkus.io/guides/spring-data-jpa#what-is-currently-unsupported as well?

There could be a link to the spring docs ("Entity State-detection Strategies") and some notes what is not (yet) covered or is covered differently.

I am not sure what you want to add in the docs. Assuming we have #10211, what else would be missing from our spring-data-jpa stuff WRT state detection?

@geoand

I am not sure what you want to add in the docs. Assuming we have #10211, what else would be missing from our spring-data-jpa stuff WRT state detection?

Persistable.isNew()

Now I see what that does... I admit had never seen that one in the wild. We can surely support it, can you open a new issue for that?

@geoand

I admit had never seen that one in the wild.

Me neither and my team wouldn't use it as we have @Version which (after your PR) will guarantee a nice OOTB experience.

We can surely support it, can you open a new issue for that?

Done: #10231

Does it make sense to document the 1.6.0 status quo nonetheless? I could do that tomorrow.

Sure, that would be great

Was this page helpful?
0 / 5 - 0 ratings