Orm: Allow to not Compare Object Types by reference

Created on 10 Dec 2015  ยท  18Comments  ยท  Source: doctrine/orm

If we have a Doctrine DBAL type with an object (DateTime, any custom object), then its compared by reference and we cannot use mutable objects.

We should consider allowing a way to have mutable objects here and delegate the changeset compuutation to a service or do something clever.

Most helpful comment

I have a couple custom types, and it a bit annoying ๐Ÿ˜„ , so I got 2 ideas:

1) Since all mapping types comes from Doctrine DBAL, then compassion should be done also by Doctrine\DBAL\Types\Type.

Maybe can just add one method to the Doctrine\DBAL\Types\Type, which by default use strict === compassion, and for custom Type can be overridden.

// Default
public function compareValues($old, $new){
  return $old === $new;
}

// For DateTime
public function compareValues($old, $new){
  return (string) $old === (string) $new;
}

then UOW just do $type->compareValues($old, $new);

Is it expensive?

2) Another way, check whether we can cast object to string, and compare strings:

$isDifferent = is_object($old) && method_exists($old, '__toString') 
  ? (string) $old === (string) $new
  : $old === $new;
// Of course need to check that both $new and $old is not empty

Which is less expensive?

The first one, is more flexible, and gives some freedom for developers, but require changes in both DBAL and ORM. Second one is more simple (can be done only in ORM side), but it also less flexible.

All 18 comments

I proposed comparing the DB-side representation in the past, but it may be
expensive...
On Dec 10, 2015 11:23, "Benjamin Eberlei" [email protected] wrote:

If we have a Doctrine DBAL type with an object (DateTime, any custom
object), then its compared by reference and we cannot use mutable objects.

We should consider allowing a way to have mutable objects here and
delegate the changeset compuutation to a service or do something clever.

โ€”
Reply to this email directly or view it on GitHub
https://github.com/doctrine/doctrine2/issues/5542.

@Ocramius maybe we can enable this comparison based on a setting, so its only done for fields where its necessary.

I expect that if someone has something like a DateTime or ColorCode, they will _always_ want it to be compared via the same mechanism, regardless of which entity it appears on. So then it's the type of the field which implicitly determines the comparison-rules.

How about making it so that an EntityManager always has an, uhm, EntityScalarComparator entities, which it asks for answers about "did this change" query, even when both inputs are strings. An additional PHP method-call in the stack is a small price to pay for separating out the complexity.

Then EntityScalarComparator is where you might , say, have an option to change how DateTime is compared, or a mechanism to chain/overlay another comparison service to support your own custom objects.

@Ocramius maybe we can enable this comparison based on a setting, so its only done for fields where its necessary.

Possibly, but I'd still look at it after @guilhermeblanco is done with digging through JPA's field mappings.

Then EntityScalarComparator is where you might , say, have an option to change how DateTime is compared, or a mechanism to chain/overlay another comparison service to support your own custom objects.

The idea is to generate specific comparators per entity (you pretty much always compare entity to entity, not just single fields).

The idea is to generate specific comparators per entity (you pretty much always compare entity to entity, not just single fields).

I interpreted @beberlei as describing a different scenario, where there is only one Doctrine Entity, and someone has used a custom DBAL mapping type for a specific field on it. Then the issue is with how the ORM does dirty-checking on each of the fields.

For example, suppose someone creates a BitMask DBAL type that saves/loads from an SQL integer column. There are two possible failure-cases:

  1. If the BitMask instance is mutable, then the ORM might fail to commit changes, because as far as it is concerned the field-object always === itself. That's why we "cannot use mutable objects".
  2. If the BitMask (mutable or otherwise) is replaced by another instance which happens to have _identical content_, then the ORM will consider the Entity dirty, even if $mask1->equals($mask2)

The underlying behavior is like:

if($oldVal !== $curVal){
    // Yes, add to computed changeset
}
// No, continue comparing other properties

But for that particular class of field-representing-object, the designer might want custom behavior more like:

if($oldVal !== $curVal){
    if($helper->areTheyEquivalent($oldVal,$curVal)){
        // Yes, add to computed changeset
    }      
}else{
    if($helper->isItMutableAndDirty($oldVal)){
        // Yes, add to computed changeset
    }
}
// No, continue comparing other properties

I have a couple custom types, and it a bit annoying ๐Ÿ˜„ , so I got 2 ideas:

1) Since all mapping types comes from Doctrine DBAL, then compassion should be done also by Doctrine\DBAL\Types\Type.

Maybe can just add one method to the Doctrine\DBAL\Types\Type, which by default use strict === compassion, and for custom Type can be overridden.

// Default
public function compareValues($old, $new){
  return $old === $new;
}

// For DateTime
public function compareValues($old, $new){
  return (string) $old === (string) $new;
}

then UOW just do $type->compareValues($old, $new);

Is it expensive?

2) Another way, check whether we can cast object to string, and compare strings:

$isDifferent = is_object($old) && method_exists($old, '__toString') 
  ? (string) $old === (string) $new
  : $old === $new;
// Of course need to check that both $new and $old is not empty

Which is less expensive?

The first one, is more flexible, and gives some freedom for developers, but require changes in both DBAL and ORM. Second one is more simple (can be done only in ORM side), but it also less flexible.

Both are extremely expensive, since an additional set of method calls per field per value per stored UoW change is to be performed. This is going to massively affect ORM performance, and can only be done if the checks can be compiled into the UoW.

okay, more complicated than I thought

can only be done if the checks can be compiled into the UoW

~what it means? sorry, not very understood~
hm, okay I think I understood

(string) $old === (string) $new

yeah that can be more expansive, because __toString can contain extra complex logic,
then Doctrine\DBAL\Types\Type looks more simple betwen these both, but performance still depend from what developer put inside Doctrine\DBAL\Types\MyCustomType::compareValues().

hm, okay, not perfect

I have made tests.
The full source of the test is there https://github.com/Fedik/doctrine-changeset-test
_(all time in seconds)_

default

Dummies 10000 items
Make dummies: 3.6029109954834
Compute changes: 0.1790030002594
Recompute changes: 0.22904109954834

method_exists($orgValue, '__toString')

Dummies 10000 items
Make dummies: 3.6226229667664
Compute changes: 0.18751001358032
Recompute changes: 0.23972296714783

_note: DateTime does not have __toString so it ignored here_

Doctrine\DBAL\Types\Type::isValuesIdentical($val1, $val2)

Dummies 10000 items
Make dummies: 3.6306829452515
Compute changes: 0.22055912017822
Recompute changes: 0.24937987327576

The simple objects like Point do not make huge difference, it around the same with default.
But more complex Objects can take some more time, that true.

From my point of view Doctrine\DBAL\Types\Type::isValuesIdentical($val1, $val2) is good enough to accept (but still not perfect ๐Ÿ˜‰ )

if I have missed something in my test, please tell me.

Hey Fedir,

Please use phpbench when writing tests, as it considers standard deviation
and other factors.

I see a 23% slower API there - that's not gonna be OK.

Marco Pivetta

http://twitter.com/Ocramius

http://ocramius.github.com/

On Wed, Nov 8, 2017 at 9:15 PM, Fedir Zinchuk notifications@github.com
wrote:

I have made tests.
The full source of the test is there https://github.com/Fedik/
doctrine-changeset-test
(all time in seconds)
default

Dummies 10000 items
Make dummies: 3.6029109954834
Compute changes: 0.1790030002594
Recompute changes: 0.22904109954834 <02290%204109954834>

method_exists($orgValue, '__toString')

Dummies 10000 items
Make dummies: 3.6226229667664
Compute changes: 0.18751001358032
Recompute changes: 0.23972296714783 <02397%202296714783>

note: DateTime does not have __toString so it ignored here
Doctrine\DBAL\Types\Type::isValuesIdentical($val1, $val2)

Dummies 10000 items
Make dummies: 3.6306829452515
Compute changes: 0.22055912017822 <02205%205912017822>
Recompute changes: 0.24937987327576 <02493%207987327576>

The simple objects like Point do not make huge difference, it around the
same with default.
But more complex Objects can take some more time, that true.

From my point of view Doctrine\DBAL\Types\Type::isValuesIdentical($val1,
$val2) is good enough to accept.

if I have missed something in my test, please tell me.

โ€”
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/doctrine/doctrine2/issues/5542#issuecomment-342945485,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAJakBufPuBWKlnXk1arnQ6ziGktAqBYks5s0gvegaJpZM4Gym55
.

Well, this an interesting ticket from the past. Looking at my previous post, am I correct in assuming that the use-case we're all talking here about involves checking the dirty/changed status for custom DBAL types that represent a column? In other words, the same problem causing the warning that "DateTime changes are detected by Reference"?

@Ocramius I'm not sure that the performance outlook is all that dire, because in practice the ORM can skip the method call for a majority of entity-properties.

Suppose there's are some new potential methods on Doctrine\DBAL\Types\Type:

  • useCustomDirtyCheck() should always return either true or false
  • isDirty($obj1, $obj2) only matters if custom check is used
  • markClean($obj1) only matters if custom check is used

The output from the first method can easily be cached, so that most cases (ex: StringType, BooleanType, IntegerType) only need the existing strict-quality test.

To sketch it out:

/** @var $type \Doctrine\DBAL\Types\Type **/
/** @var $type_has_custom_check boolean **/
if(!$type_has_custom_check){
    // This if-statement is how UnitOfWork currently operates
    if ($orgValue === $actualValue) { 
        continue;
    }
}else{    
    if ($type->isDirty($orgValue, $actualValue)) {
        continue;
    }
}

So ~95% of the time the only additional cost is checking a boolean value, one that is effectively a constant for the lifetime of the process.

The other 5% of the time, like when using MoneyType, the ORM knows it must delegate the job of figuring out whether the entity-property is dirty to MoneyType::isDirty($orgValue, $actualValue). (Obviously it's up to the designer of Money and MoneyType to make sure both of them work together correctly.)

P.S.: Originally I got stuck thinking of this as an equality test, but it really isn't, because both arguments to isDirty() could easily be the exact same object, and what we really want to know is whether that object was altered since it was loaded from the database. I also added markClean() which would need to be run after successfully updating/inserting the entities.

@Fedik My concern about isValuesIdentical is that while it works for immutable value-objects, it doesn't work for mutable types. Supposing Point was invented to be mutable, someone could go:

$entity->point1->travel($compassHeading, $distance)

And later on the isValuesIdentical() test would get the same object for both inputs, and falsely indicate that nothing needed to be saved.

Note: Issue with mutability also applies to Embeddables, but on entirely different level of implementation. If there's to be some kind of a Comparison API, it'd be good to share the core idea between Types and Embeddables.

Any movement on this issue? Has any decision on the way forward been made?

@surelygroup nothing moving for now.

DHager's proposed solution in https://github.com/doctrine/orm/issues/5542#issuecomment-343024280 would solve our use-case too. And from a performance point of view it seems it could be reasonable.

While adding an extra check on each field will cost _something_, we should not forget the cost of the SQL query itself. In our use-case we have custom Point type. All entities with a field of that type will always execute and UPDATE statement. In many cases this is useless and it is a huge performance hit (compared to a function call).

The fact that there is always a changeset also happen to trip up our "last updated" mechanism. An entity with with a custom field of Point will always be marked as last updated "now", even if nothing actually changed. I guess there could be workarounds for that particular issue (by manually re-checking the changeset for actual changes), but it seems to me that Doctrine should not notify us of change if nothing is actually changed.

@Ocramius would you accept a PR (actually two, for DBAL and ORM) that implements DHager's solution ?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

strayobject picture strayobject  ยท  4Comments

doctrinebot picture doctrinebot  ยท  3Comments

doctrinebot picture doctrinebot  ยท  4Comments

Inmmelman picture Inmmelman  ยท  3Comments

doctrinebot picture doctrinebot  ยท  4Comments