Silverstripe-framework: RFC: HasManyList::remove(ByID) methods should delete the records as well as unlink them

Created on 10 Jan 2019  Â·  22Comments  Â·  Source: silverstripe/silverstripe-framework

Affected Version

^4.0

Description

When calling $someObj->SomeHasOneRelation()->removeAll(), I'd expect that the records are deleted from the database.

Instead, they are unlinked (foreign key removed) and left in the database:

https://github.com/silverstripe/silverstripe-framework/blob/4.3.0/src/ORM/HasManyList.php#L98-L139

This RFC is to follow up on the comment @todo Maybe we should delete the object instead? to suggest that yes, we should delete the object instead.

Options

  • Option 1: Remove the todo, make remove() delete the records. This change might be unexpected for some devs who are used to this behaviour, since it's existed since for ages: https://github.com/silverstripe/silverstripe-framework/commit/050e0675ce0ad2520e34f98352593236a599dd60#diff-dea426b15b401bfb781b57db6b785a45R64
  • Option 2: Add new method deleteAll() to DataList which is explicit in that it deletes all records. Listing this since it's an option, but I don't like it. The base behaviour of DataList::removeAll() is to delete all the records as well as remove them from the DataList instance, it's HasManyList which deviates from this logic in its own implementation.

Steps to Reproduce

diff --git a/tests/php/ORM/HasManyListTest.php b/tests/php/ORM/HasManyListTest.php
index 61d42178e..108095b7e 100644
--- a/tests/php/ORM/HasManyListTest.php
+++ b/tests/php/ORM/HasManyListTest.php
@@ -78,6 +78,25 @@ class HasManyListTest extends SapphireTest
         $this->assertEmpty($team2comment->TeamID);
     }

+    public function testRemoveAllDeletesRecords()
+    {
+        /** @var DataObjectTest\Team $team */
+        $team = $this->objFromFixture(DataObjectTest\Team::class, 'team1');
+
+        $this->assertCount(2, $team->Comments(), 'team1 fixture should have two comments');
+
+        $commentIds = $team->Comments()->column('ID');
+        $team->Comments()->removeAll();
+        $this->assertCount(0, $team->Comments(), 'team1 comments should be removed');
+
+        $commentsFromDatabase = DataObjectTest\TeamComment::get()->byIDs($commentIds);
+        /**
+         * The assertion below is currently failing, since the expected behaviour in HasManyList is to remove the
+         * relationship but leave the record in the database.
+         */
+        $this->assertCount(0, $commentsFromDatabase, 'Comments should be deleted from the DB');
+    }
+
     public function testDefaultSortIsUsedOnList()
     {
         /** @var Company $company */

1) SilverStripe\ORM\Tests\HasManyListTest::testRemoveAllDeletesRecords
Comments should be deleted from the DB
Failed asserting that actual size 2 matches expected size 0.

Related changes

  • 050e0675ce0ad2520e34f98352593236a599dd60: ENHANCEMENT: Created HasManyList and ManyManyList objects that represent relations (22/11/2009)
  • c74f7e764003f7b9cc2087d49f63dd73ca40ac4f: BUG Fixes issue where items could be deleted from a has_many relation by an entirely unrelated HasManyList calling delete on that item (12/07/2013) - note: not directly related, but in the general area of removing HasManyList relations
affectv4 efforeasy feedback-requirecore-team impacmedium rfdraft typenhancement

Most helpful comment

This feels like you are throwing away one of the major benefit of using a many_many, which is that it allows you to independently manage an objects' relations independently of the objects themselves. I would personally be horrified if an upgrade to silverstripe started deleting my related records, when they previously (and intentionally) unlinked them.

The only semantic issue I see with doing this would be that DataList::removeAll() and DataList::deleteAll() would both do the same thing. It'd be HasManyList that differs. We could make this clear in docs and change the removeAll() behaviour to not delete in 5.x?

Better solution: Deprecate remove() / removeAll(). Add unlink / unlinkAll / delete / deleteAll. Since remove / removeAll() is already polluted with ambiguity, I would advise against re-using it with a different behaviour.

For has_many relations, unlink will set the foreign ID to 0. For many_many, it will remove the mapping. For delete, it will delete both the record, as well as the mapping table (for many_many). For arraylists, unlink removes them from the list, and delete deletes the record. Consistency!

Not sure how to define unlink for DataList. Maybe it isn't supported by non-relational lists? I would guess that delete() / deleteAll() would remove the record, unlink() / unlinkAll() would be no-op (to prevent it deleting the source record).

This is the same approach we used to the recursive / single publish() problem for versioned. There was a method that was ambiguous; so instead of changing the behaviour, we deprecated it and added two methods that were un-ambiguous, publishRecursive() / publishSingle() :)

All 22 comments

Makes sense to delete it to me (option 1). I expect RemoveAll is a fairly rarely used method so the impact might not be _that_ high.

You might be surprised 😆this isn't the first time I've had to work around this issue

I'd be a bit anxious about changing the behaviour of removeAll() on the 4 branch. I'd be OK with introducing a deleteAll() to complement it.

What about introducing a parameter on removeAll() to delete records too, defaulting to false in 4.x and to true in 5.x?

There's a quite a difference between unlinking records and deleting them. I think it warrants having 2 distinct methods.

The only semantic issue I see with doing this would be that DataList::removeAll() and DataList::deleteAll() would both do the same thing. It'd be HasManyList that differs. We could make this clear in docs and change the removeAll() behaviour to not delete in 5.x?

You mean like if it's called on an ArrayList?

I guess an ArrayList wouldn’t do anything differently, but as mentioned in the issue DataList removeAll already deletes its records by default whereas HasManyList doesn’t

This feels like you are throwing away one of the major benefit of using a many_many, which is that it allows you to independently manage an objects' relations independently of the objects themselves. I would personally be horrified if an upgrade to silverstripe started deleting my related records, when they previously (and intentionally) unlinked them.

The only semantic issue I see with doing this would be that DataList::removeAll() and DataList::deleteAll() would both do the same thing. It'd be HasManyList that differs. We could make this clear in docs and change the removeAll() behaviour to not delete in 5.x?

Better solution: Deprecate remove() / removeAll(). Add unlink / unlinkAll / delete / deleteAll. Since remove / removeAll() is already polluted with ambiguity, I would advise against re-using it with a different behaviour.

For has_many relations, unlink will set the foreign ID to 0. For many_many, it will remove the mapping. For delete, it will delete both the record, as well as the mapping table (for many_many). For arraylists, unlink removes them from the list, and delete deletes the record. Consistency!

Not sure how to define unlink for DataList. Maybe it isn't supported by non-relational lists? I would guess that delete() / deleteAll() would remove the record, unlink() / unlinkAll() would be no-op (to prevent it deleting the source record).

This is the same approach we used to the recursive / single publish() problem for versioned. There was a method that was ambiguous; so instead of changing the behaviour, we deprecated it and added two methods that were un-ambiguous, publishRecursive() / publishSingle() :)

It's fine to remove @todos and not do them by the way. :)

But unlink deletes :trollface: http://php.net/manual/en/function.unlink.php

I agree with @tractorcow's idea. The remove methods aren't governed by an interface at this point, so we should be able to do it in a minor 4.x release

Ping @silverstripe/core-team for feedback please

A@robbieaverill my vote is either

1) @maxime-rainville's suggestion remove($x), removeAll, removeByID removes an item from a list, $list->delete($x) deleteAll() deletes the object.

2) Leave it as is and document it on the method. Deleting the object should be done other ways.. (e.g $this->List()->find(2)->delete();

Maybe we could do both, but target the new methods at SS5

I think a list shouldn't be able to delete an item (from existence), only removing them from the collection — so agreeing with what @wilr said in his second point.

My vote is for leaving as is, and I don't think we should expand the API, either. HasManyList is a relation list. It's for managing relations. If you want to manage the state of universal data, you have DataList. I don't think we should overload the single purpose of these classes.

The issue from my perspective is that both has-many and many-many may be either compositional or relational.:

  • Car has_many Wheel is compositional
  • Car has_many Passenger is relational

In has-many: For a compositional relationship remove() should delete. But for a relational relationship it should not.

Compositional relationships for many-many are less common, but you do occasionally get them – e.g. a content block system where a block could be placed on multiple pages, but there's not a separate block manager for seeing blocks divorced from pages.

Using that example to describe compositional many-many relationships, if you remove a block from a page you want to delete the block iff it's no longer linked to any other pages.

The fact that there are these 2 kinds of relationships for which we have 1 representation is why we get arguments about what the behaviour could be.

But wait, haven't we had this discussion?

Yes! cascade_deletes was added as a way of determining which kind of relationship you have. It was a big topic during versioning/ownership discussions.

So I think that:

  • HasManyList->remove() should delete the record iff the relationship is listed in cascade_deletes
  • ManyManyList->remove() should delete the record iff the relationship is listed in cascade_deletes and no other records link to the object-to-be-deleted via that relation

Is it an API change

No. cascade_deletes is new in SS4, and used very frequently, and the behaviour being described is a logical piece of the idea of cascading deletes. So it's a bugfix IMO.

In light of @unclecheese's comment, we could still update the docs to clarify the current behaviour.

In light of @sminnee's comment above (good point), we could add new APIs that are explicitly deleting the relationship, whether it's relational or compositional. We could also/instead make both lists delete if cascade_deletes matches up with it as suggested. I'd be fine with both approaches but I think the new API would cause less accidental headaches for devs to be honest.

In light of @sminnee's comment above (good point), we could add new APIs that are explicitly deleting the relationship, whether it's relational or compositional.

My recommendation was that we use cascade_deletes for this purpose rather than creating another API. I feel like having owns, cascade_deletes, cascade_duplicates and (for example) is_compositional is overkill.

+1 for solving this through cascade_deletes in a minor release. There'll be context-specific cases where the same relationship requires either deleting or unlinking, which can't be solved with a per-relationship configuration property like this. But then you should be expected to do more legwork as a developer (collect object identifiers, call HasManyList->removeAll(), loop through the collected identifiers and explicitly call DataObject->delete()).

I've had to look this behaviour up quite some times so my 2cts as non-native English speaker;
I find the remove() ambiguous - remove from what? The list, the database, the filter query?
If I'd be coming fresh to the API going with the car analogy, semantically:

  • I'd expect remove() to remove stuff (be that wheels or passengers) from an entity/list (the car in this instance)
  • (Actually unlink would be more accurate, but that's already for file deletion)
  • I'd expect delete() to, well, delete stuff

Most intuitive would seem if there'd be a delete() and a supplemental remove() method on relation lists, possibly with the ambiguous remove(true) being deprecated. Remove would seem a logical semantical opposite (Dutch: tegenovergesteld) to ->add(). It would then seem logical you can't remove() from a (non-relational) DataList, nor an object itself (only delete()).

Screenshot 2021-02-20 at 07 10 14

Was this page helpful?
0 / 5 - 0 ratings