Silverstripe-framework: RFC: Deprecate support for enums

Created on 24 Sep 2018  路  10Comments  路  Source: silverstripe/silverstripe-framework

Affected Version

Deprecate in 4.x and remove in 5.x

Description

Support in SilverStripe for enum DB fields is one of the main blockers for moving to use doctrine for example.

Some other reasons why we shouldn't use enums:

  • Semantic reasons, e.g. that the data storage is actually column metadata where it should be in a field
  • They aren't extensible - user code can't add new values to enum fields (without overloading them entirely)

We use enums for some quite fundamental parts of the framework, e.g. the ClassName field. We can replace this with application level logic and constant values instead.

Pros

  • ~Modernize the framework a little more~
  • Unblock #6632 a bit

Cons

  • Developers probably like using them because we've made them easy to configure
affectv4 affectv5 changminor efformedium impachigh rfdraft

Most helpful comment

I would suggest that we start by introducing a config option, eg. DBEnum.use_native_enum = true. If you set it to false then it will add a varchar + constraint to the database instead of an enum.

This will let enthusiasts try this feature out before we take the step of deprecating the current implementation, which personally I would only do if we were pretty far down the track of implementing DBAL

All 10 comments

Modernize the framework a little more

Can we elaborate on what alternative to enum we are advocating and why this would be a step forward? I agree that this would unblock #6632, and it would make our PGSQL implementation a little more consistent, but I'm skeptical of other benefits, and "modernize" as a benefit is a bit of a truism.

Using ClassName as a specific example

ClassName = varchar + Index + A separate constraint
Our constraint is managed separately, but provided this isn't a performance degradation, maybe this is okay. dev/build adjustments of allowed items might be quicker (assuming an enum is slower to modify than a constraint), but that requires testing. I don't see how this is a clear improvement, but it seems fine.

EDIT: The most significant change that this would make is the data usage of each record, which on big tables can get significant. Which may end up pushing us in the direction of the other 2 solutions, although one good thing we could do is avoid including the ClassName field on tables that don't support multiple classes.

And then there are some solutions which I think would be a backward step:

ClassName in a separate table and ClassNameID in the parent table
Now we need extra joins or preparatory getClassNameIDs() queries. Extra joins would be by far the worst of the two, slowing down every part of SilverStripe. Manual interrogation of the database in a MySQL is a bit harder, you can't see the enum values in your root table. Technically this is denormalised data but in my view excessively so.

ClassName is an integer an the lists are maintained in PHP
A lot of the problems of the previous solution. The getClassNameIDs() query is avoided but instead this list needs to be maintained in PHP/server-side somehow.


So, in my view, this is solely justified by #6632.

Let's leave that as the sole justification for now then =)

You can pry my _Enums_ from my cold, dead hands!!! :scream:

On a serious note, if it's holding up #6632 it sounds like a worthwhile sacrifice.

Sam's ClassName = varchar + Index + A separate constraint comments sounds sensible.

I'm wondering if we want to keep some sort of virtual Enum construct, that constrains the values you can store in your field at the Framework level without actually using the Enum type in the DB? Or would that be counter intuitive and/or defeat the purpose of the change?

Yes I think application level support would be a good thing to let people still use them without affecting the DB, it would also be extensible.

Doctrine have some examples of how we could do this in core: https://www.doctrine-project.org/projects/doctrine-orm/en/2.6/cookbook/mysql-enums.html

Yeah, my immediate thought when I saw this blocker in #6632 was varchar with ORM-validation. It鈥檇 essentially be a convenience wrapper to save you writing your own validator, but I think that鈥檚 fine. I鈥檓 not familiar with constraints at a database-level, are they well supported/supported in PGSQL/SQLite etc?

I鈥檓 trying to think if there are any drawbacks to not having a list of available values when inspecting a column. I can鈥檛 really think of a use-case for that, so it鈥檚 probably not worth worrying about - perhaps if we implement database-level constraints you could grab them from there...

My opinion is that we should remove native DB enum support but have an application layer support for them. That means we still retain the "Enum" DB Field type in the ORM we just don't construct an enum in the database. We can still define a list of allowed values for a field but we just don't store that list in the DB.

In my view, one of the biggest drawbacks of Enums is that it causes data corruption on dev/build when the enum values are changed, which makes rolling back database changes a huge pain.


Just as an aside, there are a lot of things blocking #6632 including:

  1. Close coupling to autoincremending DB IDs (that have to be pulled from the DB)
  2. mIXeD case identifiers (they need to be all lower case)
  3. a great love of writing, manipulating and injecting SQL in our core eco-system and modules (aka - DataQuery constructs SQL instead of a representation of a query in our domain language)

If we do address this we should also look at #1387, which is to say, we want to ensure that constraints get re-written when allowed elements change, or remove it altogether.

I have to say that removing all constraints from the database and just storing a varchar, relying on PHP to maintain data integrity feels like a step backwards:

  • Less integrity at the database level
  • More bytes per row, making any indexes longer聽鈥撀爐his can get material if we store the FQCN in ClassName as we currently do, although perhaps we can reduce this impact but assuming that names are 7-bit safe, and sizing the field to based on the values allowed.

I would suggest that we start by introducing a config option, eg. DBEnum.use_native_enum = true. If you set it to false then it will add a varchar + constraint to the database instead of an enum.

This will let enthusiasts try this feature out before we take the step of deprecating the current implementation, which personally I would only do if we were pretty far down the track of implementing DBAL

Just dropping back in to list another disadvantage of enums. When deploying changes to an existing SS4 site, the ChangeSetItem.ObjectClass column has to be updated to reflect any new/removed versioned DataObjects. It鈥檚 _really_ slow when you have a lot of changeset records...

https://github.com/silverstripe/silverstripe-framework/issues/6632 is not something we're planning on doing any more, so this has less drive than it had.

I like @sminnee's last idea of allowing config based opt-in.

I don't think there's been enough interest in this RFC since I opened it two years ago to pursue it any further, so I'm going to close it. Thanks to everyone for the input!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ScopeyNZ picture ScopeyNZ  路  5Comments

dnsl48 picture dnsl48  路  6Comments

Cheddam picture Cheddam  路  3Comments

kinglozzer picture kinglozzer  路  4Comments

maxime-rainville picture maxime-rainville  路  3Comments