This is a (multiple allowed):
[x] feature-discussion (RFC)
CakePHP Version: 3.5.12
I tried to call Entity::get() on a non-existent property. This method.
An example might be a Catalog, which has associated Content, which also has associated Advertising.
$catalog->get('content')->get('adverts')
My application failed with a fatal exception. Because I tried to access an empty property.
Cannot call get() on null
I would expect a way to allow for optional associations, without having to include lots of state checking into my templates.
if (!empty($catalog->get('content')) && !empty($catalog->get('content')->get('adverts')) {
// Output the adverts
}
This is what I am having to put into my templates to ensure that I can sidestep the potential fatal. Which, as you can imagine soon gets out of hand with many levels of associations.
I also have to consider my front-end developer colleagues whos primary language isn't PHP, and this could seem confusing. Or even worse, if they copy and paste incorrectly, they'll prevent an entire page from rendering.
I'm hoping for some discussion, but I have had a few chats in Slack about the issue. Usually born out of frustration because I forget to include all the isset() or !empty() checking, and hit a fatal.
The general gist of the conversation that @hmic and I had was the inclusion of an entity trait which would allow the entity to avoid the fatal and instead just return null and prevent any further get() calls from happening. This was only if the complete association was resolvable would anything be returned.
Marks idea was to create functional separation in the entity with something like get() and getIfExists(). Which would allow for more flexibility in how the developer wanted to interact with the entity. Alternatively adding an extra param to get($property, $defaultReturnValue) to allow for overriding.
Adding a new method is my favourite, as it maintains BC, and creates a nice functional separation in the api, with clearly defined behaviours, which are obvious from reading the code.
Please add any thoughts you have on this topic, and if you've struggled with this fatal exception yourself how you have coped with it in your templating.
getIfExists() is a bit verbose. What about read() or fetch() or query() or something like that. You will find a better verb ;-)
https://api.cakephp.org/3.5/class-Cake.ORM.Entity.html#_has
!$catalog->has('content') ? null : !$catalog->content->has('item') ? null : $catalog->content->item
I'd still go with a trait that implements __get() and returns whatever you want in failure case - probably an EntityInterface still with the trait attached, so your chain works till the expected end (and typecasts to an empty array or empty string or null or whatever)
You can use Hash::get or write a wrapper like this
<?php
namespace App\Model\Entity;
use Cake\ORM\Entity;
use Cake\Utility\Hash;
class AppEntity extendes Entity
{
function getDeep($property, $default = null) {
return Hash::get($this, $property, $default);
}
}
You would use it like $catalog->getDeep('content.adverts') and would return null if not exists
Another way around this is to use Twig templates. In a non-strict variables mode they smooth over these kinds off issues.
I am no fan of the no-strict modes however. That's something you run in production maybe but nothing I'd rely on as a feature TBH.
What would be wrong with a method?
What would be wrong with a method?
Nothing is wrong with a method. There was an attempt to add dot paths to get() but it didn't get merged because the set() changes were very complicated, and I wasn't a fan of breaking the symmetry of those methods.
read() makes sense IMO, it is consistent with other places where read()/find() are soft getters, and get() always a hard one (existing otherwise exception).
So this could work?
$catalog->read('content')->read('adverts', 'hello-world')
Would return null (by default) if $catalog has no 'content' else would return 'hello-world' if content has no 'adverts' but returns catalog.content.adverts is a continous existing path it just returns that value.
$catalog->read('content')->read('adverts', 'hello-world')
That is a very bad idea.
If you chain, you should never use a soft getter, otherwise you run into a null->objectMethod() fatal error which is worse than what you currently get.
So what about returning a NotExistingObject that just implements read and default values in case of no value found?
Not feasible, what if you call any other method on the entity that "should exist"?
In these cases maybe helpers or elements with conditional output might be better suited.
So you consider dot-path access the better way? though markstory hinted that it was deemd to complex to do?
Yes, that would be the better way - you can try giving this a shot for this specific use case.
@markstory maybe instead of breaking the symmetry just name it fetch() or read() (instead of adding more functionality to get()) which takes the dot notation and will return null if nothing is found?
Having a different name would avoid the symmetry issues that dot path get() has.
My vote would be for a fetch() or read() method which supports dot notation, and as @markstory says, avoids the symmetry break from adding dot support to get().
I think I'd prefer fetch() as it semantically can be differentiated from read(), as this data is being pulled out via classes and methods, whereas read() is used elsewhere, e.g. Configure::read(), which I think indicates it's more simply reading from a defined array.
I'd opt for read() with dot notation. Just like Configure::read() does too!
It's not so much about differentiating where and how the data gets pulled from (Configure supports more than just php array backends) but how to use it, to make a consistent API. Same usecase here, same method signature, same expected returns...
Sounds like a perfect fit!
@hmic ah, I stand corrected. In which case, yes, my vote would be for consistency with ::read().
@jeremyharris would that work with LazyLoad behavior anyway?
@inoas the lazy load plugin uses the entity ->get() method, so a new method would require adding support in the lazy load plugin unless the new method uses ->get() to fetch the values, which I'm guessing it will in order to respect accessors. I don't quite remember how the original PR did it.
Since I'm here, I'll toss a new method name in the mix: getSafe()/getDeep(). The only issue I have with using the work "safe" is that devs might assume that it's also escaping, but a docs could clear that up.
So we are back to read() then, which would be consistent to Session::read() and Configure::read() that allow the exact same syntax :)
@jeremyharris raised a good point about accessors. How do folks see accessors interacting with deep properties? Does each accessor in the property chain get invoked, or does only the leaf property have its accessor called?
Here's a link to the original RFC: https://github.com/cakephp/cakephp/pull/7847
@markstory I would expect it to call the accessors, which by the looks of the code in the RFC it was.
One thing I don't see as clear from the RFC is how it would work with an array, like a 1:n association. Maybe I'm just missing it..
I'd like to add to the open questions if it wouldn't only call accessors but also if it would work with virtual properties (which could very well act as a switch between associations?).
As for 1:n I'd say the dot notation should not do that because that could easily become 1:n:n:n etc.
Just let the author of templates handle it with conditions and loops?
Edit: Also there could be a mode toggle for only the leaf's accessor to be run or all accessors. Both can be desireable and especially for speed maybe you only want to run it for leafs but to make things work not in unexpected way, run it for the whole chain by default?
I think the dot notation handles 1:n just well, if you know the index, like:
book.chapter.0.page.2.headline
I don't think we should discuss optional or conditional accessor access too. It's how the entity interface is supposed to work and I'd not see the need to change it intentionally, except there are special cases that need handling anyways. If the author of a template chooses to implement additional loops and conditions, that's fine. but nothing this read() method would either be needed for, nor could potentially cater for in a sane way. Additionally,you can just use the Collection to get your custom handling done in many of these cases with ease.
I would expect the method to work on any entity, even if it's nested, making the following comparable.
$catalog->get('content')->get('adverts')[0]; // returns AdvertEntity
$catalog->get('content')->read('adverts.0'); // returns AdvertEntity
$catalog->read('content.adverts.0') // returns AdvertEntity
I do like the dot notation idea here, as it's clear, obvious and already established elsewhere in the framework.
As for the chain, wouldn't the method fail on the first property which doesn't exist, and then return either null or the configured default value?
@davidyell that's my understanding too, if the chain fails in any point, it should return null.
Ah I got you guys wrong. I meant that book.chapter.{n}.page.{n}.headline should not work, should that?
@hmic or null|void then void for that matter in php7.1/cakephp4?
Void with a getter? Why would that ever make any sense?
null is a valid value and you would not know if you got null or no-value. We could also return an object to test against that on __toString() becomes '' - but why not just use void to make it easy to check if a value exists - or add check() or exists()?
@inoas:
null is what you want in return if not found.
Additionally, like I told before - API consistency is king! The other common places where you use read() with dot notation act the same, returning null in these cases.
I actually don't see a usecase for void like @dereuromark mentioned too. Methods that are not supposed to return anything use void to indicate this, but you actually expect this to return something!
null is a valid value and you would not know if you got null or no-value. We could also return an object to test against that on __toString() becomes '' - but why not just use void to make it easy to check if a value exists - or add check() or exists()?
Hash::get (which configure::read() uses) does return null if the path does not exists or if the value is null
$array = [
'nullValue' => null,
];
Hash::get($array, 'nullValue'); // null
Hash::get($array, 'notExists'); // null
I prefer to stick with that approach
Also Hash::get does not support {n} notation, so it would be a better option than extract (also extract always return array)
Since this is mostly concerning the view layer, why not create a EntityHelper, or even just a short function __e(), similar to h(), __d().
Eg:
// instead of the following
if (!empty($catalog->get('content')) && !empty($catalog->get('content')->get('adverts')) {
// Output the adverts
echo h($catalog->content->adverts->name);
}
// we do this:
<?= __e($catalog, 'content.adverts.name') // return escaped `$catalog->content->adverts.name` or return empty string when content or adverts is missing ?>
We don't modify the entity class at all
Option 3.
Make the orm lazy loader default behaviour or a configurable option in core.
Sometimes tracking all your eager loaded objects is too much to track.
@xaphalanx that doesn't help for cases where the related record doesn't exist in the database, and also doesn't help in cases where you might not be chaining calls on an entity but instead something else.
@xaphalanx as the creator of the lazy loader, I would 馃憥 making it the default. It's not (yet...?) robust enough to handle 1:n efficiently, and making it the default would cause all sorts of "cake is slow" comments. Last thing I want is to be responsible for someone doing 1K queries where 3 would suffice :D
Please add any thoughts you have on this topic, and if you've struggled with this fatal exception yourself how you have coped with it in your templating.
There are two layers of complexity for me when I think about property checking: 1) associations are eager loaded so in my template I have to know what tables I've selected, and 2) I have to know what what columns were selected -- just because I have an entity doesn't mean it has all it's properties. If I have an entity with a isPublished() method, I already have to check for the existence of the necessary properties and throw an exception if they are not present before doing the logic to determine if it is published. If I didn't do that checking, I could return true/false erroneously if some other developer uses select() on the query. Given these 2 problems, I'm questioning complex entity property checking in a template as a strategy.
Here are somethings I've done instead of property checking when I've run into this issue:
If I am 3 layers deep in a view, I'd be thinking about the data and whether or not I am accessing it efficiently in the first place. Is there a join table I could use as the primary table in my query?
If I determined I really did need that data in that shape, then I'd think about running a reducer(s) to transform the data passed to the view -- setting defaults along the way -- and passing the result to the view.
A lazy loader in the case of a complex system where it is difficult to determine all the things needed upfront makes some sense to me as a stop gap solution.
I wouldn't want to debug a problem where I called get() and a default value set in a template was used instead of the valid value stored in the database because a developer used select() or failed to contain() the associated entities.
There is a lot of developer preference in this issue around whether calling a variable in a view that is not set should stop execution or not. I prefer stopping execution.
How about getOrFail() as not nullable and current get() as nullable one?
This way it is in sync with the rest of the classes and their API here.
Is there any interest in providing a PR for my recommended strategy?
If you feel strongly about clean code, you can also transform into DTOs and use those then to iterate through the content with all the type-safety and sanity checks in place.
I added a Trait idea to Shim plugin as enhancement PR:
https://github.com/dereuromark/cakephp-shim/pull/48
This would basically enable this. Should this be a core trait? Or remain as plugin?
How about getOrFail() as not nullable and current get() as nullable one?
@dereuromark null is valid value for entity
so getOrFail() should accepts the null IMO
probably we able to use getChild() naming for this purpose
getChild() should returns en entity , runTimeError/exception in other types
@dereuromark Your solution doesn't seem to cover nested associated data.
It does. Or how u you mean? You chain it.
Trait would require _try-catch_ which is even worse in templates.
In such case I use $entity->deep->deeper->deepest ?? null which also let you have autocomplete and typehints.
I think both @saeideng and @garas you misunderstand the key concept of the trait.
This is NOT about the nullable use case, but the non-nullable one. The use case where you expect (!) the values to be present.
Status quo: Any entity property can be null, any. If you forget to hydrate it (select without that field), then it will silently be null. The null pointer exception is usually worse than a clear exception thrown by the trait.
As such, this aims not to remove/replace nullable usage, but in fact adds support for proper and safe non-nullable (chaining) usage.
And by chaining, it also supports any depth of nested/associated data as well just fine.
It also pleases static analyzers like PHPStan or Psalm, as they now also know what to expect and how to check the usage of these entity properties in the scope of each use case.
I clarified the importance and how it completely works together nicely with IdeHelper in https://github.com/dereuromark/cakephp-shim/releases/tag/1.8.0
If we do not plan a core adaption here, we can close this ticket then.
Trait would require try-catch which is even worse in templates.
true
another suggestion can be this method
$entity->getX('child','child2','child3',...)
this will return the last child or null otherwise
and IdeHelper will unhappy with this method
You are free to further enhance the API or provide alternative traits/methods here, also for dot syntax for example: getOrFail('foo.0.attributes') etc.
But yeah, you need to be aware of all environment variables here, and for me IDE support is a major concern, as it should be also easy to use for developers, apart from the importance of static analyzers being able to precheck code for less mistakes/bugs.
I intentional did not suggest the dot syntax
I think we can not use dot syntax in entity's property
Entity::get() with paths has been attempted a few times in the past and it isn't a simple problem to solve.
I've never personally experienced this pain as I long ago switched to Twig based templates which automatically handle these null reference scenarios. Have other folks with this problem considered that as a solution?
This is not only for templates but also static analyzing. See my linked shim pr and examples.
This fixes usage across the whole app.
This is NOT about the nullable use case, but the non-nullable one.
This is outside the scope of this RFC in my opinion, as it's specifically concerned with the templates throwing exceptions. Which is why your trait would not be a good inclusion to the core for this RFC, as it is replicating the existing exception throwing present in the core already.
As we've been round in a few circles here with everyone having different ideas, I think, at least for me the best solution is to implement my own plugin. Which might be the best approach for other as well.
I will close this RFC now, as I feel like we're not really any closer to getting something implemented into the core, which was my hope for this RFC.
Thanks to everyone who contributed to the discussion. If the core team would still like to pursue the discussion do feel free to re-open the RFC.
Sry, I misunderstand the 2nd part of your request.
Please see https://github.com/dereuromark/cakephp-shim/pull/49 to complete the first trait now.
As per those methods, you can now choose in the template if you require the path to be present (required fields), or if you are OK with part of the path being null and silently just returning null here, skipping the !empty() checks in the middle.
That should now be in line what you suggested and what we discussed earlier here.
Most helpful comment
I would expect the method to work on any entity, even if it's nested, making the following comparable.
I do like the dot notation idea here, as it's clear, obvious and already established elsewhere in the framework.
As for the chain, wouldn't the method fail on the first property which doesn't exist, and then return either
nullor the configured default value?