EnTT: range-based `registry::insert_missing` and `registry::remove_if_exists`

Created on 11 May 2020 · 32 Comments · Source: skypjack/entt

I think I found a legitimate use-case for the two above methods.

When you have relationship components which contain a std::vector<entity>, it's often not practical to generate a view that matches all entities of the vector and, in the first case, excludes those that already have the component or, in the second case, returns only those that have it. Currently you have to iterate and add/remove the components individually.

enhancement

sunbubble

Most helpful comment

Well, the documentation came with the very first version of EnTT and I wasn't sure that this behavior wouldn't change at the time.
Nowadays I suspect this won't change any time soon though! :smile:
So, yes, we can make it clear in the documentation, of course. Probably we should open a different issue for that btw.

skypjack on 15 May 2020

👍2

All 32 comments

Is insert_missing the equivalent of emplace_or_replace?

skypjack on 13 May 2020

Is insert_missing the equivalent of emplace_or_replace?

~Yes~

edit

Actually, yes and no. There are two use cases: a range-based emplace_or_replace, where you'd change all components, and insert_missing, where you'd only emplace the component if none is present (similar to a range-based get_or_emplace).

Sorry for the confusion.

sunbubble on 13 May 2020

Methods of this kind are proliferating all over.
I'm wondering if there is a way to define a sort of _for-each_ method that accepts a predicate, so as to make all this as generic as possible...

skypjack on 13 May 2020

@sunbubble Do I get right what you want? You want to make a call like entt::insert_missing< Comp1, Comp2 > (container.begin (), container.end ()) and have components assigned only to those entities from the container that don't already have them?

Innokentiy-Alaytsev on 13 May 2020

👍1

A very raw and broken proposal:

// do_stuff is the proposed function; it does not exist in EnTT
reg.do_stuff< Position, Crap, Cat, Kitten, Pizza, Vodka > (begin, end, entt::overloaded{
// generic case: toggle the component (assign it if missing, remove it otherwise)
[&reg](entt::entity ent, auto component) {
    using ComponentType = std::remove_pointer_t<decltype (component)>;

    if (!component) {
        reg.assign< ComponentType > (ent);
    }
    else {
        reg.remove< ComponentType > (ent);
    }
},
// special case for Position: update it from Velocity when both are present
[&reg](entt::entity ent, Position* component) {
    if (component && reg.has< Velocity > (ent)) {
        reg.patch< Position > (ent, *component + reg.get< Velocity > (ent));
    }
}
});

entt::overloaded here is the thing you already have somewhere in EnTT.

I'm not sure about the best signatures here. Just typed the shortest thing.
You could provide a view as an argument to the callback to enable faster lookup of the components of the same type, or the registry itself.
The pointer is there to make it possible to distinguish between an existing component and a non-existing one. I expect this to be internally called on a per-pool basis for each of the listed components.

I think some things like exclusion lists and all that can be added atop of that.

I don't think it's a good idea to separate predicates and actions. At least it should not be the only available option, because having a single function is simpler than having two in most cases.

Innokentiy-Alaytsev on 14 May 2020
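
For readers unfamiliar with the helper mentioned in the proposal above, here is a minimal standalone sketch of the overloaded idiom and of how a null pointer can signal a missing component. It does not use EnTT: the overloaded struct is a local re-implementation of the common C++17 idiom, and make_visitor, trace and example are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>

// Local equivalent of the usual "overloaded" helper (C++17); an assumption, not EnTT's own header.
template<typename... Func>
struct overloaded: Func... {
    using Func::operator()...;
};

template<typename... Func>
overloaded(Func...) -> overloaded<Func...>;

struct Position { float x; };
struct Velocity { float dx; };

// Hypothetical visitor: a null pointer means "the entity has no such component".
inline auto make_visitor(std::string &trace) {
    return overloaded{
        // generic fallback, used for any component type without a dedicated overload
        [&trace](auto *component) { trace += component ? "has;" : "missing;"; },
        // dedicated overload, preferred for Position because it is not a template
        [&trace](Position *component) { trace += component ? "position;" : "no-position;"; }
    };
}

// Usage: one existing Position and one missing Velocity.
inline std::string example() {
    std::string trace;
    const auto visit = make_visitor(trace);
    Position position{1.f};
    Velocity *missing = nullptr;
    visit(&position); // picks the Position overload
    visit(missing);   // falls back to the generic overload
    assert(trace == "position;missing;");
    return trace;
}
```

The single callable receives both the "present" and the "absent" case, which is the point made above about not separating predicates from actions.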

@sunbubble Do I get right what you want? You want to make a call like entt::insert_missing< Comp1, Comp2 > (container.begin (), container.end ()) and have components assigned only to those entities from the container that don't already have them?

Exactly right 👍

Imagine you want to add a countdown timer to a bunch of entities stored in an STL container, but you don't want to alter their timer component if they already have one.

For the range based emplace_or_replace, it's to avoid this pattern:

// create B
reg.view<const A>(entt::exclude<B>).each([&reg, &f](entt::entity e, const auto& a){ reg.assign<B>(e, f(a)); });

// update B
reg.view<const A, B>().each([&f](const auto& a, auto& b){ b = f(a); });

sunbubble on 14 May 2020
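
A minimal sketch of the semantics requested above, assuming a toy component pool built on std::unordered_map rather than EnTT's storage. The names entity, pool, insert_missing, remove_if_exists and timer are hypothetical stand-ins chosen for this example; none of them is an existing EnTT API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy stand-ins for illustration only; these are NOT EnTT types.
using entity = std::uint32_t;

template<typename Component>
using pool = std::unordered_map<entity, Component>;

// Hypothetical insert_missing: adds `value` only where no component exists yet.
// Returns the number of components actually inserted.
template<typename Component, typename It>
std::size_t insert_missing(pool<Component> &storage, It first, It last, const Component &value) {
    std::size_t inserted = 0;
    for(; first != last; ++first) {
        // try_emplace leaves an already existing component untouched
        if(storage.try_emplace(*first, value).second) {
            ++inserted;
        }
    }
    return inserted;
}

// Hypothetical remove_if_exists: erases the component where present.
// Returns the number of components actually removed.
template<typename Component, typename It>
std::size_t remove_if_exists(pool<Component> &storage, It first, It last) {
    std::size_t removed = 0;
    for(; first != last; ++first) {
        removed += storage.erase(*first);
    }
    return removed;
}

struct timer { int ticks; };

// Usage: entity 1 keeps its timer, entities 2 and 3 receive a new one.
inline void example() {
    pool<timer> timers{{1u, timer{99}}};
    const std::vector<entity> targets{1u, 2u, 3u};
    const auto added = insert_missing(timers, targets.begin(), targets.end(), timer{5});
    assert(added == 2u);
    assert(timers.at(1u).ticks == 99 && timers.at(2u).ticks == 5);
    const auto removed = remove_if_exists(timers, targets.begin(), targets.end());
    assert(removed == 3u && timers.empty());
}
```

Both functions probe the pool once per entity, which is the internal branch discussed in the comments that follow.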

Methods of this kind are proliferating all over.
I'm wondering if there is a way to define a sort of _for-each_ method that accepts a predicate, so as to make all this as generic as possible...

That would work too. I guess the motivation for this is that creation and deletion of components is still done mostly on a per-entity basis, with direct calls to the registry.

sunbubble on 14 May 2020

I think what @Innokentiy-Alaytsev suggests is interesting. My only concern is that a function like this one will always contain a branch to probe the component internally. On the other hand, the current implementation of emplace goes straight to the point, while emplace_or_replace has only a single branch.
Probably we can get around this with a bunch of functions like do_stuff_and_all_entities_has<T>, do_stuff_and_all_entities_has_not<T> and do_stuff_and_who_knows_what_entities_are_these<T>. :smile:

skypjack on 14 May 2020

Is it possible to create a view with only exclusion list?

Innokentiy-Alaytsev on 14 May 2020

No. Technically it would be possible but to do that you'd have to iterate all entities and filter those that are in the pools for the given components. That is:

auto view = registry.view<T, U>();
registry.each([&](auto entity) {
    if(!view.contains(entity)) {
        // ...
    }
});

Nothing more, nothing less.

skypjack on 14 May 2020
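
To make the cost of an exclusion-only query concrete, here is a small sketch on toy data structures (a vector of all entities and a set of component owners). It assumes nothing about EnTT's internals; without_component, alive and owners are hypothetical names used only here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <unordered_set>
#include <vector>

using entity = std::uint32_t;

// Toy model: `alive` lists every entity, `owners` holds those owning the component.
inline std::vector<entity> without_component(const std::vector<entity> &alive, const std::unordered_set<entity> &owners) {
    std::vector<entity> result;
    // Every entity has to be visited: there is no component pool to drive the iteration.
    std::copy_if(alive.begin(), alive.end(), std::back_inserter(result),
        [&owners](entity e) { return owners.count(e) == 0; });
    return result;
}

// Usage: entities 2 and 4 own the component, so 1 and 3 are returned.
inline void example() {
    const std::vector<entity> alive{1u, 2u, 3u, 4u};
    const std::unordered_set<entity> owners{2u, 4u};
    const auto result = without_component(alive, owners);
    assert((result == std::vector<entity>{1u, 3u}));
}
```

The work is proportional to the total number of entities rather than to the size of any pool, which is why such a query is essentially the full scan shown above.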

The code snippet above may be improved to exclude _some_ of the branches. For this, the function must be a member of a view or a group. In these cases, the pool for at least some of the components may already be there. The other thing that is possible with this addition is the application of the given function to components of each entity in the view or group instead of an arbitrary range. However, it doesn't differ much from a simple for loop.

Innokentiy-Alaytsev on 14 May 2020

I think what @Innokentiy-Alaytsev suggests is interesting. My only concern is that a function like this one will always contain a branch to probe the component internally. On the other hand, the current implementation of emplace goes straight to the point, while emplace_or_replace has only a single branch.

The branching is a concern and should be clearly stated in the function name, same as for get_or_emplace. I just wonder whether directly accessing the registry, as in the first view of this example

// create B
reg.view<const A>(entt::exclude<B>).each([&reg, &f](entt::entity e, const auto& a){ reg.assign<B>(e, f(a)); });

// update B
reg.view<const A, B>().each([&f](const auto& a, auto& b){ b = f(a); });

is more expensive than the branching. I don't know about the internals enough to comment on that.
Furthermore, if you want maximum performance in a hot path, you're still able (and willing) to complicate the code.

Probably we can get around this with a bunch of functions like do_stuff_and_all_entities_has<T>, do_stuff_and_all_entities_has_not<T> and do_stuff_and_who_knows_what_entities_are_these<T>. 😄

This should work, but AFAIK it's a pattern not yet used in EnTT. This makes me think of ways to expand the traversal semantics. I was wondering, are there any hard problems in exposing to the user of EnTT iterators/ranges that dereference to something like component reference tuples? So a component for-loop would be as fast as the each method? This would allow for a wealth of interplay with the STL algorithms.

sunbubble on 14 May 2020

are there any hard problems in exposing to the user of EnTT iterators/ranges that dereference to something like component reference tuples?

Unfortunately, the problem here is the C++ language itself and its standard library.
I'd like to return iterators from views whose value_type is std::tuple<entity, Comp1 &, ..., CompN &>.
However, the standard says explicitly that the reference type of a forward iterator must be an actual reference and not a proxy object. In theory, this isn't strictly required for the multi-pass guarantee though.
Let's suppose I decided to use a tuple-like object as value_type. This would turn the iterator into an input iterator. However, some algorithms like the parallel std::for_each require their arguments to be forward iterators. Therefore:

This would allow for a wealth of interplay with the STL algorithms.

No, this would restrict the possibilities to use them with the algorithms provided with the standard library.

That said, single component views and groups offer the raw<T> member function to get raw pointers to packed arrays of elements.
We can get around the limitations above by using this stuff carefully.

skypjack on 15 May 2020

👍1
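
The proxy problem described above can be sketched with a hypothetical zip iterator over two parallel arrays. Dereferencing it returns a std::tuple of references by value, so under the pre-C++20 iterator requirements it can only be tagged as an input iterator. zip_iterator and sum_and_scale are names invented for this illustration and are not EnTT code.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <tuple>
#include <vector>

// Hypothetical zip iterator over two packed arrays.
template<typename A, typename B>
class zip_iterator {
    A *first_;
    B *second_;

public:
    using iterator_category = std::input_iterator_tag; // proxy reference, so not "forward"
    using value_type = std::tuple<A &, B &>;
    using reference = value_type; // returned by value, not a true reference
    using pointer = void;
    using difference_type = std::ptrdiff_t;

    zip_iterator(A *first, B *second): first_{first}, second_{second} {}

    reference operator*() const { return reference{*first_, *second_}; }
    zip_iterator &operator++() { ++first_; ++second_; return *this; }
    zip_iterator operator++(int) { auto copy = *this; ++*this; return copy; }
    bool operator==(const zip_iterator &other) const { return first_ == other.first_; }
    bool operator!=(const zip_iterator &other) const { return !(*this == other); }
};

// Doubles every weight in place and returns the sum of id * weight.
inline double sum_and_scale(std::vector<int> &ids, std::vector<double> &weights) {
    assert(ids.size() == weights.size());
    double total = 0.;
    zip_iterator<int, double> it{ids.data(), weights.data()};
    const zip_iterator<int, double> last{ids.data() + ids.size(), weights.data() + weights.size()};
    for(; it != last; ++it) {
        auto [id, weight] = *it; // both names refer to elements of the vectors
        weight *= 2.;
        total += id * weight;
    }
    return total;
}
```

Even though the tuple is a temporary, the elements it exposes are real references, so writes go through to the underlying arrays.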

are there any hard problems in exposing to the user of EnTT iterators/ranges that dereference to something like component reference tuples?

Unfortunately, the problem here is the C++ language itself and its standard library.
I'd like to return iterators from views whose value_type is std::tuple<entity, Comp1 &, ..., CompN &>.
However, the standard says explicitly that the reference type of a forward iterator must be an actual reference and not a proxy object. In theory, this isn't strictly required for the multi-pass guarantee though.
Let's suppose I decided to use a tuple-like object as value_type. This would turn the iterator into an input iterator. However, some algorithms like the parallel std::for_each require their arguments to be forward iterators. Therefore:

This would allow for a wealth of interplay with the STL algorithms.

No, this would restrict the possibilities to use them with the algorithms provided with the standard library.

That said, single component views and groups offer the raw<T> member function to get raw pointers to packed arrays of elements.
We can get around the limitations above by using this stuff carefully.

Thanks for the detailed reply. This is very enlightening.

sunbubble on 15 May 2020

That said, single component views and groups offer the raw<T> member function to get raw pointers to packed arrays of elements.
We can get around the limitations above by using this stuff carefully.

The problem with using raw<T> is that there are no guarantees on the order of the components, even if the sort method was called. And most fancy algorithms that reduce computational complexity require sorted collections...

sunbubble on 15 May 2020

@sunbubble I'm pretty sure raw<T> is the reverse of the order imposed by sort. I ran into that a while ago.

Kerndog73 on 15 May 2020

@sunbubble I'm pretty sure raw<T> is the reverse of the order imposed by sort. I ran into that a while ago.

Maybe, but I find it hard to rely on that in production code given the documentation: https://github.com/skypjack/entt/blob/35c3acbade34a05053650b6dcb15deeeb318972a/src/entt/entity/group.hpp#L153

sunbubble on 15 May 2020

Yeah, the documentation is scary but it works exactly as @Kerndog73 reported.
Iterators go back to front to allow insertions/deletions during iterations. It goes without saying that it won't be possible anymore if you iterate things front to back.

skypjack on 15 May 2020

👍1
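
One way to see why back-to-front iteration tolerates deletions is a toy packed array with swap-and-pop removal. This is a simplified sketch under that assumption and not EnTT's actual storage; remove_at and drop_even are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy packed array with swap-and-pop removal.
inline void remove_at(std::vector<int> &packed, std::size_t pos) {
    assert(pos < packed.size());
    packed[pos] = packed.back(); // fill the hole with the last element
    packed.pop_back();
}

// Walks from the back to the front and removes even values on the fly.
// Returns how many elements were visited.
inline std::size_t drop_even(std::vector<int> &packed) {
    std::size_t visited = 0;
    for(std::size_t pos = packed.size(); pos-- > 0;) {
        ++visited;
        if(packed[pos] % 2 == 0) {
            // The element moved into `pos` comes from the back, so it was already visited.
            remove_at(packed, pos);
        }
    }
    return visited;
}

// Usage: every original element is visited exactly once, even values are dropped.
inline void example() {
    std::vector<int> packed{1, 2, 3, 4, 6};
    const auto visited = drop_even(packed);
    assert(visited == 5u);
    assert((packed == std::vector<int>{1, 3}));
}
```

Walking front to back with the same removal scheme would move a not-yet-visited element into the current slot and then step past it.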

Yeah, the documentation is scary but it works exactly as @Kerndog73 reported.
Iterators go back to front to allow insertions/deletions during iterations. It goes without saying that it won't be possible anymore if you iterate things front to back.

I don't mind the order being reversed so much. Would it be possible to say in the documentation that the order is the reverse of the one imposed by sort? Because right now this is a feature hidden as an implementation detail, and I would always have to dive into the code to figure out whether it still holds.

The question is if you want to give guarantees that this behavior stays for the future.

sunbubble on 15 May 2020

skypjack on 15 May 2020

👍2

😄 https://github.com/skypjack/entt/issues/489

sunbubble on 15 May 2020

are there any hard problems in exposing to the user of EnTT iterators/ranges that dereference to something like component reference tuples?

Unfortunately, the problem here is the C++ language itself and its standard library.
I'd like to return iterators from views whose value_type is std::tuple<entity, Comp1 &, ..., CompN &>.
However, the standard says explicitly that the reference type of a forward iterator must be an actual reference and not a proxy object. In theory, this isn't strictly required for the multi-pass guarantee though.
Let's suppose I decided to use a tuple-like object as value_type. This would turn the iterator into an input iterator. However, some algorithms like the parallel std::for_each require their arguments to be forward iterators. Therefore:

This would allow for a wealth of interplay with the STL algorithms.

No, this would restrict the possibilities to use them with the algorithms provided with the standard library.

That said, single component views and groups offer the raw<T> member function to get raw pointers to packed arrays of elements.
We can get around the limitations above by using this stuff carefully.

I was recently reading up on C++20 ranges and stumbled across this new std::common_reference https://en.cppreference.com/w/cpp/types/common_reference and a related SO answer by Eric Niebler https://stackoverflow.com/questions/59011331/what-is-the-purpose-of-c20-stdcommon-reference. Wouldn't this completely solve this problem?

sunbubble on 21 May 2020

Wouldn't this completely solve this problem?

Indeed yes, C++20 will be a game changer for EnTT. However, it's far from being usable at the moment, so we have to wait for a while...

skypjack on 21 May 2020

👍1

are there any hard problems in exposing to the user of EnTT iterators/ranges that dereference to something like component reference tuples? So a component for-loop would be as fast as the each method? This would allow for a wealth of interplay with the STL algorithms.

I've recently added an overload for ::each that returns _iterable objects_.
Unfortunately, as mentioned, iterators are input ones because of the library restrictions.

skypjack on 11 Jul 2020

are there any hard problems in exposing to the user of EnTT iterators/ranges that dereference to something like component reference tuples? So a component for-loop would be as fast as the each method? This would allow for a wealth of interplay with the STL algorithms.

I've recently added an overload for ::each that returns _iterable objects_.
Unfortunately, as mentioned, iterators are input ones because of the library restrictions.

This is super awesome! Thanks for the work.
Even with the restriction, there's a bunch of extremely useful algorithms which only require input iterators. For instance, in the STL all 5 set operation algorithms would work and std::transform_reduce would as well.

sunbubble on 11 Jul 2020
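
As a quick check of the claim above, the sketch below runs std::transform_reduce and std::set_intersection on std::istream_iterator, which is a genuine single-pass input iterator from the standard library. It stands in for the iterable returned by each and does not involve EnTT; sum_of_squares and common_values are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <numeric>
#include <sstream>
#include <string>
#include <vector>

// Sum of squares computed through single-pass input iterators.
inline int sum_of_squares(const std::string &text) {
    std::istringstream stream{text};
    return std::transform_reduce(
        std::istream_iterator<int>{stream}, std::istream_iterator<int>{},
        0, std::plus<>{}, [](int value) { return value * value; });
}

// Intersection of two whitespace-separated, already sorted sequences.
inline std::vector<int> common_values(const std::string &lhs, const std::string &rhs) {
    std::istringstream left{lhs};
    std::istringstream right{rhs};
    std::vector<int> result;
    // As for any std::set_* algorithm, both inputs must be sorted.
    std::set_intersection(
        std::istream_iterator<int>{left}, std::istream_iterator<int>{},
        std::istream_iterator<int>{right}, std::istream_iterator<int>{},
        std::back_inserter(result));
    return result;
}

inline void example() {
    assert(sum_of_squares("1 2 3") == 14);
    assert((common_values("1 2 3 5", "2 3 4 5") == std::vector<int>{2, 3, 5}));
}
```

Note that the set algorithms still require sorted input, which ties back to the ordering question raised earlier in the thread.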

Note that it dereferences to a std::tuple<Entity, Component &...> where the const-ness of the components is as requested.

skypjack on 11 Jul 2020

Note that it dereferences to a std::tuple

Is it then still able to leverage the optimization of entt::each of only iterating over used components?

sunbubble on 11 Jul 2020

Uhm... what do you mean exactly? It returns all and only the entities that have all the components, of course. I don't get what you mean with _used components_ though.

skypjack on 11 Jul 2020

I was under the impression that iterating over

r.view<A, B, C, D, E, F>().each([](entt::entity, auto &a, auto &&...){
    // do something with a
});

is faster than the iterating over

r.view<A, B, C, D, E, F>().each([](entt::entity, auto &a, auto &b, auto &c, auto &d, auto &e, auto &f){
    // do something with a, b, c, d, e, f
});

since the instances of types B, C, D, E, F don't need to be returned in the first case.

The question is whether the iterable tuple idiom allows this optimization as well, because often the latter set of components serves simply as a filter, and you're not interested in iterating over those instances.

Edit: updated the example code to be more explicit.

sunbubble on 11 Jul 2020

EnTT doesn't care much if you put a pack or not as the last argument of your lambda.
This is an optimization made by the compiler, if any. It _sees_ that you don't use the parameters and prunes the code that reaches them.
I've no idea if it can do the same with an iterable object to be honest. In theory, yes. In practice, it's implementation defined, so dunno.

Also note that in both cases empty types aren't returned. I usually use this kind of components as filters, so, in this case, there are no differences.

My suggestion is: if you care, try and measure. This is probably the best thing you can do.
However, it may be that a compiler manages to make this optimization while another one does not.
For example, clang does a great job at vectorizing views and groups while msvc has more difficulties because it isn't (wasn't?) that great at optimizing templated stuff.

skypjack on 11 Jul 2020

👍1

I was re-reading the whole discussion and noted this:

// create B
reg.view<const A>(entt::exclude<B>).each([&reg, &f](entt::entity e, const auto& a){ reg.assign<B>(e, f(a)); });

// update B
reg.view<const A, B>().each([&f](const auto& a, auto& b){ b = f(a); });

You can already do what you want with a get_or_emplace here:

for(auto &&[entity, a]: reg.view<const A>().each()) {
    reg.get_or_emplace<B>(entity) = f(a);
}

This is literally equivalent to a range based emplace_or_replace. The latter would look like:

auto view = reg.view<const A>();
reg.emplace_or_replace<B>(view.begin(), view.end(), ??);

Where I wouldn't even know what to put in place of ?? since the actual values depend on the instances of A.

Does it make sense?

skypjack on 20 Jul 2020

This is literally equivalent to a range based emplace_or_replace. The latter would look like:
auto view = reg.view<const A>();
reg.emplace_or_replace<B>(view.begin(), view.end(), ??);
Where I wouldn't even know what to put in place of ?? since the actual values depend on the instances of A.

Does it make sense?

It does, but it's still quite similar to iteration, in that you want the ?? to be a generator function and not a container. This is kind of what we have in view.each(??); the only difference is that you don't construct the instance there.

This is obviously most useful when the new value does not depend on the previous one. But it's simply an ergonomics question.

sunbubble on 22 Jul 2020
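
The generator idea from the last comment can be sketched on a toy pool, again assuming std::unordered_map as storage. The emplace_or_replace function below is a hypothetical free function taking a callable in place of ??; it is not the registry member of the same name, and entity and pool are stand-ins as well.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy stand-ins for illustration only; these are NOT EnTT types.
using entity = std::uint32_t;

template<typename Component>
using pool = std::unordered_map<entity, Component>;

// Hypothetical range-based emplace_or_replace driven by a generator function.
template<typename Component, typename It, typename Generator>
void emplace_or_replace(pool<Component> &storage, It first, It last, Generator generator) {
    for(; first != last; ++first) {
        // insert_or_assign creates the component or overwrites the existing one
        storage.insert_or_assign(*first, generator(*first));
    }
}

// Usage: entity 1 is overwritten, entity 2 is created.
inline void example() {
    pool<int> values{{1u, 10}};
    const std::vector<entity> targets{1u, 2u};
    emplace_or_replace(values, targets.begin(), targets.end(),
        [](entity e) { return static_cast<int>(e) * 100; });
    assert(values.at(1u) == 100 && values.at(2u) == 200);
}
```

Because the generator receives the entity, it could look up another component such as A and compute the new value from it, which is the case the ?? placeholder could not express with a plain container.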
