Entt: More on Save/Restore

Created on 18 Apr 2018 · 16 Comments · Source: skypjack/entt

This is an interesting comment about save/restore functionalities.
I'm opening a new issue because I plan to offer something more and want to discuss everything with anyone interested.

Here are parts of the comment linked above:

Is there a possibility to limit it to a subrange of entities? Maybe a query like the ones in a system? In my example, I might want to replicate the Position component, but only on the entities that have a NetSerialize component. An interesting option would be to have the snapshot run only on an array of entity IDs. I could use a view to find all the entities that have NetSerialize, and then run the snapshot only on those.

Alternatively, I could do a more "manual" serialization by having a Serialize system that reads NetSerialize and then adds every component it can (using has()) into a binary array or similar.

There is also quite a lot of wasted space because of how many times the entity ID is serialized, which is N + N*NComponents. Do you think it would be possible to flip the serialization around so it ends up like "Entity1-C1-C2-C3" "Entity2-C1-C4"? While this would definitely be more complicated, it would save a lot of space.

I will answer the questions and I will make some proposals tomorrow morning (forgive me, it is almost midnight here).
In the meantime, feel free to comment if you want.

@dbacchet (the original question was yours) @vblanco20-1 (thanks for contributing to the discussion)

discussion enhancement

skypjack

👍1

All 16 comments

@vblanco20-1

Is there a possibility to limit it to a subrange of entities? Maybe a query like the ones in a system? In my example, I might want to replicate the Position component, but only on the entities that have a NetSerialize component. An interesting option would be to have the snapshot run only on an array of entity IDs. I could use a view to find all the entities that have NetSerialize, and then run the snapshot only on those.

I've already put a note in the TODO file (branch experimental). I actually want to extend the save/restore part with more functionality.
Let's consider your case. You want to save all the entities that have Position and NetSerialize, but you don't want to store NetSerialize, right? It works like a tag that marks the entities to pick up, am I wrong?
In this case, a _view-like_ approach probably wouldn't work well. My idea was to provide something along these lines for cases in which the user wants to apply a filter:

registry.snapshot()
    .component<Position>(output, [&registry](auto entity, const Position &) {
        return registry.has<NetSerialize>(entity);
    }).component<AnotherComponent>(output);

What do you think about it?
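
A minimal sketch of how such a per-component filter could behave, written against a toy registry rather than EnTT itself. `ToyRegistry`, `RecordingArchive`, `snapshot_positions` and `save_networked_positions` are hypothetical names invented for illustration; only the idea of a predicate that receives the entity and the component comes from the proposal above.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Entity = std::uint32_t;

struct Position { int x; int y; };
struct NetSerialize {};

// Toy stand-in for a registry (an assumption of this sketch, not EnTT):
// one ordered map per component type.
struct ToyRegistry {
    std::map<Entity, Position> positions;
    std::map<Entity, NetSerialize> net_serialize;

    bool has_net_serialize(Entity entity) const {
        return net_serialize.count(entity) != 0;
    }
};

// Hypothetical archive: records every (entity, component) pair it receives.
struct RecordingArchive {
    std::vector<std::pair<Entity, Position>> saved;

    void operator()(Entity entity, const Position &position) {
        saved.emplace_back(entity, position);
    }
};

// Visits every Position and forwards only those accepted by the filter.
template<typename Archive, typename Filter>
void snapshot_positions(const ToyRegistry &registry, Archive &archive, Filter filter) {
    for(const auto &[entity, position] : registry.positions) {
        if(filter(entity, position)) {
            archive(entity, position);
        }
    }
}

// Saves Position only for entities tagged with NetSerialize.
// The tag itself is never handed to the archive.
inline RecordingArchive save_networked_positions(const ToyRegistry &registry) {
    RecordingArchive archive;
    snapshot_positions(registry, archive, [&registry](Entity entity, const Position &) {
        return registry.has_net_serialize(entity);
    });
    return archive;
}
```

The filter runs once per stored Position, so the cost is one extra lookup per component rather than a separate pass over all entities.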

Honestly, I don't like the idea of treating a list of components as if they were _and_'d together.
Today, you can serialize 100 components at once with something like:

registry.snapshot().component<Comp0, ..., Comp99>();

If you want to serialize all of them, and not only the entities that have all of them, it would instead become:

registry.snapshot()
    .component<Comp0>()
    // ...
    .component<Comp99>();

Pretty annoying, isn't it? A filter function seems more reasonable, even though I admit it's not perfect either.
Currently it's hard to store all the entities that have components A, B and C along with their components. With a filter function it would be possible, but still annoying:

registry.snapshot()
    .component<A>(output, [&registry](auto entity, const auto &) { return registry.has<B, C>(entity); })
    .component<B>(output, [&registry](auto entity, const auto &) { return registry.has<A, C>(entity); })
    .component<C>(output, [&registry](auto entity, const auto &) { return registry.has<A, B>(entity); });

Any suggestions to improve the whole thing?

There is also quite a lot of wasted space because of how many times the entity ID is serialized, which is N + N*NComponents. Do you think it would be possible to flip the serialization around so it ends up like "Entity1-C1-C2-C3" "Entity2-C1-C4"? While this would definitely be more complicated, it would save a lot of space.

First of all, the first N should not be counted if what you want is to serialize a single type of component (as in your case). Just avoid calling entities and invoke orphans on the loader instead.
That being said, the save/restore stuff doesn't actually save anything, right? It's in charge of providing the archive with all the entities and components requested. You can easily work on the archive to reduce the wasted space to a minimum.
You are proposing to move logic that I would otherwise put in an archive directly into the snapshot class. I suspect it would definitely break the interface of that class and honestly I see only one way to do that. Something like this:

registry.snapshot().component<A, B, C>(archive, [](auto entity, const auto &... components) { /* ... */ });

That is, the user must specify all the components of interest at once. Then I can put compile-time machinery somewhere to invoke the filter with the list of components owned by the given entity and then... well, I don't know. It sounds complicated, and it seems to me that it introduces a lot of limitations that would make it harder to add new features in the future.

It's probably worth defining a sort of CRTP-based archive class that rearranges the flow of data, reduces it the way you suggested, then returns everything to the derived class once the serialization is over.
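
One way such a CRTP-based archive might look, as a rough sketch only. `GroupingArchive`, `TextArchive`, `flush` and `write` are hypothetical names, and components are assumed to arrive already encoded as strings so the example stays independent of any component type. The base class buffers the (entity, component) pairs it receives, groups them per entity, and hands each group to the derived class when serialization is over.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Entity = std::uint32_t;

// CRTP base (hypothetical): collects (entity, component) pairs, groups them
// by entity, and passes each group to Derived::write() on flush().
template<typename Derived>
class GroupingArchive {
public:
    // Called once per (entity, component) pair during the snapshot.
    void operator()(Entity entity, const std::string &encoded_component) {
        grouped_[entity].push_back(encoded_component);
    }

    // Called once, after the snapshot has pushed all of its data.
    void flush() {
        for(const auto &[entity, components] : grouped_) {
            static_cast<Derived &>(*this).write(entity, components);
        }
        grouped_.clear();
    }

private:
    std::map<Entity, std::vector<std::string>> grouped_;
};

// Derived archive: emits one "E<id>-C..-C.." record per entity, so every
// identifier is stored exactly once.
class TextArchive : public GroupingArchive<TextArchive> {
public:
    void write(Entity entity, const std::vector<std::string> &components) {
        std::string record = "E" + std::to_string(entity);
        for(const auto &component : components) {
            record += "-" + component;
        }
        records.push_back(std::move(record));
    }

    std::vector<std::string> records;
};
```

The trade-off is that the base class has to buffer everything before the derived class sees it, so the space saved in the output is paid for with temporary memory while saving.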

What are your thoughts on this?

skypjack on 19 Apr 2018

What you have there does look good. A "view" based approach to filter entities would be flexible enough for pretty much any use case, and then users can rearrange the data themselves in the archive.

I did think about the rearranging, but it was very error prone due to the separation of entity and component. An idea that would work very well is to have the archive receive Entity, Component instead of just Component, as that way you could serialize that specific component and add it to a data structure to rearrange it. At the moment you can do that with an archive that knows about the order of the serialization (Entity->Component in a loop), but that could be error prone.

vblanco20-1 on 19 Apr 2018

@vblanco20-1 Ok, I would proceed this way.

First of all, I'm working on the serialization stuff so as to favor archive functions that accept an entity/component pair over those that accept only one argument. It should make it easier to write an archive that somehow _compresses_ data.
Then I'll introduce an overload of the component member function to use when components should be _and_'d instead of _or_'d.
Finally, I would add an optional _functor_ to use as a filter if required.

What do you think about it? Would you mind helping me test them once done?

skypjack on 22 Apr 2018

👍1

Of course @skypjack . Count me to test it.

vblanco20-1 on 23 Apr 2018

🎉1 👍1

@vblanco20-1 I started thinking about this. What about this kind of API?

registry.snapshot().component<A, B, C>(archive); // deprecated and kept here for backward compatibility

registry.snapshot().component<A, B, C>(entt::or_t, archive); // (1)
registry.snapshot().component<A, B, C>(entt::and_t, archive); // (2)

Where:

(1) Archive receives the entity and all the components at once as pointers (null if not assigned to the entity). Sort of:
```
void(entity, A *, B *, C *);
```
When used to read back data, the function type just has another *:
```
void(entity, A **, B **, C **);
```
It's up to the archive to decide how to store data, as long as it's able to provide the loader with the same information it received.

(2) Archive receives the entity and all the components at once as references. Sort of:
```
void(entity, A &, B &, C &);
```
When used to read back data, the function type accepts const references instead:
```
void(entity, const A &, const B &, const C &);
```
In both cases, I would add the _extra function_ already mentioned to filter entities.
With this model, entities' identifiers can still be duplicated:

registry.snapshot().component<A, B>(entt::or_t{}).component<C, D>(entt::or_t{});

In this case, if entity E has both A and D, its identifier is stored twice. However, the amount of data can be reduced a lot.
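
To make the two modes concrete, here is a hedged sketch of tag dispatch over a toy registry. Everything in it (`ToyRegistry`, `find_component`, `snapshot`, and the local `any_t`/`all_t` tags) is hypothetical and is not the EnTT API. It also uses const pointers and const references on the saving side, which is an assumption of the sketch. Mode (1) calls the archive for every entity that owns at least one of the components and passes null for the missing ones. Mode (2) calls it only for entities that own all of them and passes references.

```cpp
#include <cstdint>
#include <map>
#include <set>

using Entity = std::uint32_t;

struct A { int value; };
struct B { int value; };

struct any_t {}; // hypothetical tag for mode (1): at least one component
struct all_t {}; // hypothetical tag for mode (2): every component

// Toy stand-in for a registry: one ordered map per component type.
struct ToyRegistry {
    std::map<Entity, A> as;
    std::map<Entity, B> bs;
};

// Returns a pointer to the component of the entity, or null if it has none.
template<typename Component>
const Component *find_component(const std::map<Entity, Component> &pool, Entity entity) {
    const auto it = pool.find(entity);
    return it == pool.end() ? nullptr : &it->second;
}

// Mode (1): the archive is invoked as archive(entity, const A *, const B *).
template<typename Archive>
void snapshot(const ToyRegistry &registry, any_t, Archive &&archive) {
    // The temporary set is a shortcut of this sketch to visit each entity once.
    std::set<Entity> entities;
    for(const auto &item : registry.as) { entities.insert(item.first); }
    for(const auto &item : registry.bs) { entities.insert(item.first); }

    for(const Entity entity : entities) {
        archive(entity, find_component(registry.as, entity), find_component(registry.bs, entity));
    }
}

// Mode (2): the archive is invoked as archive(entity, const A &, const B &).
template<typename Archive>
void snapshot(const ToyRegistry &registry, all_t, Archive &&archive) {
    for(const auto &[entity, a] : registry.as) {
        if(const B *b = find_component(registry.bs, entity)) {
            archive(entity, a, *b);
        }
    }
}
```

Within a single call each entity identifier reaches the archive once, however many of the listed components the entity owns.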

This should cover more or less all the possible cases, even though I must admit I'm not crazy about the API.
What do you think? Any comments or suggestions?

skypjack on 3 May 2018

@skypjack

registry.snapshot().component<A, B, C>(entt::or_t, archive); // (1)
registry.snapshot().component<A, B, C>(entt::and_t, archive); // (2)

Do these APIs sound better to you?

registry.snapshot().component<A, B, C>(entt::any_t, archive); // (1)
registry.snapshot().component<A, B, C>(entt::all_t, archive); // (2)

Maybe this?

registry.snapshot().component<A, B, C>(entt::optional_t, archive); // (1)
registry.snapshot().component<A, B, C>(entt::required_t, archive); // (2)

Or this?

registry.snapshot().component<A, B, C>(entt::ptr_t, archive); // (1)
registry.snapshot().component<A, B, C>(entt::ref_t, archive); // (2)

ArnCarveris on 3 May 2018

I think that sounds solid. I would need to actually try to use it to have a proper opinion. The one thing I don't like is the pointer vs double pointer. Wouldn't that be very error prone in practice?

vblanco20-1 on 4 May 2018

@vblanco20-1

Well, the pointers in the first version are required because not all the entities will necessarily have all the components, since they are _or'd_. The double pointers in the second version are there for more or less the same reason but in the opposite direction: the archive _reads_ the available components and _passes_ them to the loader. We could replace them with a _callback_ that is invoked by the archive with a bunch of pointers. Something like this:

// from within the loader and for each entity (we know the size, that is their number after all)
archive.load([](auto entity, A *a, B *b, C *c) { /* ... */ });

Does that sound better? I need to think twice about this, but it could be a viable solution.
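
A small sketch of how the callback idea could work end to end, under stated assumptions: `InputArchive`, `Record`, `ToyRegistry` and `restore` are hypothetical names, and the archive is assumed to hold already decoded records. The archive calls the callback with a pointer per component, null when the entity did not have that component, and the loader assigns only what it receives.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <utility>
#include <vector>

using Entity = std::uint32_t;

struct A { int value; };
struct B { int value; };

// One decoded record: the entity plus whichever components were stored.
struct Record {
    Entity entity;
    std::optional<A> a;
    std::optional<B> b;
};

// Hypothetical input archive: owns the records and hands them out in order.
class InputArchive {
public:
    explicit InputArchive(std::vector<Record> records): records_{std::move(records)} {}

    std::size_t size() const { return records_.size(); }

    // Invokes the callback for the next record, passing null for every
    // component the record does not contain. The caller must not invoke
    // load() more than size() times.
    template<typename Callback>
    void load(Callback &&callback) {
        Record &record = records_[next_++];
        callback(record.entity,
                 record.a ? &*record.a : nullptr,
                 record.b ? &*record.b : nullptr);
    }

private:
    std::vector<Record> records_;
    std::size_t next_{0};
};

// Toy stand-in for a registry: one ordered map per component type.
struct ToyRegistry {
    std::map<Entity, A> as;
    std::map<Entity, B> bs;
};

// Loader side: the number of records is known, so call load() once for each.
inline void restore(InputArchive &archive, ToyRegistry &registry) {
    const std::size_t count = archive.size();
    for(std::size_t i = 0; i < count; ++i) {
        archive.load([&registry](Entity entity, A *a, B *b) {
            if(a) { registry.as[entity] = *a; }
            if(b) { registry.bs[entity] = *b; }
        });
    }
}
```

With this shape the loader never sees double pointers: a null argument simply means that there is nothing to assign.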

@ArnCarveris I like `any_t`/`all_t` actually, but it wasn't the names of the tags that didn't satisfy me. ;-)

skypjack on 4 May 2018

@skypjack So in that case, you open the file yourself, and then call Load() on every entity from the file? That does look like a far more interesting way of doing it. I guess that if you send nullptr to the load function then it doesn't add the component.

This way would work quite well for the networking I wanted to do. I store all the components I care about by checking everything that has a "NetSync" component, plus the different sync components. Then on the other side I just load it.

vblanco20-1 on 4 May 2018

@vblanco20-1 Ok, let's have a try along this way ;-)

skypjack on 4 May 2018

@vblanco20-1

I've been a little busy in the last few weeks, I'm sorry. I just tried to implement it and I found that there is a problem.
In particular, consider something like this: snapshot.component<A, B, C>(entt::or_t{}, archive);
It looks good. However, there is no way to know which entities have either A or B or C unless we iterate all the active entities and make three calls to has for each entity. It means N*M with N entities and M components.
And unfortunately it's even worse than this. We cannot know in advance how many entities we are going to save, so there are two ways to keep the snapshot class the _zero-allocations_ tool it is right now:

1. A first iteration to count the entities and a second iteration to store them (this would ruin the performance even more).
2. Entities and components interlaced with boolean values or whatever to tell the reader that _there is something else_. Kind of - [E1, A, B, C, true, E23, A, B, C, false]. A bit tricky and a good way to waste the space we just saved.
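
The two options can be compared with a small sketch. It is illustrative only: `Item`, `Stream`, `write_counted`, `write_flagged` and `read_flagged` are hypothetical, each record carries a single payload word, and the flagged variant puts the marker before each record (with a final 0 as terminator) instead of after it, which is a variation on the layout shown above that also covers the empty case. The output vector exists only to make the layouts visible. A real archive would stream the words out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Entity = std::uint32_t;
using Stream = std::vector<std::uint32_t>;

// An entity, whether the filter selects it, and one word of component data.
struct Item {
    Entity entity;
    bool selected;
    std::uint32_t payload;
};

// Option 1: two passes. Count the selected entities, then write
// [count, E, payload, E, payload, ...].
inline Stream write_counted(const std::vector<Item> &items) {
    std::uint32_t count = 0;
    for(const Item &item : items) {
        if(item.selected) { ++count; }
    }

    Stream out;
    out.push_back(count);
    for(const Item &item : items) {
        if(item.selected) {
            out.push_back(item.entity);
            out.push_back(item.payload);
        }
    }
    return out;
}

// Option 2: one pass. Each record is preceded by 1 and the stream ends
// with 0: [1, E, payload, 1, E, payload, ..., 0].
inline Stream write_flagged(const std::vector<Item> &items) {
    Stream out;
    for(const Item &item : items) {
        if(item.selected) {
            out.push_back(1u);
            out.push_back(item.entity);
            out.push_back(item.payload);
        }
    }
    out.push_back(0u);
    return out;
}

// Reads the entity identifiers back from a flagged stream.
inline std::vector<Entity> read_flagged(const Stream &in) {
    std::vector<Entity> entities;
    std::size_t pos = 0;
    // A record needs three words and is always followed by at least one more.
    while(pos + 2 < in.size() && in[pos] == 1u) {
        entities.push_back(in[pos + 1]);
        pos += 3;
    }
    return entities;
}
```

For k selected entities the counted layout takes 2k + 1 words and the flagged one 3k + 1, so in this sketch the single pass costs one extra word per entity.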

I'm trying to figure out if I can do it in a good way. So far, it looks like I need to radically change the approach to the problem.

skypjack on 17 May 2018

@skypjack For the allocation part, what about letting the function take a std::vector or similar data structure as input? This way the developer can choose how to allocate that data. Save/restore is not something that would be done in a hot loop anyway.

For the "has" stuff, why not still do all those "has" checks, but starting from a view? This way you already use the view to filter your entities, and it would lower the cost of the "has" checking by a lot. For example, you would use a view that iterates through the entities with the Serialize component, and THEN do the snapshot for the serialization over that set.

That would be similar to what I was planning for my network system. The system iterates over all entities with a Networked component, and then does a bunch of entity.has checks for the component types which do have a serialize function. Then I serialize all of that into a small binary JSON file, and send that bitstream over the network.
With a pre-set data structure that I can pass to the functions, I can just allocate a chunk of X megabytes of memory to act as a linear allocator and throw it away every time the system gets executed.
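
A rough sketch of that suggestion, assuming a toy registry instead of EnTT: iterate only the entities that carry the tag, probe the other components for each of them, and append the result to a buffer owned by the caller. `ToyRegistry`, `Entry`, `find_component` and `collect_networked` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <map>
#include <vector>

using Entity = std::uint32_t;

struct Networked {};
struct Position { int x; int y; };
struct Velocity { int dx; int dy; };

// Toy stand-in for a registry: one ordered map per component type.
struct ToyRegistry {
    std::map<Entity, Networked> networked;
    std::map<Entity, Position> positions;
    std::map<Entity, Velocity> velocities;
};

// What gets collected for one entity; null means "component not assigned".
struct Entry {
    Entity entity;
    const Position *position;
    const Velocity *velocity;
};

// Returns a pointer to the component of the entity, or null if it has none.
template<typename Component>
const Component *find_component(const std::map<Entity, Component> &pool, Entity entity) {
    const auto it = pool.find(entity);
    return it == pool.end() ? nullptr : &it->second;
}

// Walks only the tagged entities (the "view") and appends one entry per
// entity to the caller-owned buffer, so the caller decides how memory is
// allocated and reused.
inline void collect_networked(const ToyRegistry &registry, std::vector<Entry> &out) {
    for(const auto &item : registry.networked) {
        const Entity entity = item.first;
        out.push_back(Entry{entity,
                            find_component(registry.positions, entity),
                            find_component(registry.velocities, entity)});
    }
}
```

The number of lookups then scales with the number of tagged entities rather than with all the active entities, and a buffer that is reserved once can be cleared and reused on every run.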

vblanco20-1 on 18 May 2018

@vblanco20-1 I still have to update the README file, but I've already updated the tests.
If you take a look at branch issue_67, you can see that the snapshot class and the loaders now take two parameters, that is an entity/component pair (and entity/tag when it comes to working with tags).
This should already be in line with what we discussed, but I'd appreciate your feedback and, of course, I'd like to know what you suggest as the next step. ;-)

Thank you very much.

skypjack on 19 May 2018

@vblanco20-1 I also added a sort of _filter by entity_ to the component member function of the Snapshot class, as discussed.
It's already documented and there is a simple test for it. It's straightforward to use anyway.
Let me know what you think about the latest changes. Thank you.

skypjack on 19 May 2018

@vblanco20-1 Squashed everything on experimental so as to create a single commit. I plan to merge it on master in a couple of days and then to close the ticket. As far as I can see, I added pretty much everything we discussed. Let me know your feedback when you can. Thank you.

skypjack on 22 May 2018

Upstream.

skypjack on 23 May 2018
