Entt: Save/Restore

Created on 4 Jan 2018 · 54 Comments · Source: skypjack/entt

Hi @skypjack ,
I've been disconnected for a few weeks, and when I got back today and had a look at the code, well... wow! Already version 2.4.1 and a lot of new features (most of which I had already custom-implemented out of need, like the Actor class)!

I have a side question and a possible feature request: how difficult would it be to implement a save/restore functionality in the ECS?
In the simulation system we are implementing, one of the main features we need to implement is the possibility to take "snapshots" of the entire system state, save them and restore at a later time.
More or less a classical replay system, but with multiple restore points.

Is it possible to implement that functionality in the Registry, with the current implementation?
If you believe it is a feature worth incorporating, please send me a PM and we can maybe have a quick call to discuss ways of helping/accelerating the development.

discussion enhancement

All 54 comments

I just realized I don't have the email in my public GitHub profile... Just sent you an email with my contact info, in case you're interested. Thx!

Well, this isn't the first time someone has asked me for this feature (@morbo84? :-)).
There is a chance we can implement it, but there are also some problems to face that I'm not sure you have spotted yet.

Let's discuss it a bit. Do you already have in mind an API?

In order to give you a grasp of the main issue, consider the following snippet:

if(time() < noon) {
    auto entity = registry.create();
    registry.assign<Foo>(entity);
    registry.assign<Bar>(entity);
} else {
    auto entity = registry.create();
    registry.assign<Bar>(entity);
    registry.assign<Foo>(entity);
}

registry.serialize();

Since I turned EnTT into a fully runtime tool, type identifiers are assigned at runtime as well.
In the example above, before noon Foo gets identifier 1 and Bar gets identifier 2. After noon, their identifiers are swapped.

What does it mean? You cannot just call serialize on the registry, because type identifiers aren't guaranteed to be the same on the next run.

Therefore, the API should look something like this:

registry.serialize<Foo, Bar>();

This way, I can compute a compile-time pack of identifiers that is guaranteed to be the same as long as you use the same list and the same order the next run.
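To make the difference concrete, here is a minimal sketch that does not use EnTT at all. `runtime_family` and `index_of` are hypothetical names introduced only for illustration: the first hands out identifiers in order of first use, the second derives an identifier from the position of a type in an explicit list.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical runtime family: a type gets the next free id on first use,
// so the value depends on the order in which types are first touched.
struct runtime_family {
    static std::size_t next() {
        static std::size_t counter = 0;
        return counter++;
    }

    template<typename T>
    static std::size_t id() {
        static const std::size_t value = next();
        return value;
    }
};

// Compile-time alternative: the id is the position of T in an explicit list.
// Asking for a type that is not in the list fails to compile.
template<typename T, typename... List>
struct index_of;

template<typename T, typename... Rest>
struct index_of<T, T, Rest...>: std::integral_constant<std::size_t, 0> {};

template<typename T, typename First, typename... Rest>
struct index_of<T, First, Rest...>
    : std::integral_constant<std::size_t, 1 + index_of<T, Rest...>::value> {};

struct Foo {};
struct Bar {};

// Same list, same order: same identifiers on every run.
static_assert(index_of<Foo, Foo, Bar>::value == 0, "Foo is first in the list");
static_assert(index_of<Bar, Foo, Bar>::value == 1, "Bar is second in the list");
```

With the runtime family, whichever of `id<Foo>()` and `id<Bar>()` is called first receives 0, which is exactly the before noon/after noon problem. The list-based index only changes if the list itself changes.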

It also has an advantage: you can filter out types you don't want to serialize.
The drawback is that we are back to those annoying lists of types that are so ugly to see.

Anyway, just to know your thoughts, could it be a good approach?
I need your help because you are facing a real case and you know better than me what can be a good way to do that.

Thank you very much!! :-)

Would it make sense to have an API that optionally allows registering types beforehand, so that the order is defined and does not change?
I was thinking about something like:

registry.declare<Foo>();
registry.declare<Bar>();
(...)
auto entity = registry.create();
registry.assign<Bar>(entity);
registry.assign<Foo>(entity);

And maybe only save/restore the entity types that have been explicitly declared. Could it make sense to you?

For the save/restore API itself, in the past I implemented either something like this:

bool save(int64_t key);
bool restore(int64_t key);

where the memory is managed by the framework itself, or like this:

int32_t save(uint8_t *data, size_t maxlen);
bool restore(const uint8_t *data, size_t len);

where the memory is managed by the caller.
Since EnTT is usually a part of a bigger system, probably the second makes more sense, but it's just my personal opinion here.
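As a rough sketch of the caller-managed variant, assuming a single trivially copyable state object: `Position` is a hypothetical type, and the extra `state` parameter is added here only so that the example is self-contained (the signatures above are member-style and carry no such argument).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical POD component used only for illustration.
struct Position {
    float x;
    float y;
};

static_assert(std::is_trivially_copyable<Position>::value,
              "memcpy requires a trivially copyable type");

// Copies the state into a caller-owned buffer.
// Returns the number of bytes written, or -1 if the buffer is too small.
std::int32_t save(const Position &state, std::uint8_t *data, std::size_t maxlen) {
    if(data == nullptr || maxlen < sizeof(Position)) {
        return -1;
    }

    std::memcpy(data, &state, sizeof(Position));
    return static_cast<std::int32_t>(sizeof(Position));
}

// Reads the state back; fails if the buffer does not hold a full Position.
bool restore(Position &state, const std::uint8_t *data, std::size_t len) {
    if(data == nullptr || len < sizeof(Position)) {
        return false;
    }

    std::memcpy(&state, data, sizeof(Position));
    return true;
}
```

The caller decides where the bytes live (stack, heap, memory-mapped file), and the framework only reports how much it wrote or whether the input was usable.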

What about the format? Plain bytes, json, XML, custom, whatever?
It would probably be good to provide different serializers and let the user pick the preferred one.

Also, POD types are pretty easy to serialize, complex ones usually are not.
What about those?

I can probably try to define something in the next few days.

Most of the components we use are POD, but some are more complex and dynamic. If it is possible to specialize the serialization/deserialization functions, it should work fine.

What about the rest of the registry state (number of entities, valid IDs, etc)?

Uhm... Honestly I thought creating a new set of entities from scratch when deserializing would have been fine.
I mean, does it make sense to keep the same entities with the same versions?
Usually, all my systems work by querying the registry and constructing their internal state from what they find there.
Do you think there is a use case where everything must be recreated exactly as it was before the snapshot was taken, including the versions of the entities?

One thing that is probably needed is to have consistent IDs when the simulation is restored.
For example, if the ID of an entity had the internal value 0x1234 when I saved the state, it should have the same value after it has been restored.
In many high-level objects I store the entity that contains the data, and if the ID value changes, that link is broken. The Actor class does the same in EnTT 2.4.1, by the way.

So, the entity number along with its version, right?
I must admit that it makes sense if you create a snapshot of the registry plus its surrounding environment. This is a kind of drawback of opaque identifiers.
Good catch.

I also think having consistent IDs is important, I don't know how you would build a serialisable attachment system for example otherwise.

@mario-deluna

I don't know how you would build a serialisable attachment system for example otherwise.

Can you elaborate a bit more? I'm interested in the argument. Thank you. :-)

Let's say I have a transformation component for joints/bones, and I want to tell that component what it is attached to. From my point of view, the simplest solution is to store the parent's entity id in that component. Then any system that needs the absolute position can simply check whether the parent's entity is still valid and hop up the tree.

If the ids are not persistent, this would not work as far as I understand.

@dbacchet @mario-deluna

Options I see so far are below:

  • Save separately the list of entities in use, the list of available entities and then the list of components, each one along with the entity to which it is bound. Drawback: the entity identifier is repeated a lot of times (as an example, if an entity has 30 components, you have the id 31 times in the file) and it wastes space. Moreover saving and restoring would not be cache-friendly operations. Finally it's not that easy to fit custom serializers (json, xml, whatever) with this model.
    Something like this in an array after serialization:

    [2, e0, e2, 1, e1, 0, e2, bytes, 1, e2, bytes, ...]

    Where the first number (2) indicates how many entities are still in use and is followed by those entities. The next number (1) indicates how many entities exist but are not in use, again followed by those entities. Then we have the list of triplets: component numeric identifier, entity to which it's bound, bytes.

  • Save an entity at a time with all its components and a flag that indicates if it's in use or not. It requires less space than the previous solution, but it's not cache-friendly while saving and restoring. Custom serializers (json, xml, whatever) would fit with this model: a hypothetical save function would receive all the information at once for each entity and that's all.
    Something like this in an array after serialization:

    [e0, 0, e1, 1, 0, bytes, 3, bytes, ...]

    The first entity isn't in use, the second entity is in use, its first component has identifier 0 and the given bytes, its second component has identifier 3 and the given bytes, and so on.

  • A more cache friendly solution is pushing the serialization problem down to the underlying data structures, that is the sparse sets. Then we can store at once their internal arrays. It wastes much more space than the previous solution and it doesn't fit well with custom serializers (json, xml, whatever). Moreover, entities in use and available ones should be stored separately somehow.

If you have any other proposal, feel free to post it and we can discuss it together. Otherwise, bullet 2 looks the most promising but for the fact that saving and restoring are far from being cache-friendly operations.
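The second option can be sketched as follows. This is a minimal illustration, not EnTT code: `record`, `pack` and `unpack` are hypothetical names, every payload is reduced to a single 32-bit value, and an explicit component count is added to each record (an assumption that is not part of the layout described above) so that a reader knows where a record ends.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using entity_type = std::uint32_t;

// One record per entity: [entity, in_use, count, (component id, payload)...]
struct record {
    entity_type entity;
    bool in_use;
    std::vector<std::pair<std::uint32_t, std::uint32_t>> components;
};

std::vector<std::uint32_t> pack(const std::vector<record> &records) {
    std::vector<std::uint32_t> out;

    for(const auto &rec: records) {
        // An entity that is not in use cannot own components.
        assert(rec.in_use || rec.components.empty());
        out.push_back(rec.entity);
        out.push_back(rec.in_use ? 1u : 0u);
        out.push_back(static_cast<std::uint32_t>(rec.components.size()));

        for(const auto &comp: rec.components) {
            out.push_back(comp.first);
            out.push_back(comp.second);
        }
    }

    return out;
}

std::vector<record> unpack(const std::vector<std::uint32_t> &data) {
    std::vector<record> records;
    std::size_t pos = 0;

    while(pos + 3 <= data.size()) {
        record rec{data[pos], data[pos + 1] != 0u, {}};
        const std::uint32_t count = data[pos + 2];
        pos += 3;

        for(std::uint32_t i = 0; i < count && pos + 2 <= data.size(); ++i, pos += 2) {
            rec.components.emplace_back(data[pos], data[pos + 1]);
        }

        records.push_back(std::move(rec));
    }

    return records;
}
```

Each entity identifier is written exactly once, and a custom serializer sees a whole entity at a time. The price is that filling one record touches every component pool, which is why this layout is not cache-friendly.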

What do you think? Comments are appreciated.

Any one of those solutions could work for me; the internal format is not exposed directly, so go with whatever is more convenient on your side.
Being cache-friendly is probably not that important (imho), because the operation is not supposed to be done every frame, at least in my use case.

What's the signature for the functions that the user has to implement to serialize/deserialize custom data? already something in mind?

I'd like to design a solution that allows full customization, so that users can write their own serializer.

The second solution seems the most appropriate to do this. In that case, the interface would be something like this probably:

template<typename... Comp>
void serialize(Stream &stream, entity_type entity, bool in_use, const Comp &... component);

Where stream offers the (let me say) C++ way to store bytes, that is:

Stream & write(const char *data, std::size_t size);

On the other side, if you don't want to fully specialize the serializer but you have custom types for which you want to provide your own specialization, the signature will be something along this line:

void serialize(Stream &stream, const Type & component);

Where the caller expects you to serialize the component as a bunch of bytes.
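A minimal sketch of such a per-type overload, under stated assumptions: `Stream` here is a hypothetical in-memory stream with the `write` interface described above, `Name` is a made-up non-POD component, and the length-prefixed encoding is just one possible choice (it uses the host byte order, so it is not portable across architectures as written).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stream exposing the write(pointer, size) interface.
struct Stream {
    Stream & write(const char *data, std::size_t size) {
        bytes.insert(bytes.end(), data, data + size);
        return *this;
    }

    std::vector<char> bytes;
};

// Hypothetical non-POD component: copying its raw memory would only store
// the string's internal pointer, so it needs its own overload.
struct Name {
    std::string value;
};

// Length-prefixed encoding: 4 bytes of size followed by the characters.
void serialize(Stream &stream, const Name &component) {
    const auto length = static_cast<std::uint32_t>(component.value.size());
    stream.write(reinterpret_cast<const char *>(&length), sizeof(length));
    stream.write(component.value.data(), component.value.size());
}
```

POD components could go through a generic overload that writes `sizeof(Type)` bytes directly, leaving overloads like this one for the types that own memory.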

Could it work for you?


One could argue that entities that are not in use have no components and thus we can differentiate with two functions in the example above. However, these signatures won't change in future if I decide to implement crazy things like lazy destroy or something like this. The other way around would break everything instead.

Seems reasonable to me.
I still don't get why the first serialize is not just the internal function that then calls serialize(stream, component), but it is certainly not a problem to implement that signature.

I'm actively working on this.
Just a question (probably the last one).

I plan to keep intact the versions of the entities that are serialized.
However, consider this case:

  • I have a registry on which I registered three types of components T, U and V
  • Entity E4 has been destroyed and it's available for use (that is it's candidate to be returned from the next call to create)
  • Entity E3 has components T and U
  • Entity E2 has component V
  • Entity E1 has been destroyed and it's available for use (that is it's candidate to be returned from the next call to create)

I decide to serialize only components T and U.

In this case E3 is correctly serialized.
On the other side:

  • E2 would be serialized as in use but with no components at all
  • E1 and E4 would be serialized as not in use

I'm thinking of not serializing E2, E1 and E4.
What happens after a restore in this case?

  • E3 is fully restored with the correct version and all its components.
  • E1 and E2 are restored as available, with versions different from the ones they had before the save.
  • E4 isn't restored at all. It's created from scratch if required.

Can it work, or should destroyed entities also keep their versions intact?

Question: what's the use case of serializing only a subset of the available components?

To answer your question, if I correctly understood how it works, I believe you should also save the internal version of the destroyed entities.

Am I correct in saying that you can guarantee complete determinism only if you do that?
An example to clarify the scenario:

  1. I run the game/sim for a while, and decide to save the state with flag S at time t0
  2. keep going and create/destroy entities
  3. restore S
  4. create N new entities and call function f() that iterates and calculates statistics
  5. restore S
  6. (b) create N new entities and call function f() again

I would expect to have always the same result out of f(), and that function _could_ depend on the order the entities are iterated.

If you don't save the versions of the destroyed entities, the IDs of the N entities created at 4. and 6. could be different, or in a different order, because they depend on the accumulated history (the version keeps growing every time an entity ID is recycled, and that information is lost if it is not saved).
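The dependency on history can be shown with a toy model. This is not EnTT's implementation: `pool` is a hypothetical allocator that packs a 16-bit slot index and a 16-bit version into one identifier and recycles destroyed slots from a free list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical identifier layout: index in the low 16 bits, version above.
using entity_type = std::uint32_t;

struct pool {
    std::vector<std::uint16_t> versions;  // current version of each slot
    std::vector<std::uint16_t> free_list; // destroyed slots, reused last-in first-out

    entity_type create() {
        std::uint16_t index;

        if(free_list.empty()) {
            index = static_cast<std::uint16_t>(versions.size());
            versions.push_back(0);
        } else {
            index = free_list.back();
            free_list.pop_back();
        }

        return (entity_type{versions[index]} << 16) | index;
    }

    void destroy(entity_type entity) {
        const auto index = static_cast<std::uint16_t>(entity & 0xFFFFu);
        assert(index < versions.size());
        ++versions[index]; // stale copies of the old identifier no longer match
        free_list.push_back(index);
    }
};
```

In this model a snapshot is simply a copy of the whole `pool`. Restoring that copy and calling `create()` always yields the same identifier, whereas a restore that drops the versions of destroyed slots hands out an identifier with version 0 instead, so anything that stores or orders identifiers can diverge between runs.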

Does it make sense?

question: what's the use case of serializing only a subset of the available components?

A lot of games allow saving only when the player reaches some _safe zones_ (as an example, I remember Resident Evil on PlayStation used this technique).
In this case, there is no benefit in serializing meshes or whatever. All you probably want is to store the player state, the inventory, the safe zone that was reached and a few other things.

Does it make sense?

Yes and no.
Order doesn't depend on destroyed entities if you take a snapshot and restore it without them.
In other terms, the underlying data structures that contain components maintain the order they had before the snapshot.
All you get is that destroyed entities probably have different versions from the ones they had before the snapshot was taken.

That being said, the order of entities is highly affected by most operations.
If you have a function that requires them to respect an order, I would probably keep track of changes and force the order before iterating.

So far so good. What you are asking has a lot of drawbacks indeed.
As an example:

  • What would you expect if the registry contains an entity and your snapshot also contains it? A merge of the components, or a complete replacement of the current set of components with the restored one? Pretty tricky: if you restore something on a different run of the simulation, the two entities could describe completely different objects, even though they have the same id, right?

  • What would you expect if both the registry and the snapshot contain an entity but the versions differ? Who (let me say) _wins_? It could break things if entities refer to each other using fields in components.

  • What would you expect if the snapshot contains an entity that is marked as destroyed and the current registry contains the same entity in use because it was recycled?

  • And so on...

The best thing is to restore a snapshot in an empty registry. It's trivial and has no hidden corner cases.
If I got you right, you want to restore a snapshot in an already running simulation instead, right?
Well, in this case there are several points where compromises must be found. :-)

  3. restore S
  4. create N new entities and call function f() that iterates and calc statistics
  5. restore S
  6. (b) create N new entities and call function f() again

Wait a moment. You mean that at bullet 5 the restore function drops the N entities you created at point 4?
In this case, it's enough to clear the registry before restoring things. Isn't it?
Because destroyed entities have no versions stored anywhere, they start from version 0 every time you restore your snapshot. It means that f returns the same result in your example.

I think there are 2 things that need to be implemented: Serialization/Deserialization at the component level and Save/Restore at the simulation/game level.
They have different use cases and require different levels of user awareness.

To your example of saving a part of the game at specific location: that for me belongs to the first case, where you only need to save _a subset of the data_ (=components). In this case I believe the level of control should be at the specific component level for every entity, not depending on the type (i.e. I probably only care of saving a specific set of components, not all components of type A,B,C etc).
With the serialization API you proposed that should be possible on the user side without too many problems I believe.

The purpose of Save/Restore is to get a snapshot of _the entire_ game state (component data and internal states). When you restore 'S', the state of the entire system should be binary identical to the moment you saved. At this level there are no conflicts on what to restore and what not. The snapshot _is_ complete.
My goal is to have a completely deterministic system, so if I restore 100 times the evolution of the system will always be the same, independently of what happened _after_ the restore (create/delete entities, etc.)

I should have been more clear on the initial request, hope that this helps :)

Yep, definitely. Thank you very much. Now it's clear what you need. ;-)
Stay tuned. I'll push a dedicated branch soon with the first version of the save/restore functionality.
I'll write also a bunch of tests. However, if you can test it and give me a feedback, it would be appreciated. ;-)

I will and I will try to break it writing even more tests :) I believe it's the minimum I can do, given all the efforts you're putting into this

Let's discuss the snapshot branch.
I'm developing it because I need it in a project of mine, and it must still be considered a work in progress, but it gives a grasp of the final API.
It's mostly _Cereal C++_ oriented, because I liked its API. What do you think?

@dbacchet I'm reviewing a bit the _progressive_ part, but the _snapshot_ part is _ready to be judged_.
Whenever you want (and you find the time to do that), feel free to leave here your comments. ;-)

So far so good. I have yet to document it, but I'm tempted to say that _it works as intended_. :-)
I'll try to write a few notes in the README as soon as possible. In the meantime, tests and comments are welcome as usual.
Thank you all.

that's fantastic news. I'll spend some time experimenting during the weekend!

Updated the readme file with a dedicated section (still to be reviewed, feedback is welcome).
I still have to write documentation for a couple of classes, but I'm almost ready to merge with master and release it.

Any comment so far?

The snapshot branch contains also an example of use with Cereal C++ as an archive.
I hope it works fine for you and fits your needs, guys. Enjoy. :-)

@dbacchet
Because of #40, this change cannot be considered _ready to merge_ anymore.
However it will take a few days to be ready again, don't worry. ;-)

Hi @skypjack,
I started adding some more unit tests for the save/restore functionality and I ran into a (possible) bug/problem.

I created a gtest with multiple snapshots and restore points, where I also evolve the state and create/destroy entities in between.
It mostly works as expected, but I got an assertion when trying to restore the second snapshot on the same destination registry:

[~/dev/tmp/entt_dbacchet/build] $ test/snapshot
Running main() from gtest_main.cc
[==========] Running 3 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 3 tests from Snapshot
[ RUN      ] Snapshot.Full
[       OK ] Snapshot.Full (0 ms)
[ RUN      ] Snapshot.Continuous
[       OK ] Snapshot.Continuous (0 ms)
[ RUN      ] Snapshot.SaveRestore
Assertion failed: (!registry.capacity()), function SnapshotLoader, file /Users/davide.bacchet/dev/tmp/entt_dbacchet/src/entt/entity/snapshot.hpp, line 182.
[1]    69429 abort      test/snapshot

It's very likely that I'm missing some required call before restoring, but it would be great if you could have a look and advise.

The code is in my EnTT fork: snapshot_saverestore.cpp.
I wanted to make a PR at the end to extend the test coverage, but I can make it now if that's easier for you.
Thanks!

I knew sooner or later I had to blame myself for the wording used to name that function: reset!!

I had only a glance at your code from the mobile app, but I'd say that the _problem_ is within your entt_restore function.
Unfortunately (or fortunately, depending on how you look at it) registry.reset() doesn't work as you expect. It doesn't trash the content of a registry. Instead it destroys all the entities it contains and updates their versions. This way you can reuse the registry even if you are storing identifiers all around, because they'll be invalidated.
On the other side, restore expects the registry to be empty. No entities, neither in use nor destroyed.

Probably everything works fine if you replace the call to reset with registry = {};. :-)
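The distinction can be illustrated with a toy model that mirrors the behaviour described in this comment. It is not the EnTT registry: `toy_registry` and `can_restore_into` are hypothetical, and only the idea of "destroyed entities still occupy capacity" is taken from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct toy_registry {
    struct slot {
        std::uint32_t version;
        bool in_use;
    };

    std::size_t create() {
        slots.push_back(slot{0u, true});
        return slots.size() - 1;
    }

    // Destroys every entity but keeps the slots, bumping their versions
    // so that identifiers stored elsewhere become invalid.
    void reset() {
        for(auto &entry: slots) {
            if(entry.in_use) {
                ++entry.version;
                entry.in_use = false;
            }
        }
    }

    // Number of identifiers ever handed out, destroyed ones included.
    std::size_t capacity() const {
        return slots.size();
    }

    std::vector<slot> slots;
};

// Same precondition as the assertion in the log above: nothing may exist yet.
bool can_restore_into(const toy_registry &registry) {
    return registry.capacity() == 0;
}
```

After `reset()` the capacity is still non-zero, so the precondition fails; assigning a freshly constructed object (`registry = {};`) is what actually brings it back to an empty state.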


As a side note, I found a bug that I've already fixed (still to push actually, it's part of the changes for issue #40).
If you try to restore destroyed entities before invoking entities, it can crash under certain circumstances.
If you experience such an error, don't worry.

That works, thanks.
I will add a few more tests and create the PR soon :)

Question/request: since it is possible to detect that the user is trying to do a full restore, can the registry be "reset" (i.e. set in a state that allows the restore) automatically?

The sole drawback I see is that one can happily trash all the data in a registry without even being notified. :-)
The assert is a way to say - _hey, you are going to mess things up_.
However, you know: you asked for it, you rule on it. Do you prefer a silent reset during a restore?

Yes please. Either that or an explicit return flag that says the restore failed. If for some reason asserts are disabled (like in a release build), I feel it's easy not to detect the error. Not a big deal though, because at the first run with assertions enabled it's easy to spot the problem.

Another question: I added a test with a custom archive, and I ended up creating 2 simple classes like these:

template <typename EntityT>
class CustomOutputArchive {
public:
    // serialization functions
    void operator()(EntityT ent) {
        // std::cout << "ent: " << ent << std::endl;
        entity_vec.push_back(ent);
    }
    void operator()(const State &c) {
        state_vec.push_back(c);
    }
    void operator()(const StateDot &c) {
        statedot_vec.push_back(c);
    }

    std::vector<EntityT> entity_vec;
    std::vector<State> state_vec;
    std::vector<StateDot> statedot_vec;
};

template <typename EntityT>
class CustomInputArchive {
public:
    CustomInputArchive(const std::vector<EntityT> &ev, const std::vector<State> &sv, const std::vector<StateDot> &sdv)
    : entity_vec(ev), state_vec(sv), statedot_vec(sdv) {}
    // deserialization functions
    void operator()(EntityT &ent) {
        ent = entity_vec[entity_vec_idx_++];
    }
    void operator()(State &c) {
        c = state_vec[state_vec_idx_++];
    }
    void operator()(StateDot &c) {
        c = statedot_vec[statedot_vec_idx_++];
    }

    size_t entity_vec_idx_   = 0;
    size_t state_vec_idx_    = 0;
    size_t statedot_vec_idx_ = 0;
    std::vector<EntityT> entity_vec;
    std::vector<State> state_vec;
    std::vector<StateDot> statedot_vec;
};

I would love to have a single class that can both serialize and deserialize the registry, but the operator()(EntityT) in the output archive and operator()(EntityT &ent) in the input archive are ambiguous if they're defined in the same class.
Any chance to move from operators to members with an explicit name (like save(const T&) and restore(T&) for example)? I believe you would lose compatibility with Cereal archives though...

Not a big deal if you prefer to keep the existing API, it's easy enough to create a common serializer/deserializer object and then wrap it through 2 independent classes that provide just the required versions of the () operator.
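That wrapping approach could look roughly like this. It is a sketch under assumptions: `Storage`, `OutputArchive` and `InputArchive` are hypothetical names, and `State` is reduced to a single field; only the call-operator shape of the two archives is taken from the classes above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical component, standing in for State/StateDot above.
struct State {
    double value;
};

// Single storage object with explicitly named members.
template<typename Entity>
class Storage {
public:
    void save(Entity entity) { entities.push_back(entity); }
    void save(const State &state) { states.push_back(state); }

    void restore(Entity &entity) {
        assert(entity_pos < entities.size());
        entity = entities[entity_pos++];
    }

    void restore(State &state) {
        assert(state_pos < states.size());
        state = states[state_pos++];
    }

private:
    std::vector<Entity> entities;
    std::vector<State> states;
    std::size_t entity_pos = 0;
    std::size_t state_pos = 0;
};

// Thin adapters exposing only the call operator each side expects.
template<typename Entity>
struct OutputArchive {
    Storage<Entity> &storage;

    template<typename T>
    void operator()(const T &value) { storage.save(value); }
};

template<typename Entity>
struct InputArchive {
    Storage<Entity> &storage;

    template<typename T>
    void operator()(T &value) { storage.restore(value); }
};
```

Because each adapter has only one call operator, the by-value versus by-reference ambiguity never arises, while the data itself lives in one place.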

I think I can easily introduce a couple of proxies to support also the (let me say) _mono archive_. With a bit of sfinae machinery on the API we can accept both the types of archives probably.

Please just keep one (the one you like the most). There's no reason to support both APIs in the framework itself if this is something that can be easily implemented on the user side or in a utility function (and we both agree it's easily doable). Thanks!!

You're welcome. If it's fine for you, I'd prefer to keep the cereal-like interface because it's the tool I usually use. Moreover, I think it's easier to reason with different archives while implementing offline save/restore functionalities (at least for me). Hope it works fine for you.

Updated the poc on the branch snapshot.
Please, let me know if you find any bug. Thank you.
I'm at your disposal for further requests.

@skypjack . Fantastic. Thank you again for your dedication.
Errr... I do not intend to trick you, but I think it would be worth opening new sections / subsections to document new features such as serialization (a very powerful and notable feature). At the moment EnTT is very much alive, and that is excellent news, except for the documentation. You probably prefer to wait until the new version is publicly released (and tested and verified to be useful and stable).

@DJuego Forgive me, I didn't get you. This feature is already documented in the readme file. You can find the documentation on the right branch (snapshot). Of course, it's still a kind of work in progress, but this time I'm trying to keep updated code, tests and documentation all at once. What's wrong exactly?

Sorry. Sorry. My fault! Yes, yes. I found it! It looks great, @skypjack !!

@dbacchet

Any news on this?
If you have any request for change or you found bugs, let me know so that I can look into them.
I'll release v2.5 as soon as this feature is ready, merged upstream, documented and fully tested.

Thank you for your help, really appreciated.

good to go for me!
thanks!

So far, so good. The last changes probably broke the compilation on VS (because of its broken compiler; the code itself is fine).
I cannot test it right now and the fix should be straightforward, so not a problem actually.
During the week I'll fix it and merge everything on master. Then I'll create a tag for EnTT v2.5.0.
From that moment on, any issue or request for change must have its own ticket.

Thank you all for your help.
I think this is one of the major changes of the new release.

Thank you @dbacchet for the request, your support and your help in testing it. Really appreciated.

@dbacchet

Upstream on master.
If you confirm that everything works as expected after the merge, I can close the ticket and create a v2.5.0 tag.
I'll also delete the snapshot branch, so feel free to get rid of it.
Thank you very much once more for your help.

Again. thank you for your amazing efforts, @skypjack

This statement in the README is striking: "The code is not production-ready and it isn't neither the only nor (probably) the best way to do it. However, feel free to use it at your own risk." :-P

@DJuego Ahahaha. It refers to the code of the test made with Cereal C++ and not to the code of the save/restore stuff!! Don't worry. ;-)


Edit.

I just noticed there is an extra space that one can actually misinterpret.
Good point. I must fix it. Would you open a PR? Otherwise I can change it later.

@DJuego Fixed the doc, thank you for pointing it out.
I'm closing the issue as a consequence of the merge with master. Feel free to create new issues in case of problems.
I'll create tag v2.5.0 as soon as @dbacchet confirms that everything works fine (or in a couple of days in case of no feedback).

Thank you very much to all of you for the help and the support. Really appreciated.

I was looking at this. Could this be used for networking? Basically, use the serialization to convert some entities with a specific "NetSerialized" component or similar, and then ship them off to another PC. Is it easy to use this serialization code to grab "entities with X component, save ALL their components to the archive" in a smooth way? I'm going to perform some tests with it myself, but wanted to know if you have some pointers.

@vblanco20-1

Yep, definitely. You can choose between a snapshot loader or a progressive loader so as to continuously patch local data with the ones from the server.
Serialization is straightforward actually. There is also an example that uses Cereal C++.
If you have any feedback, it would be really appreciated!!

Take a look at the README file.
There is a section dedicated to the topic.
Let me know if it's clear enough.

@skypjack I tried to implement a JSON save/load for it (a custom one, not like the one you have in the test). I see that it stores an entity and then the component. It stores all the components that you tell it to.
Is there a possibility to limit it to a subrange of entities? Maybe a query like the ones in a system? In my example, I might want to replicate the Position component, but only on the entities that have a NetSerialize component. An interesting option would be to have the snapshot taken only on an array of entity IDs. I could use a view to find all the entities that have NetSerialize, and then only run the snapshot on those.

Alternatively, I could just do more "manual" serialization by having a Serialize system that reads NetSerialize and then adds every component it can (using has()) into a binary array or similar.

There is also quite a lot of wasted space just because of all the times the entity id is serialized, which is N + N*NComponents. Do you think it would be possible to flip the serialization around so that it ends up like "Entity1-C1-C2-C3""Entity2-C1-C4"? While this would definitely be more complicated, it would save a lot of space.
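As a quick back-of-the-envelope check of that count, assuming for simplicity that every entity carries the same number of components (the function names are hypothetical):

```cpp
#include <cassert>
#include <cstddef>

// Identifiers written when every component is stored next to its entity id,
// on top of the initial list of entities: N + N * NComponents.
constexpr std::size_t ids_per_component(std::size_t entities, std::size_t components_each) {
    return entities + entities * components_each;
}

// Identifiers written when each entity appears once, followed by its components.
constexpr std::size_t ids_per_entity(std::size_t entities) {
    return entities;
}

static_assert(ids_per_component(1000, 3) == 4000, "1000 + 1000 * 3");
static_assert(ids_per_entity(1000) == 1000, "one id per entity");
```

With 1000 entities and 3 components each, the per-component layout writes 4000 identifiers against 1000 for the per-entity layout, so the saving grows with the number of components per entity.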

@vblanco20-1

Can I ask you to open another issue for this? Sort of _More on Save/Restore_.
I've already put some notes in the todo file because I want to _extend_ the save/restore functionality. I think it's worth discussing the new features in a new thread.

If it's ok for you, feel free to just copy-paste your last comment.
It contains interesting points to which I'd like to reply so as to propose possible implementations.

Thank you very much.
