Bevy: Improved Scene Format

Created on 5 Aug 2020 · 13 Comments · Source: bevyengine/bevy

@Moxinilian and I have been experimenting with an improved scene format:

https://gist.github.com/Moxinilian/c45a1858eca7e918b5728ee5c117f649
https://gist.github.com/cart/3e77d6537e1a0979a69de5c6749b6bcb

The current scene format is workable, but it isn't yet ideal for manual scene composition because it is a flat list of unordered entities and it isn't particularly ergonomic. I also want to add nested scenes.

assets enhancement needs guidance

cart

Most helpful comment

We can use the RON syntax even if we can't make the existing parser work for us. We could fork the RON crate or build our own if the features we need are not general enough to justify upstreaming. (But hopefully we won't need to)

I'm open to using RON syntax, but I would prefer it if we designed our ideal format first and then decide the best way to implement it.

Is this file meant for humans to read or tools/editors to read? My experience has been that file formats are usually good at just one of those things. Formats that try to do both well generally do nothing well.
Maybe we design a format that is human friendly today and plan to have an alternative replace or co-exist in the future that is tool-friendly but still diffable (i.e. not binary).

Yeah I want to focus on a human-friendly format first and building tooling around that. Then we can decide if there are pain points that require a second format. I'd prefer one format just for simplicity though. I'd be curious to see a case where a human readable format like the ones above _isn't_ a good fit for tooling.

Using a scripting language as configuration has some attractive aspects to it, but tools won't be able to round-trip edit it. Maybe that's fine if we accept that another format that can be round-tripped would have to co-exist.
I lean towards tabling the script as configuration idea for now, and note for the future that this is an important use-case for any future script system we might consider

I agree that we should table it for now. Too much complexity for marginal wins. And I agree that it would make writing tools around the format harder.

We need to consider renamed/deleted types and added/renamed/deleted/reordered fields. It's kind of reminding me of Django's schema updating system - they have a simple tool that automates data migration.

I've used something similar with C# databases (Entity Framework "migrations"). I agree that a migration tool is nice to have, but in the short term I think I would be willing to settle for returning nice errors when a required field is missing, when a type doesn't match, or when a field is set but it doesn't exist.
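As a rough illustration of the "nice errors" idea above, here is a minimal sketch in Rust. Everything in it (`FieldSpec`, `SceneError`, `validate`) is a hypothetical name invented for this example, not Bevy or RON API. It assumes a component schema is known at load time and reports all three cases: a missing required field, a type mismatch, and a field that is set but doesn't exist.

```rust
use std::fmt;

/// Value kinds a scene field could hold (illustrative subset, not a real format).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind { Number, Text }

#[derive(Debug, Clone, PartialEq)]
pub enum Value { Number(f64), Text(String) }

impl Value {
    fn kind(&self) -> Kind {
        match self {
            Value::Number(_) => Kind::Number,
            Value::Text(_) => Kind::Text,
        }
    }
}

/// Hypothetical schema entry describing one field of a component.
pub struct FieldSpec { pub name: &'static str, pub kind: Kind, pub required: bool }

#[derive(Debug, PartialEq)]
pub enum SceneError {
    MissingField { component: String, field: String },
    TypeMismatch { component: String, field: String, expected: Kind, found: Kind },
    UnknownField { component: String, field: String },
}

impl fmt::Display for SceneError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SceneError::MissingField { component, field } =>
                write!(f, "{}: required field `{}` is missing", component, field),
            SceneError::TypeMismatch { component, field, expected, found } =>
                write!(f, "{}: field `{}` expected {:?} but found {:?}", component, field, expected, found),
            SceneError::UnknownField { component, field } =>
                write!(f, "{}: field `{}` does not exist", component, field),
        }
    }
}

/// Checks the fields found in a scene file against the schema and collects
/// every problem instead of stopping at the first one.
pub fn validate(component: &str, schema: &[FieldSpec], data: &[(&str, Value)]) -> Vec<SceneError> {
    let mut errors = Vec::new();
    // Pass 1: every schema field must be present (if required) and well-typed.
    for spec in schema {
        match data.iter().find(|(name, _)| *name == spec.name) {
            None if spec.required => errors.push(SceneError::MissingField {
                component: component.to_string(),
                field: spec.name.to_string(),
            }),
            Some((_, value)) if value.kind() != spec.kind => errors.push(SceneError::TypeMismatch {
                component: component.to_string(),
                field: spec.name.to_string(),
                expected: spec.kind,
                found: value.kind(),
            }),
            _ => {}
        }
    }
    // Pass 2: every field in the file must exist in the schema.
    for (name, _) in data {
        if !schema.iter().any(|spec| spec.name == *name) {
            errors.push(SceneError::UnknownField {
                component: component.to_string(),
                field: name.to_string(),
            });
        }
    }
    errors
}
```

Collecting all errors in one pass is a design choice for hand-edited files: a typo such as `raduis` then shows up as both a missing `radius` and an unknown field, which makes the mistake easy to spot.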

In the prototype I worked on with @kabergstrom, we associated UUIDs with types and used that instead of names. This is great for tools but not ideal for hand-editing.

Yeah I have a very strong preference to use type names in scenes (and short names where possible). UUIDs do solve the uniqueness problem, but I don't think they're worth the price of legibility and hand-composability. We should error out when a type name is used in a scene but isn't registered. And we can solve "potential type ambiguity" by either adding an "import" system to scenes or by forcing developers to resolve ambiguity when they register types in their apps.
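To make the name-resolution rules above concrete, here is a small Rust sketch. `TypeRegistry` and `ResolveError` are hypothetical names for this example and do not reflect Bevy's actual type registry. It assumes one possible policy: a short name resolves only if it is unique, an unregistered name is an error, and an ambiguous short name is reported so the author can fall back to the full path (a stand-in for the "import" idea).

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// The name appears in a scene but no such type was registered.
    Unregistered(String),
    /// Several registered types share this short name.
    Ambiguous { short_name: String, candidates: Vec<String> },
}

/// Hypothetical registry mapping scene type names to registered full paths.
#[derive(Default)]
pub struct TypeRegistry {
    /// Full paths in registration order.
    full_names: Vec<String>,
    /// Short name -> indices into `full_names`.
    short_names: HashMap<String, Vec<usize>>,
}

impl TypeRegistry {
    pub fn register(&mut self, full_name: &str) {
        if self.full_names.iter().any(|n| n.as_str() == full_name) {
            return; // already registered
        }
        // The short name is the last `::`-separated segment of the path.
        let short = full_name.rsplit("::").next().unwrap_or(full_name).to_string();
        self.short_names.entry(short).or_default().push(self.full_names.len());
        self.full_names.push(full_name.to_string());
    }

    /// Resolves a name written in a scene file to a registered full path.
    pub fn resolve(&self, name: &str) -> Result<&str, ResolveError> {
        // An exact full-path match always wins.
        if let Some(full) = self.full_names.iter().find(|n| n.as_str() == name) {
            return Ok(full.as_str());
        }
        match self.short_names.get(name).map(Vec::as_slice) {
            Some([index]) => Ok(self.full_names[*index].as_str()),
            Some(indices) if !indices.is_empty() => Err(ResolveError::Ambiguous {
                short_name: name.to_string(),
                candidates: indices.iter().map(|i| self.full_names[*i].clone()).collect(),
            }),
            _ => Err(ResolveError::Unregistered(name.to_string())),
        }
    }
}
```

The alternative mentioned above, forcing developers to resolve ambiguity at registration time, would move the `Ambiguous` check into `register` instead.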

At scale, it needs to be possible for scenes to be split across files based on teams that will be editing them. i.e. people placing audio elements shouldn't have to deal with file conflicts with people placing terrain. (it should however be possible to load them both as separate layers, one as read-only and the other as read-write)

Agreed. I really like how godot handles scenes and plan on using many of their ideas:

Scenes can be spawned at any time
Scenes can exist next to other scenes
Scenes can be removed at any time (without affecting the scenes around them)
Scenes can be nested. Scene A can pull in one or more instances of Scene B
Nested scenes can have values overridden in their parent scenes. Scene A can override the properties of components in a nested Scene B
Default component values are not recorded (this allows scenes to adapt to changes in Component defaults)
Scenes can define local versions of assets (ex: a scene could inline a Material asset for use within the scene)
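
Two of the points in the list above (parent overrides and not recording default values) can be sketched as simple layering. This is only an illustration under assumptions: `Fields`, `resolve_fields`, and `fields_to_record` are made-up names, and component fields are reduced to numbers to keep the example short.

```rust
use std::collections::BTreeMap;

/// Field name -> value for one component (numbers only, to keep the sketch small).
pub type Fields = BTreeMap<String, f64>;

/// Small helper for building a `Fields` map from literals.
pub fn fields(pairs: &[(&str, f64)]) -> Fields {
    pairs.iter().map(|(k, v)| (k.to_string(), *v)).collect()
}

/// Layers values from lowest to highest priority:
/// component defaults < values recorded in nested Scene B < overrides in parent Scene A.
pub fn resolve_fields(defaults: &Fields, nested: &Fields, overrides: &Fields) -> Fields {
    let mut resolved = defaults.clone();
    // `extend` replaces the value of any key that is already present.
    resolved.extend(nested.iter().map(|(k, v)| (k.clone(), *v)));
    resolved.extend(overrides.iter().map(|(k, v)| (k.clone(), *v)));
    resolved
}

/// Keeps only the fields that differ from the defaults, i.e. what a scene
/// file would record if default values are left out.
pub fn fields_to_record(defaults: &Fields, current: &Fields) -> Fields {
    current
        .iter()
        .filter(|(k, v)| defaults.get(*k) != Some(*v))
        .map(|(k, v)| (k.clone(), *v))
        .collect()
}
```

Because defaults are applied at load time rather than written to the file, changing a component's default later changes every scene that never set that field, which is the adaptability the list describes.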

It may be a bit early for this, but we should keep level streaming in mind - that a game might have multiple scenes loaded at the same time, dropping one and loading another

Makes sense. Having multiple scenes is definitely a goal for me (see godot comment above)

I think it makes sense to unify the concept of a "scene" and a "prefab" - i.e. you can spawn multiple instances of the same scene/prefab with position offsets

Agreed (see my comment about godot scenes above)
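
A minimal sketch of the unified scene/prefab idea, assuming a toy `World` that only tracks positions (all names here are hypothetical, not Bevy API): the same scene is spawned several times with different offsets, and each instance can be removed without touching the others.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Position { pub x: f32, pub y: f32 }

/// A scene template: entity positions relative to the scene origin.
pub struct Scene { pub entities: Vec<Position> }

/// Identifies one spawned instance of a scene.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct InstanceId(u32);

/// Toy world that stores the entities of each spawned instance separately.
#[derive(Default)]
pub struct World {
    next_instance: u32,
    instances: HashMap<InstanceId, Vec<Position>>,
}

impl World {
    /// Spawns a copy of `scene`, shifting every entity by `offset`.
    pub fn spawn(&mut self, scene: &Scene, offset: Position) -> InstanceId {
        let id = InstanceId(self.next_instance);
        self.next_instance += 1;
        let placed = scene
            .entities
            .iter()
            .map(|p| Position { x: p.x + offset.x, y: p.y + offset.y })
            .collect();
        self.instances.insert(id, placed);
        id
    }

    /// Removes one instance; returns false if it was not spawned (or already removed).
    pub fn despawn(&mut self, id: InstanceId) -> bool {
        self.instances.remove(&id).is_some()
    }

    pub fn entities(&self, id: InstanceId) -> Option<&[Position]> {
        self.instances.get(&id).map(Vec::as_slice)
    }

    pub fn entity_count(&self) -> usize {
        self.instances.values().map(Vec::len).sum()
    }
}
```

Tracking entities per instance is also what would make level streaming straightforward in this toy model: dropping one scene is a single `despawn` call.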

cart on 22 Aug 2020

👍2

All 13 comments

If you need a parser for the scene format, I could probably do that in my free time. I have some fun with parsers. :smile: If you don't mind, my favorite parser crate is rust-peg.

zicklag on 12 Aug 2020

If you don't wanna make a new format from scratch, or just wanna make something a little less domain specific, consider using Rusty Object Notation (RON).

Syntax highlighting is one thing that springs to mind that would be easier with an existing format.

madsmtm on 12 Aug 2020

Bevy already uses RON, but there is an issue with de-serialising types not given by the caller, I think that's why components are currently all hashmaps/objects.

karroffel on 12 Aug 2020

👍1

We can use the RON syntax even if we can't make the existing parser work for us. We could fork the RON crate or build our own if the features we need are not general enough to justify upstreaming. (But hopefully we won't need to)
Is this file meant for humans to read or tools/editors to read? My experience has been that file formats are usually good at just one of those things. Formats that try to do both well generally do nothing well.
- Maybe we design a format that is human friendly today and plan to have an alternative replace or co-exist in the future that is tool-friendly but still diffable (i.e. not binary).
- Using a scripting language as configuration has some attractive aspects to it, but tools won't be able to round-trip edit it. Maybe that's fine if we accept that another format that can be round-tripped would have to co-exist.
- I lean towards tabling the script as configuration idea for now, and note for the future that this is an important use-case for any future script system we might consider
We need to consider renamed/deleted types and added/renamed/deleted/reordered fields. It's kind of reminding me of Django's schema updating system - they have a simple tool that automates data migration.
- In the prototype I worked on with @kabergstrom, we associated UUIDs with types and used that instead of names. This is great for tools but not ideal for hand-editing.
At scale, it needs to be possible for scenes to be split across files based on teams that will be editing them. i.e. people placing audio elements shouldn't have to deal with file conflicts with people placing terrain. (it should however be possible to load them both as separate layers, one as read-only and the other as read-write)
It may be a bit early for this, but we should keep level streaming in mind - that a game might have multiple scenes loaded at the same time, dropping one and loading another
I think it makes sense to unify the concept of a "scene" and a "prefab" - i.e. you can spawn multiple instances of the same scene/prefab with position offsets

aclysma on 20 Aug 2020

This is an example of what we were loading in the prototype @kabergstrom and I were working on. We had this loading/editing/saving in a prototype editor. I'm not necessarily proposing it for adoption - just as an existence proof that you can dynamically load component types with RON.

Prefab(
    id: "2aad7b4c-a323-415a-bea6-ae0f945446b9",
    objects: [
        Entity(PrefabEntity(
            id: "e938c98b-df7e-41d7-a2e4-9f028702b022",
            components: [
                EntityComponent(
                    type: "46b6a84c-f224-48ac-a56d-46971bcaf7f1", // <-- We used UUIDs to in theory avoid rename issues
                    data: MeshComponentDef(    // <-- Although a name still shows up here :)
                        mesh: Some("36c00bf1-255e-4537-bddb-fdbee5548db2"),
                    ),
                ),
                EntityComponent(
                    type: "35657365-bb0c-4306-8c69-d5e158ad978f",
                    data: TransformComponentDef(
                        position: Vec3(0, -0.30648625, 0),
                        rotation: Vec3(0, -1.6, 0),
                        scale: -0.01,
                        non_uniform_scale: Vec3(1, 1, 1),
                    ),
                ),
            ],
        )),
    ],
)

Another thing we did was separate the concepts of serializable and non-serializable components, i.e. if you wanted a RigidBodyComponent, you would need a serializable SphereRigidBodyComponentDef (which might define the radius of the sphere, friction coefficient, etc.) and then a RigidBodyComponent that existed only at runtime and could hold a handle to the rigid body in the physics system. (Obviously, that handle could not be persisted.) This allowed us to have editor-friendly types (like Euler-angle rotations) get transformed into runtime-friendly types (like a simple 4x4 matrix). This approach of separating design-time representation from runtime representation was a key outcome of this R&D effort and made implementing components that rely on complex systems like physics or FFI straightforward: https://community.amethyst.rs/t/atelier-legion-integration-demo/1352
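
The split between a serializable definition and a runtime-only component could look roughly like the sketch below. It is not the prototype's actual code: `PhysicsWorld`, `BodyHandle`, and `spawn` are stand-ins invented for this example, and the physics system is reduced to a list of bodies.

```rust
/// Serializable, editor-friendly definition: this is what a scene file stores.
#[derive(Debug, Clone, PartialEq)]
pub struct SphereRigidBodyComponentDef {
    pub radius: f32,
    pub friction: f32,
}

/// Handle into the physics system. It is only meaningful for the current run,
/// so it must never be written to a scene file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BodyHandle(usize);

/// Stand-in for a real physics engine (hypothetical).
#[derive(Default)]
pub struct PhysicsWorld {
    bodies: Vec<(f32, f32)>, // (radius, friction)
}

impl PhysicsWorld {
    pub fn create_sphere(&mut self, radius: f32, friction: f32) -> BodyHandle {
        self.bodies.push((radius, friction));
        BodyHandle(self.bodies.len() - 1)
    }

    pub fn body_count(&self) -> usize {
        self.bodies.len()
    }
}

/// Runtime-only component: exists after loading, holds the live handle.
#[derive(Debug, PartialEq)]
pub struct RigidBodyComponent {
    pub handle: BodyHandle,
}

impl SphereRigidBodyComponentDef {
    /// Turns the design-time definition into its runtime counterpart by
    /// creating the body in the physics system.
    pub fn spawn(&self, physics: &mut PhysicsWorld) -> RigidBodyComponent {
        RigidBodyComponent {
            handle: physics.create_sphere(self.radius, self.friction),
        }
    }
}
```

Only the `Def` type would need to be serializable; the same definition can be spawned repeatedly, each time yielding a distinct runtime handle.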

aclysma on 20 Aug 2020

Agreed. I really like how godot handles scenes and plan on using many of their ideas:

Yes! I really didn't like Godot's node/object oriented design (which we replace with ECS :tada:), but their scene and prefab design was awesome, where every scene can include instances of other scenes and be split out to other files. That was great.

zicklag on 22 Aug 2020

I'd be curious to see a case where a human readable format like the ones above isn't a good fit for tooling.

Cases where designing for direct human-editing makes tooling more difficult:

Things that are hard to round-trip edit:
- Free-form script (example: Using logic (like embedding lua) to create entities)
- Multiple ways to do roughly the same thing - when round-tripping back to a file it becomes ambiguous how to write the data back, and it requires implementing multiple ways of reading and writing the data (example: supporting both inlined assets and referenced assets)
References to things by name or file paths:
- This leads to error-prone breakages and conflicts. Using UUIDs avoids the errors in the first place - which avoids needing tooling support to handle those issues
Formats that are slower to load in tools:
- The way you would intuitively cluster related data into files doesn't always line up intuitively for tools vs. humans. (i.e. humans might want conceptual boundaries and tools would be happy to distribute the data according to XY grid position.)
- Binary would load much faster (although I think we can all agree binary is already a non-starter since it can't be diffed)

Slightly switching gears from "GUI editor tools" to "tooling support with scene files" but probably also worth mentioning:

Compatibility with existing tools
- Using a non-standard format means existing formatters/linters/syntax highlighters and helpful editor features like auto-indent and auto-closure on brackets, parentheses, etc. might not work out of the box

I'm not saying these things are blockers. They're just trade-offs that we should keep in consideration. Many are worth it or can be mitigated.

It is likely worth the extra work and additional UX complexity to support both inlining and referencing sub-objects.
Similarly in atelier-assets we support referencing by BOTH file path and UUID. (Although file path can be tricky because one file can sometimes generate more than one asset).
The performance issues can be mitigated by cooking the data for released builds

Maybe supporting UUIDs and names would work for types/fields too (i.e. accept both UUID or names). Names could be baked out in "fully cooked" builds so that the engine is only dealing with UUIDs. This also avoids leaking names in a released build.
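
A small sketch of what "accept both, cook names out" might mean in practice. `TypeRef` and `cook` are hypothetical names, and UUIDs are represented as plain `u128` values to avoid depending on any crate.

```rust
use std::collections::HashMap;

/// A type reference as it may appear in a development scene file:
/// either a human-friendly name or a UUID (stored here as a raw u128).
#[derive(Debug, Clone, PartialEq)]
pub enum TypeRef {
    Name(String),
    Uuid(u128),
}

/// "Cooks" a list of type references for a release build: every name is
/// replaced by its UUID, so the engine only deals with UUIDs and no names
/// leak into the shipped data. Fails on the first unregistered name.
pub fn cook(
    type_refs: &[TypeRef],
    name_to_uuid: &HashMap<String, u128>,
) -> Result<Vec<TypeRef>, String> {
    type_refs
        .iter()
        .map(|type_ref| match type_ref {
            // UUIDs pass through unchanged.
            TypeRef::Uuid(id) => Ok(TypeRef::Uuid(*id)),
            // Names are looked up in the registration table.
            TypeRef::Name(name) => name_to_uuid
                .get(name)
                .map(|id| TypeRef::Uuid(*id))
                .ok_or_else(|| format!("type `{}` is not registered", name)),
        })
        .collect()
}
```

Making the cooking step optional, as suggested further down the thread, would simply mean skipping the call to `cook` for projects that want their released scenes to stay human readable.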

Sorry, this was probably a longer reply than necessary - I think we agree on everything except possibly UUID vs. names, maybe we can come up with a good solution for that (does allowing both in dev and cooking names out in released builds seem reasonable?).

aclysma on 22 Aug 2020

Cool, the examples you provided all seem valid, but I agree that they aren't blockers and are largely "worth it", especially in the short term. Things like "faster file loading" will be largely in the noise. And if we ever hit a point where someone cares because they're loading Skyrim-sized scene files, we can build a new efficient format.

"UUID or names" seems like a pragmatic choice. It increases implementation complexity slightly, but it lets people choose what makes sense for their project. I personally think we should default to "names" for scenes generated in the editor because I want "human friendliness" by default. But that's probably out of scope for the current conversation.

I think "baking out" names might also need to be a configuration option. Some people might want their released scenes to be human readable.

cart on 22 Aug 2020

(but yeah in general it sounds like we're aligned)

cart on 22 Aug 2020

And if we ever hit a point where someone cares because they're loading Skyrim-sized scene files, we can build a new efficient format.

I would like to add here that for projects like Arsenal, you will probably never edit the large majority of the scene files you create by hand. In this case, with large and/or complex scenes, I think it would make sense to have a more compact binary format that could be rendered to, similar to how Armory3D did it. You could choose to export scenes as JSON for debug purposes or use a MessagePack-based binary representation.

I think it would be good if Bevy could optionally load a machine-only binary representation for compactness and speed. If we are using Serde to accomplish the serialization (which I'm assuming we would be?) we could probably just use a format like CBOR.

We don't want to limit the ability to edit scenes by hand in any way, but I don't think that this affects that negatively.

zicklag on 22 Aug 2020

References to things by name or file paths:

This leads to error-prone breakages and conflicts. Using UUIDs avoids the errors in the first place - which avoids needing tooling support to handle those issues

Expanding on this, it is probably undesirable to require users to assign unique IDs by hand to each entity or other thing that can be referenced in a scene file, while having a unique ID is a requirement to be able to implement things like field overrides when spawning a nested scene. So in this case, mixing hand-crafted and tool-generated scene files will possibly compromise features that require unique IDs for things in them.

kabergstrom on 24 Aug 2020

👍1

I'm actually going to take back what I said above about Arsenal probably wanting a binary format for scenes. If nobody else needs it don't worry about it for the Arsenal use-case because Arsenal will almost surely have its own scene format and prefab system to be as compatible and seamless with Blender as possible.

zicklag on 22 Sep 2020
