I realized that there is probably room for further optimizations in the multi-component views. However, if confirmed, it would require sacrificing all the sorting functionality.
To be honest, I'm not sure it's worth it, so I'm creating this issue to find out whether the ability to sort pools is a must-have feature for the users of this library.
I'm at the point where a few cycles may be a good price to pay for more functionality, but I'd also like to know your point of view.
Let's discuss it. I'll close the issue in a few days as usual.
I'm sorting my entities that have a Renderable component by Position and Layer in a 2D isometric project in order to get their graphical overlapping right. It was the easiest way to achieve this. Perhaps there are alternatives?
Only slightly related: I am not quite sure yet how I can integrate EnTT with quadtree based algorithms which also imply some kind of sorting.
@skypjack What sacrifice?
@m-waka
> Perhaps there are alternatives?

Actually, I'm doing the same in a couple of my own projects. The alternative is to keep a separate data structure that you fill with entities and on which you impose an order.

> quadtree based algorithms which also imply some kind of sorting.

See above. After all, there is nothing wrong with maintaining separate data structures alongside the ECS to do what doesn't fit well with an ECS.
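As a rough illustration of the "separate data structure" idea above, here is a minimal sketch that builds a draw order outside the ECS. `Entity`, `DrawItem` and `build_draw_order` are hypothetical names made up for this example; none of them come from EnTT, and the sort key (layer first, then `y`) is only an assumption about what an isometric renderer might need.

```cpp
#include <algorithm>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical types for illustration only; not part of EnTT.
using Entity = std::uint32_t;

struct DrawItem {
    Entity entity; // identifier handed out by the ECS
    int layer;     // coarse ordering: lower layers are drawn first
    float y;       // fine ordering inside a layer (assumed depth key)
};

// Returns the entities in draw order without touching any component pool.
// The list is rebuilt and sorted on its own, so the ECS storage keeps
// whatever order it already has.
inline std::vector<Entity> build_draw_order(std::vector<DrawItem> items) {
    std::stable_sort(items.begin(), items.end(),
        [](const DrawItem &lhs, const DrawItem &rhs) {
            return std::tie(lhs.layer, lhs.y) < std::tie(rhs.layer, rhs.y);
        });

    std::vector<Entity> order;
    order.reserve(items.size());

    for(const auto &item: items) {
        order.push_back(item.entity);
    }

    return order;
}
```

The render system would then walk the returned vector and look up components by entity, paying an indirection per entity in exchange for leaving the pools untouched.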
@ArnCarveris
All the sorting functionality would be removed (see the registry class template).
@skypjack Uhh, I liked that :cry:
Yeah, indeed it's a feature that an ECS usually doesn't offer, and it's probably worth sacrificing a bit of performance for it in a library that is already amazingly fast. I'm from the "features over performance" party, to some extent.
Will removing the sort functionality mean that EnTT won't be able to keep components ordered with respect to each other for cache friendly iteration?
@danielytics
Well, yes, in a sense that's true. On the other hand, the optimization for multi-component views goes in that same direction, so the two should compensate for each other to some degree.
However, as an example, the optimization would also prevent using different memory pools for different types of components, which is something that could be useful (future development).
Honestly, I'm not sure it's worth it. It looks like I'd be sacrificing flexibility for a few cycles.
Based on what you said, it doesn't sound worth it. At least, paying a few cycles to allow for more flexibility seems like a worthy trade. EnTT is plenty fast anyway and while I would welcome even more performance, EnTT is not the bottleneck for me.
@skypjack what is your plan for that new architecture? I'm doing some experiments on my own to see if a Unity-style block ECS could reach high speeds. On the sorting part, I don't think sort is worth it. There are some places where you could use it, but they are mostly for multithreaded execution purposes, which I think would be a more important feature (also, the sort you had wasn't a parallel sort, which is a lost opportunity).
@vblanco20-1 yeah, it's actually not that easy to turn the sorting functionality into a parallel sort, because of how a sparse set works. It can probably be done, but I'm not sure it's worth it. Moreover, the underlying algorithm would certainly have to be replaced, mainly because it's currently optimized for single-threaded execution and can hardly be made parallel.
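For readers who haven't seen one, here is a minimal sketch of a generic sparse set, to show why sorting is more than a plain `std::sort` on an array. This is a textbook-style illustration under simplifying assumptions (no removal, no duplicate inserts, no attached components); `SparseSet` and `kAbsent` are names invented for this example and this is not EnTT's actual implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using Entity = std::uint32_t;

// Marker for "entity not in the set" (illustrative choice).
constexpr std::size_t kAbsent = static_cast<std::size_t>(-1);

class SparseSet {
public:
    // Assumes the entity is not already in the set.
    void insert(Entity entity) {
        if(entity >= sparse.size()) {
            sparse.resize(entity + 1, kAbsent);
        }

        sparse[entity] = dense.size();
        dense.push_back(entity);
    }

    bool contains(Entity entity) const {
        return entity < sparse.size() && sparse[entity] != kAbsent;
    }

    // Position of the entity in the packed array; requires contains(entity).
    std::size_t index(Entity entity) const {
        return sparse[entity];
    }

    const std::vector<Entity> &packed() const {
        return dense;
    }

    // Sorting the packed array invalidates every stored position, so the
    // sparse array has to be rewritten to match the new order.
    template<typename Compare>
    void sort(Compare compare) {
        std::sort(dense.begin(), dense.end(), compare);

        for(std::size_t pos = 0; pos < dense.size(); ++pos) {
            sparse[dense[pos]] = pos;
        }
    }

private:
    std::vector<std::size_t> sparse; // entity -> position in dense
    std::vector<Entity> dense;       // tightly packed entities
};
```

In a real pool the components would have to be permuted along with the packed entities, and the two arrays must stay consistent at every step, which is one reason a drop-in parallel sort is not straightforward here.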
> im doing some experiments to see if a unity-style block ECS could reach high speeds on my own.

And the results are? I'm curious, feel free to involve me if you want. You know, these things are of interest to me.

> multithreading execution purposes, wich i think would be a more important feature

Well, iterators are all at least forward iterators, so EnTT can easily be used with parallel stuff. But you know that already; we discussed this point together, if I remember correctly.
Can I ask what you expect from EnTT along these lines? Apart from a job system I'm already working on in my free time, there is nothing in the TODO list on this point. Suggestions are welcome.

> im doing some experiments to see if a unity-style block ECS could reach high speeds on my own

Tbh I don't much like the approach based on archetypes, and one can already simulate it to some extent with EnTT working on the barrier. At the moment I'm experimenting to see if I can find a good way to speed up multi-component views a bit. It would be great to find something that doesn't affect other features, but it doesn't seem possible.
Do you have any suggestions on this side? I'd be glad to hear them, if you want to contribute.
@skypjack my experimental thing is way too early to really have good data, but the performance of iteration on multi-component views is quite good in its unoptimized state. At the moment it is slower than EnTT.
What I would expect is basically somewhat better multithreading support. An ECS should have "parallel-for" as a first-class feature (I think). With EnTT I ran into some scalability issues with the parallel fors, as there were copious amounts of false sharing (a few CPUs writing to the same cache line), which slows down scaling a bit.
I'll keep experimenting with my archetype-block based approach and try to gather real data on how the different methods stack up against each other, using my spaceship battle thingy as a benchmark.
Edit: Forgot about the most critical feature by far. Negative queries. Find entities that have components A and B, but not C. That's extremely important to have.
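To make the "A, B, but not C" request concrete, here is a minimal sketch of what such a negative query computes. It uses plain hash sets as stand-ins for component pools; `Pool` and `having_but_not` are hypothetical names for this example only and say nothing about how EnTT does or would expose this.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_set>
#include <vector>

using Entity = std::uint32_t;

// Stand-in for "the set of entities that own a given component".
using Pool = std::unordered_set<Entity>;

// Entities that are in both `a` and `b` but not in `excluded`.
// Iterates the smaller of the two required pools to limit lookups.
inline std::vector<Entity> having_but_not(const Pool &a, const Pool &b, const Pool &excluded) {
    const bool a_is_smaller = a.size() <= b.size();
    const Pool &driver = a_is_smaller ? a : b;
    const Pool &other = a_is_smaller ? b : a;

    std::vector<Entity> result;

    for(const Entity entity: driver) {
        if(other.count(entity) != 0 && excluded.count(entity) == 0) {
            result.push_back(entity);
        }
    }

    // Hash sets have no stable order; sort only to make the output deterministic.
    std::sort(result.begin(), result.end());
    return result;
}
```

For example, with A = {1, 2, 3, 4}, B = {2, 3, 4} and C = {3}, the query yields entities 2 and 4.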
@vblanco20-1
It would be great if you kept me informed and came back with the results. EnTT would benefit from them, and we could try to work them into the library instead of writing a new one. My two cents.

> With EnTT i ran into some scalability issues with the parallel fors, as there were copious amounts of false sharing (few cpus writing to the same cache line) wich slow up scaling a bit.

Just out of curiosity, do you have any idea what the cause could be? It would help in figuring out how to solve it. Thanks.
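As background on the term, false sharing happens when independent threads write to different variables that happen to live on the same cache line. The sketch below shows the usual mitigation, padding each per-worker slot to a full cache line. It is a generic illustration, not a diagnosis of what happens inside EnTT; `PaddedCounter`, `total` and the 64-byte line size are assumptions made for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLine = 64;

// Each worker gets its own slot, aligned and padded to a full cache line,
// so writes from different workers never touch the same line.
struct alignas(kCacheLine) PaddedCounter {
    std::uint64_t value = 0;
};

// Combines the per-worker results once the parallel section is over.
template<std::size_t Workers>
std::uint64_t total(const std::array<PaddedCounter, Workers> &slots) {
    std::uint64_t sum = 0;

    for(const auto &slot: slots) {
        sum += slot.value;
    }

    return sum;
}
```

Without the `alignas`, eight such counters would fit in a single 64-byte line and every increment from one worker could invalidate that line for all the others.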
@skypjack I don't know enough about the internals to know exactly what can cause it.
@vblanco20-1 that makes sense. I'll keep watching your work and wait for your feedback and results to try to improve EnTT even more. Your help would be really appreciated; it seems to me you're far more skilled than I am on this topic. :+1:
@vblanco20-1
> i dont know enough of the internals to know what exactly can cause it.

Ok, I thought about it, and we can probably spot the root cause easily with a few questions:

- Are you experiencing the same effect when iterating single components?
- More importantly, were you iterating multiple components with different orders in their pools?

If the answer to the second question is yes, well... this is the right issue in which to discuss it, and the proposed changes are meant to improve exactly that aspect.
Time to close this issue. I think I'll end up removing persistent views and introducing a slightly finer-grained system to get even better performance on _hot groups_.
Stay tuned. I'm working on this in my free time, but I'm still trying to figure out the best approach so as not to ruin performance on other features, with the goal of keeping all the existing features intact. :+1:
@skypjack will removing persistent views mean that there are no more views with random access iterators? I was relying on that for use with TBB, as its parallel_for doesn't work on just forward iterators.
I may replace my use of TBB with C++17 parallel for and cpp-taskflow for tasks, in which case I don't need random access iterators anymore. But maybe others also rely on it?
@danielytics
Wrong terms, my fault, there is a misunderstanding.
I'll _remove_ persistent views the way they work right now, because they speed up the lookup of entities a lot but then force you to rely on indirections to access components, mainly because components aren't ordered the way you would want them to be.
The plan is to design something alongside them that improves overall performance, not only when you iterate entities.
For what it's worth, it won't be Unity-like archetypes, because I don't like them.
They are mostly targeted at users who don't know what they want, or who are looking for a tool that transparently trades memory and performance on less used features (i.e. assign/remove) to prepare enough runtime data to give good performance on all kinds of iterations, including those you aren't interested in.
EnTT is a _pay for what you use_ tool. Therefore, it will offer you a way to tell it which types of iterations on multiple components you want to optimize further. I'll be explicit in the documentation about what that means and what you lose in exchange.
The basic idea is that you must start from the systems and their access patterns in order to optimize. Optimizing everything because _you don't know what's going on_ isn't a good choice for me. As I've said more than once, EnTT is targeted at experienced users and requires you to know what you want and to what extent. My plan isn't to give users a tool that _does everything somehow_.
As a side note, if you followed the discussion on gitter, iterations on single components are already faster than with an archetype-based model, as expected (unless you're using a compiler that has difficulty optimizing template machinery, but I'm confident someone will improve that in the future). Because of this, single component views probably won't be affected by the changes.