Entitas uses arrays heavily, and most of the API exposes this fact.
Some functions return copies of the internal data structures via .ToArray() to protect the internal state from external modification, which is very important.
I was wondering whether I should change the whole API to IEnumerable to reduce the number of array conversions. The user can call .ToArray() if needed; otherwise we might be able to save some conversions. I haven't checked the code or the consequences yet, but it might be a good idea.
What are your opinions? Please share below.
Group's GetEntities can cause large GC allocations in a game where hundreds of entities with multiple components are created and destroyed every second. One rule: don't modify the collection returned by the group.
I'm still on the source code version and use the code below. With arrays I had up to 150 KB of GC allocations every second.
List<TEntity> _entityListCache = new List<TEntity>(64);

public List<TEntity> GetEntityList()
{
    if (_entitiesCache == null)
    {
        _entityListCache.Clear();
        _entityListCache.AddRange(_entities);
    }
    return _entityListCache;
}
Exactly. Returning IEnumerable might keep us flexible about which data structures we use and whether we cache or not. Returning IEnumerable also prevents you (not 100%) from modifying the collection and encourages making copies with collection.ToArray() when you actually need to change things.
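To illustrate the "not 100%" part: a minimal sketch, assuming the group hands out its internal array typed as IEnumerable. DemoGroup and ExposureDemo are hypothetical names made up for this example, not Entitas classes, and plain ints stand in for entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a group; not the Entitas Group class.
public sealed class DemoGroup
{
    private readonly int[] _entities = { 1, 2, 3 };

    // Exposes the internal array through a read-only-looking interface.
    public IEnumerable<int> GetEntities()
    {
        return _entities;
    }

    public int First
    {
        get { return _entities[0]; }
    }
}

public static class ExposureDemo
{
    public static int MutateThroughCast()
    {
        var group = new DemoGroup();

        // Safe: ToArray() creates a copy, so the group is unaffected.
        int[] copy = group.GetEntities().ToArray();
        copy[0] = 42;

        // Not safe: the returned object is still the internal array,
        // so a cast gives write access to the group's state.
        var raw = group.GetEntities() as int[];
        if (raw != null)
        {
            raw[0] = 99;
        }

        return group.First; // 99: the internal state was changed
    }
}
```

So the interface discourages modification but does not enforce it; only a copy does.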
In Unity 5.5, GetEnumerator no longer allocates memory (so all foreach loops are now justified), so I think it is worth a try.
I also did a small optimization in the Entitas source with cached arrays. There was no .ToArray() at the end; it just refilled the array with the current values and returned it (along with an int length). Since I knew that in my case those arrays never leave the Execute method of systems, I just used that one single array as a temporary buffer for every GetEntities call on a group. Iterating over an array is the fastest way to go over a collection in C#. Everything else (lists, enumerable collections, etc.) becomes much slower as the number of elements increases, because it involves method calls that have their own overhead (indexing a list actually calls the get_Item method that is generated for the list indexer).
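Roughly, the refill-and-return-length idea could look like the sketch below. This is not the actual change to the Entitas source; BufferedGroup, the int entities, and the HashSet storage are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical group that reuses one buffer for every GetEntities call.
public sealed class BufferedGroup
{
    private readonly HashSet<int> _entities = new HashSet<int>();
    private int[] _buffer = new int[64];

    public void Add(int entity) { _entities.Add(entity); }
    public void Remove(int entity) { _entities.Remove(entity); }

    // Refills the shared buffer and returns it. Only the first 'count'
    // slots are valid; anything after that is stale data.
    public int[] GetEntities(out int count)
    {
        count = _entities.Count;
        if (_buffer.Length < count)
        {
            // Grow only when needed, so allocations stay rare.
            _buffer = new int[Math.Max(count, _buffer.Length * 2)];
        }
        _entities.CopyTo(_buffer);
        return _buffer;
    }
}

public static class BufferedGroupDemo
{
    public static long SumEntities(BufferedGroup group)
    {
        int count;
        int[] entities = group.GetEntities(out count);

        long sum = 0;
        // Iterate up to count, not entities.Length.
        for (int i = 0; i < count; i++)
        {
            sum += entities[i];
        }
        return sum;
    }
}
```

The buffer is overwritten by the next GetEntities call, so this only works when the result is not kept beyond the current method.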
I did a performance test.
Same code, only the return type changed from int[] to IEnumerable<int>.
// _ints holds the result of getInts() (the assignment is not shown in the original post)
void Update() {
    foreach(var i in _ints) {
    }
}

// test 1 return int[]
// test 2 return IEnumerable<int>
IEnumerable<int> getInts() {
    const int n = 1000000;
    var ints = new int[n];
    for(int i = 0; i < n; i++) {
        ints[i] = i;
    }
    return ints;
}

Array - 9ms
IEnumerable - 62ms
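The gap comes from how foreach is compiled: over a variable typed as int[] it becomes a plain indexed loop, while over IEnumerable<int> it goes through GetEnumerator, MoveNext and Current as interface calls. Below is a standalone sketch for reproducing the comparison outside Unity. IterationBenchmark is a hypothetical name, and the timings will differ per machine and runtime, so none are stated here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class IterationBenchmark
{
    public static int[] MakeInts(int n)
    {
        var ints = new int[n];
        for (int i = 0; i < n; i++)
        {
            ints[i] = i;
        }
        return ints;
    }

    // Static type is int[]: foreach is compiled to an indexed loop.
    public static long SumArray(int[] ints)
    {
        long sum = 0;
        foreach (var i in ints)
        {
            sum += i;
        }
        return sum;
    }

    // Static type is IEnumerable<int>: foreach uses the interface enumerator,
    // even though the underlying object is the same array.
    public static long SumEnumerable(IEnumerable<int> ints)
    {
        long sum = 0;
        foreach (var i in ints)
        {
            sum += i;
        }
        return sum;
    }

    public static void Run()
    {
        int[] ints = MakeInts(1000000);

        var watch = Stopwatch.StartNew();
        SumArray(ints);
        watch.Stop();
        Console.WriteLine("Array: " + watch.ElapsedMilliseconds + "ms");

        watch = Stopwatch.StartNew();
        SumEnumerable(ints);
        watch.Stop();
        Console.WriteLine("IEnumerable: " + watch.ElapsedMilliseconds + "ms");
    }
}
```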
I like to return copies of the internal data structures to prevent external modification. Most of the time you only iterate over those copies, so it would be safe to return the actual internal data structure as IEnumerable and skip the additional allocation. If you want to modify, you'd have to call .ToList() (or similar) yourself. But I guess I'll keep my current design, which also allows me to cache and reuse collections. From an API perspective, IEnumerable would be better and safer in case the internal structure changes. But after seeing the performance benefits of arrays, I won't change the API to use IEnumerable.
Will close and cry 😭
What about something like group.GetEntitiesNonAlloc(IEntity[] result)? Like Unity does it in https://docs.unity3d.com/ScriptReference/Physics.RaycastNonAlloc.html
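A minimal sketch of what such a method could look like, assuming it follows the Unity pattern of filling a caller-owned buffer and returning the number of results written. GetEntitiesNonAlloc does not exist in Entitas; NonAllocGroup and the int entities are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical group; plain ints stand in for entities.
public sealed class NonAllocGroup
{
    private readonly List<int> _entities = new List<int>();

    public void Add(int entity) { _entities.Add(entity); }

    // Copies as many entities as fit into the caller's buffer and
    // returns how many were written. No allocation happens here.
    public int GetEntitiesNonAlloc(int[] result)
    {
        int count = Math.Min(_entities.Count, result.Length);
        _entities.CopyTo(0, result, 0, count);
        return count;
    }
}

public static class NonAllocDemo
{
    // Caller-owned buffer, allocated once and reused for every call.
    private static readonly int[] Buffer = new int[128];

    public static long SumEntities(NonAllocGroup group)
    {
        int count = group.GetEntitiesNonAlloc(Buffer);
        long sum = 0;
        for (int i = 0; i < count; i++)
        {
            sum += Buffer[i];
        }
        return sum;
    }
}
```

If the buffer is too small, the result is silently truncated, so the caller has to size it for the worst case. Every call also copies the entities, even when the group has not changed.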
Just passing in a raw IEntity[] array will lead to a lot of unnecessary copying. I once added some performance counters to find out how often I get cached results vs. a fresh allocation across all groups in my game, and 98% were cached results.
So I would go one step further and add an actual cache class that the user can pass in and that gets registered with the group. The group can then invalidate the caches like it invalidates its internal cache, with the difference that the external cache just sets a flag or something and doesn't need to reallocate the next time it's used. The user would be responsible for checking the flag and refreshing the cache before it's actually used.
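Something along these lines, as a rough sketch only. EntityCache, CachingGroup, Register and Refresh are hypothetical names that exist neither in Entitas nor in Unity, and ints stand in for entities.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical user-owned cache that a group can mark as stale.
public sealed class EntityCache
{
    public int[] Entities = new int[0];
    public int Count;
    public bool IsDirty = true;
}

// Hypothetical group that invalidates registered caches on change.
public sealed class CachingGroup
{
    private readonly List<int> _entities = new List<int>();
    private readonly List<EntityCache> _caches = new List<EntityCache>();

    public void Register(EntityCache cache)
    {
        cache.IsDirty = true;
        _caches.Add(cache);
    }

    public void Add(int entity)
    {
        _entities.Add(entity);
        Invalidate();
    }

    public void Remove(int entity)
    {
        if (_entities.Remove(entity))
        {
            Invalidate();
        }
    }

    // Invalidation only sets a flag; nothing is copied or allocated.
    private void Invalidate()
    {
        for (int i = 0; i < _caches.Count; i++)
        {
            _caches[i].IsDirty = true;
        }
    }

    // Called by the user before reading the cache. Copies only when the
    // group changed, and reallocates only when the buffer is too small.
    public void Refresh(EntityCache cache)
    {
        if (!cache.IsDirty)
        {
            return;
        }
        if (cache.Entities.Length < _entities.Count)
        {
            cache.Entities = new int[_entities.Count];
        }
        _entities.CopyTo(cache.Entities);
        cache.Count = _entities.Count;
        cache.IsDirty = false;
    }
}
```

A system would call group.Refresh(cache) at the top of Execute and then loop over cache.Entities up to cache.Count. When the group is unchanged, which was the 98% case in my measurements, the call returns immediately.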