Entt: struct Family bug with runtime shared library

Created on 10 Jul 2018  路  22Comments  路  Source: skypjack/entt

Hello, it's me again, I know I never bring good news but I find a case that annoys me with the family structure:

capture d ecran 2018-07-10 a 08 31 00

i got a static generated two times from family:

template <typename...>
    class Family
    {
        static std::atomic<std::size_t> identifier;

        template <typename...>
        static std::size_t family() ENTT_NOEXCEPT
        {
            static const std::size_t value = identifier.fetch_add(1);
            return value;
        }

    public:
        /*! @brief Unsigned integer type. */
        using family_type = std::size_t;

        /**
         * @brief Returns an unique identifier for the given type.
         * @return Statically generated unique identifier for the given type.
         */
        template <typename... Type>
        inline static family_type type() ENTT_NOEXCEPT
        {
            return family<std::decay_t<Type>...>();
        }
    };

    template<typename... Types>
    std::atomic<std::size_t> Family<Types...>::identifier{};

When you will use the dispatcher or the registry whatever that use family, it's will use two times the family and take the wrong identifier, call the wrong callback and most of the time crash:

capture d ecran 2018-07-10 a 09 10 17

capture d ecran 2018-07-10 a 09 09 24

capture d ecran 2018-07-10 a 09 09 43

another approach for users who use plugins would be great, so many things in EnTT are thought for the plugins to work, unfortunately this part seems problematic

i got an approach that work through plugins and it's suck but let me show (mixing static reflection and hashing + a vector) (bad performance etc, but it's an approach) :

const auto fake_type = std::hash<std::string>()(Event::class_name());
        auto it = std::find_if(begin(id_memoisation ), end(id_memoisation ), [fake_type](size_t id) {return fake_type == id;});
        long type = 0;
        if (it == id_memoisation.end()) {
            id_memoisation .push_back(fake_type);
            type = std::distance(id_memoisation .begin(), std::prev(id_memoisation.end()));
        } else {
            type = std::distance(id_memoisation .begin(), it);
        }
        std::cout << Event::class_name() << ": " << type << " fake_type: " << fake_type << std::endl;

        if(!(type < wrappers.size())) {
            wrappers.resize(type + 1);
        }

        if(!wrappers[type]) {
            wrappers[type] = std::make_unique<SignalWrapper<Event>>();
        }

        return static_cast<SignalWrapper<Event> &>(*wrappers[type]);
    }

this snippet fix the problem and using the idx position of the memoisation vector, it's suck but work's.

honestly I do not think that with the solution to use a static currently we will manage to solve this problem.

i think we can mark it as bug.

Can we speak about some options that doesn't ruin performance, and are acceptable for everyone ?

A solution to keep the static things would be something like: "please when you compile a shared library don't generate the static you will have the symbol latter".

But i'm not sure it's possible honnestly

discussion

Most helpful comment

objcopy -N "symbol" mylib.so

All 22 comments

I think this was exactly the problem with #107 and we agreed on the fact that it won't work with several shared libraries. We can manage to make it work somehow with an executable and a shared library (the README contains some notes for that), but there is no chance it will work with more than one shared library.
Just to be clear, the approach you're suggesting ruins completely the performance for those not interested in using shared libraries (most of the users probably) and cannot be considered a viable solution.

another approach for users who use plugins would be great, so many things in EnTT are thought for the plugins to work, unfortunately this part seems problematic

Almost true. I designed some parts of the library so as to be able to write plugins through a scripting language. I never considered the option of multiple shared libraries actually and I'm pretty astonished by the fact that it works to some extent also in this case. :smile:
This is how static stuff works unfortunately. They don't fit well with shared libraries, this is a fact.

Two possible approaches are probably:

  • Allow to set custom family generator, so that users of shared libraries can work around the limitations imposed by the static solution.
  • Decouple completely your shared library from EnTT: cross boundaries through interfaces, design everything on an IoC approach and get rid of EnTT within the shared libraries. This way you can have a finer control over the generation of identifiers.

Honestly, if I was you, I'd go with the second suggestion. There are quite a lot of things that won't work in EnTT if used from different shared libraries: the registry, the configuration system, the dispatcher, the emitter and who knows what else.

@skypjack So use an interface that would use EnTT through the main binary? it seems right to me

So use an interface that would use EnTT through the main binary? it seems right to me

Well, as long as you deal with static stuff ever on the same side of the boundary, identifiers are consistent. You observed it in your case, right? Problems arise when you cross the boundary and start using static stuff sometimes on one side and sometimes on the other side. In this case, things just break out as expected. :smile:

Do you think you can try the solution based on interfaces to decouple libraries from EnTT?


Another possible approach (much more disruptive actually, but maybe interesting for someone) could be to define a sort of _mixed registry_ that allows for both compile-time and runtime components. In this case, as long as you define as compile-time components those types that are used while crossing a boundary, everything should work just fine.
An example should clarify this:

entt::Registry<std::uint64_t, ComponentA, ComponentB, ComponentC> registry;
auto entity = registry.create();
registry.assign<ComponentA>(entity); // already known at compile-time, identifiers are fixed
registry.assign<AnotherComponent>(entity); // unknown component, identifiers are generated at runtime

It increases compilation time probably, but it could solve the problem with shared libraries probably. Not sure about it actually.

As a few notes:

  • It will take quite a while to develop it because it requires a lot of changes.
  • This isn't a priority for me, so I'll give precedence to other things if required (because of this, hardly I can set a deadline).
  • It requires to sacrifice tags maybe (and it is something fine for me, but users of the library will complain probably).

I Will try the first approach and let you know if it's work for me , i will let the issue open meanwhile this time, and tell you where i'am !

The first approach seems complicated, because it would be necessary to be able to use virtual, which does not work with templates

for example:

 class interface_dispatcher
    {
    public:
        virtual void trigger() noexcept  = 0; // Cannot use template here for type and variadic arguments
        virtual ~interface_dispatcher() = default;
    };

:thinking: Granted that you can't make virtual a function template. However, I didn't get why this is a problem in your case.
Within your shared library you know what are the events you're going to trigger, right? You can just define an interface for them and export it, then implement the interface on the other side of the boundary and forward the requests to the actual dispatcher that will be probably a member of the concrete class. Your implementation doesn't require to have a virtual function template. A member function for each event you want to trigger is fine. Of course, you can't go with a _generic_ approach, if that was what you were asking/trying.
This way, the dispatcher lives only on one side of the boundary and identifiers are consistent throughout the whole thing.

Another possible approach is by using the delegate class. You can wrap the queue member function of a dispatcher that you created on one side of the boundary and provide the library itself with a delegate to use to trigger events.

As shown, there are several approaches you can try to work around the existence of the static stuff. To sum up what we mentioned so far:

  • Mixed (compile-time + runtime) registry: pretty much a lot of work to do on EnTT, I'm tempted to put it in the backlog, but I can't set any schedule for it's not a priority for me.
  • Interfaces: see above, just a way to decouple the shared library and EnTT, so that the latter lives only on one side of the boundary.
  • Delegates: a more rude and straightforward approach to do what I mentioned in the previous bullet.
  • Allow users to pass custom family generators to the registry: it will slow down compilation time and complicate a bit the library, on the other side you get the maximum freedom on the topic. Again, it hasn't high priority for me.
  • Rewrite the family so as to avoid static stuff: it ruins definitely the performance, unacceptable for users that aren't interested in using multiple shared libraries at once. I'd ignore this point honestly.

So, bullets 2 and 3 are probably something you can try your own. All the other solutions require to modify to some extents EnTT and sacrifice something along the path, so we should avoid them unless strictly necessary. Moreover, as I mentioned, they aren't features with high priority for me, so I cannot set any schedule and I'm not that sure it works for you. Just to be clear, in the past I've been paid to give higher priority to projects and features that weren't a must have for me, but I don't think this is the case. Right? :smile:

And another solution, remove the static symbol generated in the library so it's will use the one from the main binary only 馃槀馃槀馃槀馃槀馃槀馃槀

I didn't even know it was possible, actually. Is it? How could you do that?

objcopy -N "symbol" mylib.so

Yeah, I just realized it does exactly what you said. The resulting libraries/executable work as expected now?

i didn't try it yet, i will try it latter and let you know, and i need to find something similar with MSVC / OSX

If you find a meaningful way to do that that is also reproducible, we could document it and add a note to the README file. Probably someone else will enjoy your work, sooner or later. :smile:

It's possible to remove the final keyword from the dispatcher ? if we want to add some functionnality and put in protected: the SignalWrapper and BaseSignalWrapper?

class dispatcher : public ::entt::Dispatcher
    {
    public:
        template <typename Event, typename... Args>
        inline void trigger(Args &&... args)
        {
            wrapper_<Event>().trigger(std::forward<Args>(args)...);
        }

        template <typename Event>
        inline sink_type <Event> sink() noexcept
        {
            return wrapper_<Event>().sink();
        }

    private:
        template <typename Event>
        SignalWrapper <Event> &wrapper_()
        {
            static_assert(shiva::refl::has_reflectible_class_name_v<Event>, "Event should have reflectible class name");
            auto name = Event::class_name();
            if (!wrappers_.count(name)) {
                wrappers_.emplace(name, std::make_unique<SignalWrapper<Event>>());
            }
            return static_cast<SignalWrapper <Event> &>(*wrappers_[name]);
        }
        std::unordered_map<std::string, std::unique_ptr<BaseSignalWrapper>> wrappers_;
    };

So i can do something like this, and i'm happy
It's perfeclty work as expected, and i think it's ok if user from entt can inherit from your class for specifying special things !

Well, consider this:

class Dispatcher final {
    using event_family = Family<struct InternalDispatcherEventFamily>;

    template<typename Class, typename Event>
    using instance_type = typename SigH<void(const Event &)>::template instance_type<Class>;

    struct BaseSignalWrapper {
        // ...
    };

    template<typename Event>
    struct SignalWrapper final: BaseSignalWrapper {
        // ...
    };

    template<typename Event>
    SignalWrapper<Event> & wrapper();

public:
    template<typename Event>
    inline sink_type<Event> sink() ENTT_NOEXCEPT;

    template<typename Event, typename... Args>
    inline void trigger(Args &&... args);

    template<typename Event, typename... Args>
    inline void enqueue(Args &&... args);

    template<typename Event>
    inline void update();

    inline void update() const;

private:
    std::vector<std::unique_ptr<BaseSignalWrapper>> wrappers;
};

This is what the dispatcher looks like right now. With your changes:

  • You are hiding sink and trigger.
  • You can't use enqueue and update (both of them), because they don't have access to your wrappers_ data member.
  • wrappers data member occupies memory for nothing, because you aren't going to use it in any way.

It means that the whole API of the dispatcher is useless and so are its data. The one of the wrapper is almost useless as well, because all what you're using is trigger and sink (half of the member functions).
In other terms, you are writing something completely different and all what you need is part of the interface of the wrapper. On the other side, because you're hiding and not overriding, the final solution is quite error-prone and risky to use actually.

My two cents: don't go with the dispatcher offered by EnTT, you don't want it, you're just stuck on it because _it's there_. Write your own dispatcher and use it if you decided this is the way you want to work around the limitation imposed by static stuff. No class in EnTT has a dependency on the dispatcher, so you won't incur in any problem.

That being said, pay attention to the fact that an std::unordered_map does allocate internally whether required. Your solution suffers from the same problem you already faced a while ago, that is the risk of allocations/deallocations across boundaries. This is even more subtle and quite spurious.
As I said at the time you mentioned it, approaches to work around the problem exist and are more or less the same posted in the previous comments. :smile:

i decided to drop whole plugin's thing since it's a pain to use in c++, i have other problem too, like running the engine in webassembly, with dll seem's to be complicated i will stay with Scripting and Header only Systems. C++ 17, and still horrible to make plugins in this language, meanwhile c# and java perfectly do the job on this side, and it's not painfull at all

Yeah, pre-compiled shared libraries don't fit well if you don't have the sources actually.
Well, let me know what you want to do for this issue and what are your progresses, if any.
If you think you won't follow this way anymore, I'll close the issue in a few days otherwise.

Hi guys, I've been following this thread and believe I've run into the same problem. I couldn't quite follow the whole reasoning, but forgetting about performance, if I remove the inline and static keywords from identifier in family.hpp:
template class family {
inline static std::atomic identifier;

would this fix things? albeit with a performance hit?

@digitalmonkey The solution to using EnTT across shared library boundaries is being discussed in #186. The C++ standard basically says that the linker should select one inline variable and share it. MacOS and Linux do this so the registry can be used normally across shared libraries on MacOS and Linux but not Windows. On Windows, we have to use a different strategy (discussed in #186). We have to use the names of the components to give them a unique identifier. There's a macro to help with this so to declare a component that can be shared across boundaries you can do this:

ENTT_SHARED_STRUCT(my_component) {
  // body of struct
}

There's an implementation of this on the experimental branch but it's still a work-in-progress.

@Kerndog73 maybe @digitalmonkey isn't all the ways wrong. If families were data members of a registry, what he's saying could work. You cannot generate the same identifiers from other classes without a registry, but I'm not doing it anyway.

Because I've just waken up and I'm pretty sure I'm missing something, feel free to _pwn_ me with the point I'm missing. :smile:

@skypjack I鈥檝e thought about that ever since I first saw the family class. It鈥檚 not possible. It would be awesome if it were possible but it just isn鈥檛. Below is my attempt at proving it is impossible 馃槃.

You would need a member function template of a class that returns the identifier of a type. The ID returned by calling this member function template with type T depends on whether the function has been called before. On the first call, the function will increment and return a counter. On subsequent calls, the function will not increment the counter. The function will instead return what it previously returned.

The behavior of the function depends on whether it has been called before. The function also needs to know what it previously returned. You could store this information in some kind of data structure but how do you access the data? You need an identifier for the type to know which element in a data structure to access to generate an identifier for a type. 馃

A static variable in a function stores the information along with the function itself. So each instantiation of the function has an associated identifier and boolean. The function checks the boolean to determine whether to increment the counter or return the stored identifier.

int counter = 0;

template <typename T>
int identifier() {
  static const id = counter++;
  return id;
}

// instantiation with T = double
int identifier_double_id;
bool identifier_double_id_init = false;
int identifier_double() {
  if (!identifier_double_id_init) { // performed atomically
    identifier_double_id_init = true;
    identifier_double_id = counter++;
  }
  return identifier_double_id;
}

The function knows where to find the data because the address is fixed. If the address was not fixed, the function would need a type identifier to calculate the address of the data. You end up needing type identifiers to generate type identifiers. I wish this was possible but I just don't see a way.

This fails across shared library boundaries on Windows because of the counter. The DLL and EXE do not share a counter. There are two separate counters and two separate instantiations for each type so it just falls apart.

Thanks for the information so far. What I find puzzling is that you state the problem
Is mainly with Windows. However, we are seeing this problem a non-Windows as well - I鈥檓 coding on one of those new Magic Leap One ar devices, using the clang compiler. The wiki has visibility directives we can set for Gcc compilers. Does this not work on clang? I can confirm when I try access components in my registry from within my shared library, it works fine. Calling the same code from outside returns zero components, even though the registry pointer shows the same address in both places. We also tried avoid exposing the entt header to the main code, using forward declarations but it didn鈥檛 seem to help. I tried the visibility directives in our main code and shared library headers but also no change. Wouldn鈥檛 he solutions discussed in thread #186 fix things? They seem mainly to solve the issue on Windows.

@digitalmonkey

What I find puzzling is that you state the problem is mainly with Windows.

I apologize. My fault. In fact, the problem can easily be reproduced also on the other systems and with the other major compilers (eg gcc/clang) by setting the default visibility for symbols to hidden.
I confirm that #186 will solve the problem all the ways, as long as users declare properly components (eg using ENTT_SHARED_STRUCT).

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Milerius picture Milerius  路  5Comments

skypjack picture skypjack  路  6Comments

Deins picture Deins  路  6Comments

dBagrat picture dBagrat  路  4Comments

bjadamson picture bjadamson  路  4Comments