Mumble: Provide API for plugins

Created on 9 Aug 2019  路  35Comments  路  Source: mumble-voip/mumble

Continuation of #3711.

The idea is to remove the current API for positional audio plugins and build a new one which also supports basic client actions.

We should, in order:

  • [ ] Write functions for positional audio plugins and remove the old API (#3017).
  • [ ] Write functions for basic client actions (e.g. PTT, mute/unmute, deafen/undeafen, etc.).
  • [ ] Write example plugins.
client

Most helpful comment

Let me jump in as TFAR dev (https://github.com/michail-nikolaev/task-force-arma-3-radio)
and I can also partially speak for ACRE (https://github.com/IDI-Systems/acre2)

Me and ACRE team wanted to build a integrated mumble client together because it was too much work for either of us to do alone. We took a short look at it, found it quite complicated (as we basically had to hook into client code which there was no real API for) and didn't do anything really in the end.

Teamspeak's latest developments (especially licensing wise) and the TS5 announcement really spooked us, and we want to get away from Teamspeak as fast as possible.
We have about 450k users that we want to migrate to Mumble...

We do the positional audio ourselves so all we really need is teamspeak-like eventhandlers in the audio pipe that lets us access the already decoded audio stream (input and output).

Also something that's really needed is a way to communicate through mumble with other clients (like the teamspeak plugin message system) because we cannot reliably pipe all the data we need back through the game engine, that would have too high of a delay and adds another potential failure point.

What we additionally need:

  • Ability to see all users in current channel (name and id's)
  • Ability to get current user's id
  • Ability to move current user to different channel (via channelID or name)
  • Ability to see other channels (such that we can move the user to it)
  • Ability to force enable the current user's audio sending (PTT or voice activation override) (When he uses his radio ingame)
  • Eventhandler/Ability to check when another user in the channel starts/ends sending audio.

What would be nice to have:

  • Setting the current user's username (to give them same name as ingame, which we use to match game/plugin messages to a voice user)
  • Muting users (Don't need to transmit audio data for 200+ people in channel if the user cannot hear most of them)
  • Eventhandler for server connection status change (Just for state handling, cleanup)
  • Eventhandler for current channel change (Just for state handling)
  • Eventhandler for client join/leave on channel (Could be replaced by polling but that's just inefficient)
  • Ability to disable the current user's ability to send audio (Could alternatively be done via the audio pipeline by just muting the audio stream)
  • Ability to get the path to your own plugin file (Can do that manually via windows/linux api if needed)
  • Ability to install other files along with the plugin (We store radio audio files in teamspeak folder, can easily be integrated into the dll though)
  • Being able to call API functions from any thread, and even inside the audio pipeline (I know it's bad practice, but we do that quite a bit currently)

TFAR's Teamspeak API is here:
The Misc management stuff: https://github.com/michail-nikolaev/task-force-arma-3-radio/blob/master/ts/src/Teamspeak.cpp#L116
The Audio processing https://github.com/michail-nikolaev/task-force-arma-3-radio/blob/master/ts/src/plugin.cpp#L422

I already worked quite a bit on extracting much of the Teamspeak API into a middle layer that can easily replaced with mumble support.

Ofc there are many other things that plugins could/would make use of.
Though I'd say we are probably two of the most used Teamspeak Plugins out there.

All 35 comments

I think we should consider using an API (very) similar to the TeamSpeak plugin API: https://github.com/TeamSpeak-Systems/ts3client-pluginsdk/blob/master/src/plugin.h

This way it would be easier for plugin devs to maintain a version of their plugin for both systems. I think this would boost the availability of plugins for mumble. Plus a lot of thinking has been put into the TS API (I assume). So why reinvent the wheel?
My first idea even was to use the very same method names but then I discovered that they actually have "ts3plugin" as a name Prefix in their functions which we definitely don't want...
This approach might also be worth it for the positional audio system? I don't know the depths to which TS supplies API functionality for this nor to which depth mumble needs it. But if this was also possible i think it would create an overall consistent API.

Another thing I was thinking about was whether it might be worth it to differentiate between synchronous and asynchrous plugins. The difference would be that for synchronous plugins the callback code would be executed in the same thread as the application and thus delaying its further execution. The callbacks for asynchrous plugins on the other hand would be executed in a separate thread which would allow the application to continue normally in its own thread.

Yet another thing to think about is whether we only want client plugins or whether we also want to provide a server plugin API (off of my head I can't come up with neat examples of what this API might be useful for but I'm sure that use cases exist.)

Let me jump in as TFAR dev (https://github.com/michail-nikolaev/task-force-arma-3-radio)
and I can also partially speak for ACRE (https://github.com/IDI-Systems/acre2)

Me and ACRE team wanted to build a integrated mumble client together because it was too much work for either of us to do alone. We took a short look at it, found it quite complicated (as we basically had to hook into client code which there was no real API for) and didn't do anything really in the end.

Teamspeak's latest developments (especially licensing wise) and the TS5 announcement really spooked us, and we want to get away from Teamspeak as fast as possible.
We have about 450k users that we want to migrate to Mumble...

We do the positional audio ourselves so all we really need is teamspeak-like eventhandlers in the audio pipe that lets us access the already decoded audio stream (input and output).

Also something that's really needed is a way to communicate through mumble with other clients (like the teamspeak plugin message system) because we cannot reliably pipe all the data we need back through the game engine, that would have too high of a delay and adds another potential failure point.

What we additionally need:

  • Ability to see all users in current channel (name and id's)
  • Ability to get current user's id
  • Ability to move current user to different channel (via channelID or name)
  • Ability to see other channels (such that we can move the user to it)
  • Ability to force enable the current user's audio sending (PTT or voice activation override) (When he uses his radio ingame)
  • Eventhandler/Ability to check when another user in the channel starts/ends sending audio.

What would be nice to have:

  • Setting the current user's username (to give them same name as ingame, which we use to match game/plugin messages to a voice user)
  • Muting users (Don't need to transmit audio data for 200+ people in channel if the user cannot hear most of them)
  • Eventhandler for server connection status change (Just for state handling, cleanup)
  • Eventhandler for current channel change (Just for state handling)
  • Eventhandler for client join/leave on channel (Could be replaced by polling but that's just inefficient)
  • Ability to disable the current user's ability to send audio (Could alternatively be done via the audio pipeline by just muting the audio stream)
  • Ability to get the path to your own plugin file (Can do that manually via windows/linux api if needed)
  • Ability to install other files along with the plugin (We store radio audio files in teamspeak folder, can easily be integrated into the dll though)
  • Being able to call API functions from any thread, and even inside the audio pipeline (I know it's bad practice, but we do that quite a bit currently)

TFAR's Teamspeak API is here:
The Misc management stuff: https://github.com/michail-nikolaev/task-force-arma-3-radio/blob/master/ts/src/Teamspeak.cpp#L116
The Audio processing https://github.com/michail-nikolaev/task-force-arma-3-radio/blob/master/ts/src/plugin.cpp#L422

I already worked quite a bit on extracting much of the Teamspeak API into a middle layer that can easily replaced with mumble support.

Ofc there are many other things that plugins could/would make use of.
Though I'd say we are probably two of the most used Teamspeak Plugins out there.

Another thing that I believe should be part of the plugin system is some sort of installable file that can be opened via Mumble in order to install the contained plugin (and possible attachments) to the proper location.

There's another thing I was thinking about: Because of internal implementation details I think it is rather unlikely that we'll be able to provide the same API as TeamSpeak does (especially on which parameters are passed down through the functions).
Therefore I thought that it might be better to focus on providing the functionality of the TS API without trying to mimic it exactly because we'll probably have to deviate from the TS API (slightly) anyways. Thus it might be more confusing if you are working with an API that seems to be the same as the TS one but then behaves slightly different from it from times to times.
I could imagine this to be a big cause of hard-to-find bugs that might frustrate possible plugin devs.

Instead i was thinking that we could provide a "compatibility plugin" that tries to map the plugin calls from Mumble to TS _exactly_ which might require some overhead due to parameter (type) conversions.
Like this one could use the actual TS API (maybe it might be enough to actually use the unchanged TS plugin) when it doesn't matter that there might be some (slight) performance penaltes due to those conversions. If one however decides to use the Mumble API directly one could avoid these penalties altogether.

Plus: There shouldn't be the kind of confusion I mentioned in the beginning because one actually deals with two distinct APIs.

I'd appreciate some thoughts and comments on this one (thumbs up or down on this comment would be sufficient for a start although more explanatory comments will of course be more helpful :) )

Instead i was thinking that we could provide a "compatibility plugin" that tries to map the plugin calls from Mumble to TS exactly

I really think that's not necessary or too much effort. Porting a plugin to a different API shouldn't be much work.

ACRE founder here. As long as we have the ability to effect the original input PCM sound data per user, effect the post output channel expanded sound data per user (aka we have access to the individual streams for a stereo, quadraphonic, 5.1, 7.1, whatever output), and capture events like when people start and stop speaking then ACRE can pretty much get by with that.

We're not entirely concerned with other features like moving users or other things. Those tend to be niceties that at least in my experience with our player base not been widely used.

The list of things that @dedmen wrote is generally what we'd want ideally but we can get away with less.

Some sort of generic network transport layer would also be nice, but we've had plans for a while to move away from TS's network transport layer for generic data and on to something of our own, so its not the biggest deal.

If you'd like to discuss this more we have a slack for ACRE over at our project: https://github.com/IDI-Systems/acre2

To add on to @NouberNou as an ACRE developer: We have no wish for plugin API to be the same or compatible with TS. TS API has many shortcomings, especially with backwards compatibility which both ACRE and TFAR projects experienced.

My suggestion is to make it from the ground-up, look at TS API objectively, improves on it's cons, follow the pros. We have no problem with a different API if it's better!

Thanks for the further input. I guess I'll try my best at creating a custom API then.

In order for me to be able to improve on some cons of the TS API I'd ask you to comment on what the TS API does well and what its shortcomings are as I never actually used the TS API. Thus I'm kinda dependent on your input for this :)

On another note: Would you guys prefer to create a Discord/Slack for further discussions? Or is this GitHub issue fine?

If you have questions that are immediate maybe something other than a github ticket?

#mumble on freenode perhaps?

Problem with the IRC is that everyone has to be online at once...

EDIT: we're on the ACRE Slack for now. Any comments and thoughts brought in via GitHub (either comment on this issue or on the respective PR) are appreciated nonetheless though :)

How extensive should we make use of typedefs in the API?
Currently we might have an ID to be e.g. an unsigned integer but maybe in the future this might get changed to a signed one. If we'd use typedefs we could simply change that definition and all that has to be done is recompile the plugin.
In the other hand using typedefs extensively obfuscates the API a bit.

Currently I'd vote to use typedefs for all parameters that will be passed around between the API functions and which the user will probably not directly interact with. In this category I'd put IDs and similar things whereas something like a channel/client name should use default data types without any additional typedefs to ease interaction with them inside the plugin.

What do you think about this?

How extensive should we make use of typedefs in the API?

don't see any real reason not too. We certainly don't want typedefs for things like const char*, not for things that won't be changed.
I see it more as a readability tool, int might be userid,channelid,someid,whatever,stringlength,number of cookies.
but a userId type can only be a specific thing, even if it's just an int.
Makes the api easier and clearer.

Agreed. Typedefs make changes in the API also easier to work with. We as users do not have to be concerned with the underlying types.

@davidebeatrici In which way do you think we should add the existing API into the new one?
I thought of adding the tryLock, unlock and the fetch function in the same way they exist in the current API except that they wouldn't be part of a struct anymore (so they'll be normal, exported functions inside the library (plugin)).

Another way I was thinking about was to provide something like a "PositionalAudioAgent" which will be a struct containing the respective function pointers (much like it currently is). If a plugin wants to provide support for positional audio it can simply implement a function getPositionalAudioAgent that returns the respective struct which is then used as the current "plugins" are.

@Krzmbrzl The API you're working on has init() and shutdown(), there's no need for tryLock() and unlock().

I would simply add a callback function such as getPositionalData(), as a replacement for fetch().

We should probably add a function that returns the plugin type, determined by one or multiple flags, for example:

enum {
    TYPE_POSITIONAL = 1, // Provides positional data
    TYPE_AUDIO = 2       // Changes input or output audio
};

uint8_t getPluginType() {
    return TYPE_POSITIONAL | TYPE_AUDIO;
}

The API you're working on has init() and shutdown(), there's no need for tryLock() and unlock().

I was thinking that a plugin might provide some extra features besides positional information which you might want to use although you don't want it to provide the positional audio. In these cases it would be impractical to use the init and shutdown functions for initializing the positional audio, wouldn't it?

We should probably add a function that returns the plugin type, determined by one or multiple flags

Yeah I do like that - we could even add a possibility to request certain feature sets from the plugin via the UI (e.g. a checkbox for positional information, etc. And then a callback that tells the plugin which features are wished).

I was thinking that a plugin might provide some extra features besides positional information which you might want to use although you don't want it to provide the positional audio. In these cases it would be impractical to use the init and shutdown functions for initializing the positional audio, wouldn't it?

In that case it would be better to implement functions to get and set for the plugin's settings.
Ideally the settings should be stored in a structure defining the data type for the value and the description.

Yeah I do like that - we could even add a possibility to request certain feature sets from the plugin via the UI (e.g. a checkbox for positional information, etc. And then a callback that tells the plugin which features are wished).

Definitely.

In that case it would be better to implement functions to get and set for the plugin's settings.
Ideally the settings should be stored in a structure defining the data type for the value and the description.

Plugin settings would of course be nice, but I'd say that' something for future improvements. I don't quite see how this solves the problem of (not) needing the lock-methods... Did you have in mind that a plugin shall "lock" in the init and unlock if the positional setting is deactivated?
Whether that is possible/feasible seems to depend on what that "locking" actually is - Is this just searching for the game's executable and keeping a reference to it? Or is there some actual locking (e.g. multi-threading) involved?

It's a very simple lock mechanism, not for multi-threading: there's a loop that calls tryLock() for each plugin until true is returned:
https://github.com/mumble-voip/mumble/blob/5f744ed9710452364f7681437be8bf2ae190c41b/src/mumble/Plugins.cpp#L289-L339

The linked code actually doesn't contain tryLock() :P

there's a loop that calls tryLock() for each plugin until true is returned

So essentially every plugin is asked whether it is currently able to provide positional audio and the first one that answers with "yes" (via tryLock()) will be used. Is that correct?
If so I think it would still make sense to add such a function to the new API (though I would name it differently) as such an "answer" is not really what I intended the init() function to return. If the init returns anything but STATUS_OK, I'd unload the plugin assuming that something went wrong and it's no longer loadable.

The linked code actually doesn't contain tryLock() :P

The function is not guaranteed to be called tryLock(), it's defined by the plugin and then its pointer is passed to Mumble.

So essentially every plugin is asked whether it is currently able to provide positional audio and the first one that answers with "yes" (via tryLock()) will be used. Is that correct?

Correct.

If so I think it would still make sense to add such a function to the new API (though I would name it differently) as such an "answer" is not really what I intended the init() function to return. If the init returns anything but STATUS_OK, I'd unload the plugin assuming that something went wrong and it's no longer loadable.

Well... we could add a value such as STATUS_NOT_READY, indicating that the plugin cannot operate in that moment.

we could add a value such as STATUS_NOT_READY, indicating that the plugin cannot operate in that moment.

How will we know when the plugin is finally ready? The init() function is meant for the plugin to initialize e.g. its required memory and things like that. So we can't call init() again and again because we don't want it to allocate memory all the time. Plus we'd also create a memory leak as the plugin dev should be able to rely on cleaning the memory in shutdown(). So we'd have to call init(); shutdown(); init(); shutdown(); ... in order to prevent that...

I believe that it is much cleaner if we just add a function like initPositionalAudio() and shutdownPositionalAudio() as a replacement for tryLock() and unlock(). This init function can then return whether the plugin is ready for delivering positional audio or not. This would also allow to handle the positional audio without messing around with init and shutdown of the plugin as a whole.

I intend to make almost all functions in Plugin.h optional to implement. That way a dev doesn't care about implementing functions he/she doesn't need. Thus having those extra functions shouldn't really hurt anyone...

That's what I'd settle on if you don't have any objections @davidebeatrici
https://github.com/mumble-voip/mumble/blob/be7253f796cee948d2a1accaa6e61f10429cc339/src/mumble/pluginAPI/Plugin.h#L110-L154

How will we know when the plugin is finally ready? The init() function is meant for the plugin to initialize e.g. its required memory and things like that. So we can't call init() again and again because we don't want it to allocate memory all the time. Plus we'd also create a memory leak as the plugin dev should be able to rely on cleaning the memory in shutdown(). So we'd have to call init(); shutdown(); init(); shutdown(); ... in order to prevent that...

Shouldn't the plugin take care of freeing the memory in case it fails to initialize?

I believe that it is much cleaner if we just add a function like initPositionalAudio() and shutdownPositionalAudio() as a replacement for tryLock() and unlock(). This init function can then return whether the plugin is ready for delivering positional audio or not. This would also allow to handle the positional audio without messing around with init and shutdown of the plugin as a whole.

We could do that, yes.

I intend to make almost all functions in Plugin.h optional to implement. That way a dev doesn't care about implementing functions he/she doesn't need. Thus having those extra functions shouldn't really hurt anyone...

Indeed.

That's what I'd settle on if you don't have any objections @davidebeatrici

Seems excellent to me. I would change initPositionalData() so that it returns a uint8_t, that way the failure type (e.g. temporary when the process is not running, permanent when the memory offsets are outdated) can be specified and Mumble can act accordingly (e.g. disable the plugin in case of permanent failure).

I would also remove the third argument (programCount) and make programNames a NULL-terminated array.

programPIDs should be an array of integers, probably uint32_t or uint64_t.

Shouldn't the plugin take care of freeing the memory in case it fails to initialize?

Yes, absolutely. But if a plugin provides positional data _and_ some other feature, it didn't really _fail_ initializing just because it can't provide positional data at the moment.

I would change initPositionalData() so that it returns a uint8_t

Will do

I would also remove the third argument (programCount) and make programNames a NULL-terminated array.

I made a decision against NULL-terminated arrays in the plugin API (as it can cause ambiguity in some cases) and thus I'd stick to that in order to provide a uniform strategy on arrays in the API.

programPIDs should be an array of integers, probably uint32_t or uint64_t.

Obviously - didn't really think when writing that :man_facepalming:

Yes, absolutely. But if a plugin provides positional data and some other feature, it didn't really fail initializing just because it can't provide positional data at the moment.

Right.

I made a decision against NULL-terminated arrays in the plugin API (as it can cause ambiguity in some cases) and thus I'd stick to that in order to provide a uniform strategy on arrays in the API.

No problem.

@davidebeatrici do you have a good idea on how to set up unit-tests for the plugin-framework? Does Mumble even use unit-testing?

I think the most challenging part would be that the API requires a running version of Mumble that could fake connections, channels, users, etc. Do you have such a fake-Client available somewhere (used by e.g. other tests)?

We have simple unit tests in https://github.com/mumble-voip/mumble/tree/master/src/tests.

Unfortunately we don't have one that tests the client or server itself.

Is there a quick way of testing whether the existing positional data plugins still work with the new implementation? (preferably without having to launch the respective game - some sort of test-case maybe?)

Unfortunately not. However, we only changed our API, not the plugins themselves.

Testing a plugin that supports all functions (position, context and identity) is probably enough.

Do you think you could either go ahead and test one with the new implementation or find a few people that could?
My problem is that from the list of supported games I only own Arma 3 for which I assume there is no official plugin. Plus I don't know how the plugin sounded before so I won't be able to detect whether my changes affect the positional audio calculations :)

Before someone starts testing I'll need some more time verifying that at least from my point of view I can verify that the plugins work as intended. Don't want to make anyone spend time on this while there might be some technical issues like the functions not getting called in the first place xD

We use https://github.com/mumble-voip/mumble-pahelper to test plugins, but we have to implement the new API in it as well.

@davidebeatrici I am currently facing the implementation of a plugin moving a user into a password-protected channel. If the user has the correct access-token, then this of course is no problem. In the other case though, I'm not quite sure how I should handle this.

The respective API call has to provide a password of course and essentially I think I have 2 options with this:

  1. Add that password to the user's access-tokens and send that to the server, before asking the server to move the client
  2. Add temporary tokens to the UserState protobuf message and send the password as sort of a one-time token alongside the move-request

The problems I see with 1. are

  • The password is then permanently added to the user's tokens, which might not be desirable
  • In some cases it might happen, that the subsequent move-request reaches the server, before the tokens have been updated which will result in the move-request being rejected

Therefore I'd prefer option 2. Furthermore I think that those one-time-passwords (that are not added to the user's token-list permanently) might come into use in different contexts as well.

Before implementing this though, I'd like to hear your opinion on this :)

I agree, option 2 is definitely better.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ghost picture ghost  路  4Comments

xaro1 picture xaro1  路  3Comments

rkachach picture rkachach  路  3Comments

Jeselnik picture Jeselnik  路  4Comments

TerryGeng picture TerryGeng  路  5Comments