/kind feature
Description
HashiCorp Nomad is an orchestrator that supports a variety of container runtimes via task driver plugins. Nomad currently supports Docker, rkt, QEMU, and Java task drivers.
Nomad 0.9 introduced a plugin framework that enables users to write task drivers to support any container runtime (e.g. Singularity, LXC). There has been significant interest in having a Podman task driver plugin for Nomad, especially given the prevalence of RHEL users.
Podman Feature Request filed on Nomad:
Overview + Task Driver Plugin Framework:
Examples of community-built Nomad task driver plugins:
@yishan-lin what is the expectation here on podman upstream?
It would be hard to say without knowing the velocity of Podman upstream. Are breaking changes introduced often? How often are new features released in base Podman that would need to be brought into its plugin?
Conversely, in terms of the effort to maintain this plugin and keep it up to date with Nomad's upstream driver APIs, we see it as pretty minimal: we don't have any features planned for the immediate future that would result in changes to Nomad's upstream driver API.
We try not to introduce breaking changes, but then again, I'm not sure which interfaces exactly you are referring to. I don't know enough about the plugins to say otherwise.
Hi. I am playing with the Nomad plugin API right now.
However, I am a bit unsure about the best approach with regard to the architecture.
My choices so far are:
1. nomad-plugin-podman links directly against the libpod Go API.
Advantages: everything is nicely encapsulated, no magic, full control, and all features are available even if they are not exposed over varlink.
Disadvantages: getting the Go dependencies right is relatively hard because the Nomad and libpod ecosystems share some common libraries, e.g. Nomad is pinned to an old version of ugorji/go, see https://github.com/hashicorp/nomad/pull/5676
We would also be directly exposed to internal libpod API changes.
2. nomad-plugin-podman uses varlink and starts podman as a subprocess.
Advantages: building should be straightforward, and the podman varlink API is sufficient. No systemd integration is needed.
Disadvantages: process management. Podman can crash and needs to be restarted, etc.
3. nomad-plugin-podman uses varlink on a socket-activated podman.
Advantages: process management is simple, setup is straightforward, and it can be better from a security perspective as well (no need to run the Nomad agent as root).
Disadvantage: more impact on the system setup.
So what are your opinions? What should the integration look like?
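For the two varlink options, the wire format is small enough to sketch. The varlink protocol exchanges JSON objects terminated by a single NUL byte over a stream socket. The following is a minimal sketch, not the plugin's implementation: the socket path and the helper names (`encode_call`, `read_reply`, `call`, `connect`) are assumptions made for illustration, and it handles only one reply per call.

```python
import json
import socket

# Assumption: a conventional podman varlink socket path; adjust for your system.
PODMAN_SOCKET = "/run/podman/io.podman"


def encode_call(method, parameters=None):
    """Serialize one varlink call: a JSON object followed by a NUL byte."""
    message = {"method": method, "parameters": parameters or {}}
    return json.dumps(message).encode("utf-8") + b"\0"


def read_reply(sock):
    """Read until the NUL terminator and decode the JSON message.

    Sketch only: anything received after the first NUL is discarded, so
    streaming ("more") replies are not supported.
    """
    buffer = bytearray()
    while b"\0" not in buffer:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed the connection before replying")
        buffer.extend(chunk)
    raw, _, _rest = bytes(buffer).partition(b"\0")
    return json.loads(raw)


def call(sock, method, parameters=None):
    """Send one call and return the reply parameters, raising on a varlink error."""
    sock.sendall(encode_call(method, parameters))
    reply = read_reply(sock)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply.get("parameters", {})


def connect(path=PODMAN_SOCKET):
    """Open a stream connection to a varlink service on a Unix socket."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return sock
```

With a service listening on that path, `call(connect(), "org.varlink.service.GetInfo")` would ask it to describe itself. The same code works whether the socket is provided by systemd socket activation or by a podman process the plugin started itself; only who owns the process differs.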
Podman varlink bridge mode supports running podman varlink even if it is not configured, i.e. no socket activation is needed. Basically, podman varlink is launched from the CLI and then runs for the length of the connection. This can be run in root or rootless mode.
@rhatdan, yes, I understood this already. That's what I meant by "nomad-plugin-podman uses varlink and starts podman as a subprocess."
This approach would lead us to this process hierarchy:
└── nomad-plugin-podman
    └── podman
So the plugin would control the lifecycle of a single podman "slave" (in varlink bridge mode).
Nomad's plugin API, in turn, also starts the plugin as a subprocess.
To re-ask: would this be your preferred architecture?
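The lifecycle handling implied by this hierarchy can be sketched as a small supervisor loop. This is a minimal sketch under stated assumptions, not the plugin's code: `supervise` and its parameters are hypothetical, and the `PODMAN_CMD` command line is illustrative and should be checked against `podman varlink --help` for the installed version.

```python
import subprocess
import time

# Assumption: illustrative command line for a plugin-owned podman varlink process.
PODMAN_CMD = ["podman", "varlink", "unix:/run/podman/io.podman"]


def supervise(cmd, max_restarts=3, backoff_seconds=1.0):
    """Run cmd as a child process and restart it when it exits with an error.

    Returns the list of exit codes observed, one entry per run.
    """
    exit_codes = []
    while True:
        child = subprocess.Popen(cmd)
        exit_codes.append(child.wait())
        if exit_codes[-1] == 0:
            break  # clean exit: nothing to recover from
        if len(exit_codes) > max_restarts:
            break  # give up after the initial run plus max_restarts retries
        time.sleep(backoff_seconds)  # simple fixed backoff before restarting
    return exit_codes
```

A real driver would run this loop in the background and also reconnect its varlink session after each restart, which is the process-management cost listed as a disadvantage of this option.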
I believe this is what we are doing with next generation of cockpit-podman
@haraldh @baude @jwhonce WDYT?
I wouldn't be terribly worried about Go dependency versions - a lot of them
are up to date because of our recent go module migration, but previously
they were on much older versions for the most part.
I published a systemd/varlink-based proof of concept at https://github.com/pascomnet/nomad-driver-podman. There is of course no release yet, but you can download the binary from the linked CircleCI build or just compile it yourself. The feature set is very limited, but it's a start. It also lacks tests so far.
I think using varlink would be the best.
Thank you for your opinions. Varlink seems to be a good fit so far. But I am sorry to say: it almost feels like having a daemon :-)
Sometimes I run into strange deadlocks when accessing a container immediately after creating and starting it (all done in the same varlink session).
I will try to put together a reproducible test so I can file a bug. I am pretty sure it happens when GetContainer is used, less often while inspecting a container, and almost never with a simple PS. The deadlock can only be resolved by killing and restarting the systemd-managed podman. Any other podman being used interactively also locks up in this situation.
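One way to structure such a reproducer is to run each step under a watchdog and report the first step that does not return. This is a minimal sketch, not the test that was eventually filed: `run_with_timeout` and `find_hang` are hypothetical helpers, and the steps are arbitrary callables that would wrap the real create, start, and GetContainer calls.

```python
import threading


def run_with_timeout(step, timeout_seconds):
    """Run step() in a daemon thread and return (finished, result).

    finished is False when step() has not returned in time, which is how a
    deadlocked call shows up. An exception inside step() counts as finished
    with a result of None.
    """
    outcome = {}

    def target():
        outcome["result"] = step()

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_seconds)
    if worker.is_alive():
        return False, None
    return True, outcome.get("result")


def find_hang(steps, timeout_seconds=5.0, iterations=100):
    """Repeat the named steps; return (iteration, name) of the first hang, or None."""
    for i in range(iterations):
        for name, step in steps:
            finished, _ = run_with_timeout(step, timeout_seconds)
            if not finished:
                return i, name
    return None
```

Because the worker is a daemon thread, a hung call does not keep the test process alive after the hang has been reported.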
Would be very interested to look at that if you can get us a reproducer - deadlocks are high priority to fix
@yishan-lin @mheon @towe75 What is the latest on this issue?
We've tracked the mentioned deadlock into c/storage. I believe @baude is still debugging.
Well, coming back to the actual topic of this issue: as stated, I built a varlink-based prototype as a POC.
Recently I spent a few hours doing the same thing without varlink, linking libpod directly (using Go 1.12 and go.mod).
Although it works, the development experience was rather bad. I had to dig through a lot of libpod's source code to learn how things fit together. The lack of a "clickable" godoc.org reference also felt strange. I understand that using podman as a library is not yet your first priority, so no offense intended. Possibly a new facade layer with a simpler interface could improve the situation in a later version.
Overall, your varlink interface seems at the moment to be the better fit in terms of effort and maintenance.
I might invest a bit more time and try to spawn a varlink podman directly from the plugin instead of poking the systemd-managed socket, as mentioned above.
This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.
@mheon @baude What should we do with this one?
@rhatdan, for sure it is not your primary goal to become fully Nomad compatible. People will find this issue/thread even if it's closed, and perhaps they will stumble upon my POC. I also plan to improve this plugin in my spare time, although I have not received much feedback yet. An interesting experiment would be to map Nomad groups to podman pods, for example.
To summarize: I would close this issue.
Overall, your varlink interface seems at the moment to be the better fit in terms of effort and maintenance.
This is rather ironic, and it was the same conclusion that I came to with podman-machine as well...