Podman: [Feature Request] Support Podman on HashiCorp Nomad

Created on 20 Jun 2019 · 19 Comments · Source: containers/podman

/kind feature

Description
HashiCorp Nomad is an orchestrator that supports a variety of container runtimes via task driver plugins. Nomad currently supports Docker, rkt, QEMU, and Java task drivers.

Nomad 0.9 introduced a plugin framework that enables users to write task drivers to support any container runtime (e.g., Singularity, LXC). There has been significant interest in having a Podman task driver plugin for Nomad, especially given the prevalence of RHEL users.

Podman Feature Request filed on Nomad:

https://github.com/hashicorp/nomad/issues/5312

Overview + Task Driver Plugin Framework:

Examples of community-built Nomad task driver plugins:

kind/feature stale-issue

Source

yishan-lin

Most helpful comment

I published a systemd/varlink-based proof of concept at https://github.com/pascomnet/nomad-driver-podman. There is no release yet, but you can download the binary from the linked CircleCI build or compile it yourself. The feature set is very limited and it lacks tests so far, but it's a start.

towe75 on 7 Jul 2019

👍2

All 19 comments

@yishan-lin what is the expectation here on podman upstream?

baude on 21 Jun 2019

It would be hard to say without knowing the velocity of Podman upstream. Are breaking changes introduced often? How often are new features released in base Podman that would need to be brought into its plugin?

Conversely, in terms of the effort to maintain this plugin and keep it up to date with Nomad's upstream driver APIs, we see it as pretty minimal: we don't have any features planned for the immediate future that would result in changes to Nomad's upstream driver API.

yishan-lin on 24 Jun 2019

We try not to introduce breaking changes, but then again, I'm not sure which parts exactly you are referring to. I don't know enough about the plugins to say otherwise.

baude on 24 Jun 2019

Hi. I am playing with the Nomad plugin API right now,
though I am a bit unsure about the best approach with regard to the architecture.

My choices so far are:

nomad-plugin-podman links directly against libpod go api.
Advantages: everything is nicely encapsulated, no magic, full control and all features even if they are not exposed over varlink
Disadvantages: getting Go dependencies right is relatively hard because of some common libraries in the Nomad and libpod ecosystems, e.g. Nomad is pinned to an old version of ugorji/go, see https://github.com/hashicorp/nomad/pull/5676
Also we would depend directly on internal libpod api changes.

nomad-plugin-podman uses varlink and starts podman as sub-process.
Advantages: building should be straightforward, and the podman varlink API is sufficient. No systemd integration needed.
Disadvantages: process management, podman can crash and needs to be restarted, etc.

nomad-plugin-podman uses varlink on socket activated podman.
Advantages: process management is simple, setup is straightforward, and it can be better from a security perspective as well (no need to run the Nomad agent as root).
Disadvantage: more impact on the system setup.

So what are your opinions: what should the integration look like?

towe75 on 27 Jun 2019

👍1

Podman varlink bridge mode supports running podman varlink even if it is not configured, i.e., no socket activation is needed. Basically, podman varlink will be launched from the CLI and will then run for the length of the connection. This can be run in root or rootless mode.

rhatdan on 27 Jun 2019

@rhatdan, yes, I understood this already. That's what I meant by "nomad-plugin-podman uses varlink and starts podman as sub-process."

This approach would lead us to this process hierarchy:

   └── nomad-plugin-podman
               └── podman

So the plugin would control the lifecycle of a single podman (with varlink bridge mode) "slave".
Nomad's plugin API, in turn, also starts the plugin as a sub-process.

To re-ask: would this be your preferred architecture?

towe75 on 27 Jun 2019

I believe this is what we are doing with next generation of cockpit-podman

@haraldh @baude @jwhonce WDYT?

rhatdan on 27 Jun 2019

I wouldn't be terribly worried about Go dependency versions - a lot of them are up to date because of our recent go module migration, but previously they were on much older versions for the most part.


mheon on 27 Jun 2019

towe75 on 7 Jul 2019

👍2

I think using varlink would be the best.

jwhonce on 9 Jul 2019

Thank you for your opinions. Varlink seems to be a good fit so far. But I am sorry to say: it almost feels like having a daemon :-)

Sometimes I run into strange deadlocks when accessing a container immediately after creating and starting it (all done in the same varlink session).
I will try to put together a reproducible test so I can file a bug. I am pretty sure it happens when GetContainer is used, less often while inspecting a container, and almost never when using a simple PS. The deadlock can only be resolved by killing/restarting the systemd-managed podman; any other podman being used interactively also locks up in this situation.

towe75 on 9 Jul 2019

Would be very interested to look at that if you can get us a reproducer - deadlocks are high priority to fix

mheon on 9 Jul 2019

@yishan-lin @mheon @towe75 What is the latest on this issue?

rhatdan on 5 Aug 2019

We've tracked the mentioned deadlock into c/storage. I believe @baude is still debugging.

mheon on 6 Aug 2019

Well, coming back to the actual topic of this issue: as stated, I built a varlink-based prototype as a POC.

Recently I spent a few hours doing the same thing without varlink, linking libpod directly (using go 1.12, go.mod).
Although it works, the development experience was rather bad. I had to dig a lot through libpod's source code to learn how things fit together. The lack of a "clickable" godoc.org reference also felt strange. I understand that using podman as a library is not yet your first priority, so no offense here. Possibly a new facade layer with a simpler interface could improve the situation in a later version.

Overall, your varlink interface seems at the moment to be the better fit in terms of effort and maintenance.
I might invest a bit more time and try to spawn a varlink podman directly from the plugin instead of poking the systemd-managed socket, as mentioned above.

towe75 on 6 Aug 2019

This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.

github-actions[bot] on 3 Nov 2019

@mheon @baude What should we do with this one?

rhatdan on 3 Nov 2019

@rhatdan, becoming fully Nomad compatible is surely not your primary goal. People will find this issue/thread even if it's closed, and perhaps they will stumble upon my POC. I also plan to improve this plugin in my spare time, although I have not received much feedback yet. An interesting experiment would be to map Nomad groups to podman pods, for example.
To summarize: I would close this issue.

towe75 on 3 Nov 2019

👍1