Podman: Remote snapshotter in podman

Created on 22 Dec 2019 · 24 Comments · Source: containers/podman

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description Remote snapshotter in podman

We are really interested in the possibility of using a remote snapshotter, like the one provided in containerd, in podman as well.

A remote snapshotter is a containerd plugin that receives the name of the layer containerd needs and either mounts the correct directory or returns an error.

This whole process is managed by user-level code (a plugin).

Is something like this even possible in podman?

siscia

👍1

Most helpful comment

I'm currently working on an "additional layer store" implementation based on https://github.com/containers/storage/pull/644#issuecomment-668509243, which allows the storage driver to use (possibly remotely mounted) exploded layers from that store without pulling them. This also enables the store to discover layers based on the annotations appended to layer blobs. I'll open draft PRs this week.

ktock on 24 Dec 2020

🚀1 👍1

All 24 comments

Care to contribute?

baude on 22 Dec 2019

Maybe I can find the time myself, or maybe we can find resources to have somebody work on it.
Is this something of interest for podman? If I had a PR ready, would it be merged?
I haven't really found the time to explore the codebase. Who is responsible for this particular part of it?

siscia on 22 Dec 2019

cc @ktock

AkihiroSuda on 22 Dec 2019

❤1

I somewhat suspect this would be part of the containers/storage library.

mheon on 22 Dec 2019

Thanks a lot for opening it! I'm keen to contribute to it.

I'm currently working on an implementation of a remote snapshotter plugin, which lets us mount layers without pulling their actual contents. Any filesystem can be plugged into it; we currently support CRFS's stargz-based filesystem and are discussing support for CernVM-FS and other filesystems.
https://github.com/ktock/remote-snapshotter

Remote snapshotter currently supports containerd's snapshotter API, but I think it's not hard to support the graphdriver API as well.

I'll look deeper into the codebase.

@siscia What do you think about this implementation strategy?

ktock on 23 Dec 2019

I opened a discussion at https://github.com/containers/storage/issues/498 .

ktock on 23 Dec 2019

Low-level bits are being worked on in https://github.com/giuseppe/crfs-plugin and fuse-overlayfs.

The rest of the implementation, as @mheon said, should go into containers/storage.

Other remote filesystems can be added in a similar way to crfs-plugin, which can be used with fuse-overlayfs to look up files from the image/lower layers.

giuseppe on 23 Dec 2019

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] on 23 Jan 2020

@giuseppe Any more progress on this?

rhatdan on 24 Jan 2020

I am still trying to figure out the best approach for this.

I was expecting the storage to be layer-digest based, with each layer indexed by its own hash.
Something like:

- storage
|- 123...
|- abc...
|- ....

Where 123... and abc... were the hashes of the layers themselves.

This does not seem to be the case.

Unfortunately, I still haven't understood which hash is used to index the layers.

This is quite a complication, though: a read-only remote snapshotter needs to know how podman looks up a layer.

As I understand it, this is managed by the containers/image codebase, which I am exploring now. However, it is a big codebase and it is taking time.

Any feedback or help is very much appreciated.

I may be missing something in my analysis as well!

siscia on 24 Jan 2020

@siscia Are you still working on this?

rhatdan on 17 Feb 2020

Honestly, we were planning to propose this as a GSoC project.

Progress on this front is a little slow at the moment since the program hasn't started yet.

As for this project, I am busy with the bureaucracy on our side (mostly sorted out) and with coming up with a good test for prospective students.

siscia on 17 Feb 2020

Recently I thought about the design and wrote a PoC for it. We might need changes in both containers/image and containers/storage, so I opened a thread (pull request) in each repo:

Higher-level part: https://github.com/containers/image/pull/956
Lower-level part: https://github.com/containers/storage/pull/644

Though they are still drafts, could I get comments on them? Both are written from the stargz perspective, so I'd be happy if the CVMFS people gave feedback on them.

ktock on 9 Jun 2020

There is a GSoC student working on adding CVMFS support to containers/storage.

@Mohitty could you take a look at the proposal?

@ktock We are planning on using the concept of "additional store" we have in containers/storage to emulate a remote snapshotter. Have you had a look at it?

giuseppe on 9 Jun 2020

Good to hear! Please let me know if there is anything I can help with, because I'm currently working on a remote snapshotter implementation (containerd/stargz-snapshotter) in the containerd community.

IIUC, the additional store functionality doesn't support layer discovery? I used the Driver.Create API instead because stargz uses container registries as the backing remote store, and it's hard to sync all layer metadata from these registries to nodes in advance. So the stargz snapshotter discovers the target layer from the registries and dynamically mounts it on each query for a layer digest. Please tell me if I'm missing something. BTW, can CVMFS sync all layer metadata from the backing remote store to nodes in advance?

ktock on 10 Jun 2020

Thanks @giuseppe
I'll take a look at it.

Mohitty on 10 Jun 2020

@giuseppe @siscia @Mohitty

Thanks for the comments. Based on https://github.com/containers/libpod/issues/4739#issuecomment-641205950, I rethought the design of this functionality to leverage the additional layer store. What I've done is add layer discovery functionality to the store, which should also be needed for the CVMFS integration. (https://github.com/containers/storage/pull/644 , https://github.com/containers/image/pull/956)

For some filesystems, including the stargz-based one, recognizing all available layers and storing an exhaustive list of *store.Layer in the additional store in advance is difficult. We need something like "layer discovery" functionality here, which allows clients (e.g. *storageImageSource.TryReusingBlob) to tell the store which layers they want, along with some additional information (e.g. layer digest, diffID, image reference, etc). This allows the store to discover the specified layers from remote stores and to add the corresponding *store.Layer information to the list in the additional layer store. Later calls to the store APIs can then recognize these layers.

For more details of the design, please see:

https://github.com/containers/storage/pull/644#issuecomment-645108184
https://github.com/containers/image/pull/956#issuecomment-645109589

ktock on 17 Jun 2020

@ktock @giuseppe Still working on this?

rhatdan on 10 Sep 2020

We are close to merging a PR that will allow us to create the correct filesystem structure to be used by containers/storage as additionalStorage.

It is CVMFS-specific work, though.

siscia on 14 Sep 2020

@siscia Do you need changes like those discussed in https://github.com/containers/storage/pull/644#issuecomment-668509243 ?

ktock on 14 Sep 2020

No, we don't. We just use the additionalStorage interface of containers/storage.

However, while technically we don't need it, layer discoverability would arguably be a nice feature to have.

At the moment all our layers are encoded in a huge JSON file, which is suboptimal; it still works, but...

Also, to be honest, it is not yet clear to me how we would use the interface you are proposing. Would we need Go code inside podman? Maybe we should discuss this in the other issue.

siscia on 14 Sep 2020

@siscia @ktock What is the latest on this issue?

rhatdan on 24 Dec 2020

ktock on 24 Dec 2020

🚀1 👍1

Opened PRs for enabling this. https://github.com/containers/podman/pull/8837, https://github.com/containers/storage/pull/795, https://github.com/containers/image/pull/1109.

From a filesystem implementer's perspective, the structure will be easier to implement than the current additional image store. This patch also enables layer discovery, so filesystems don't need to hold a large JSON blob listing all layers available in the remote store, which fits well with registry-based lazy pulling, e.g. stargz/zstd:chunked.

ktock on 26 Dec 2020

🎉1
