Cache: Alternative storage backend

Created on 21 Jun 2020 · 10 Comments · Source: actions/cache

The current storage for the cache is GitHub, which does not work well with self-hosted runners or in some security-sensitive cases. Two issues arise here: security, and speed. For self-hosted runners there are always options to store the cache much faster (locally, blob storage) rather than uploading/downloading gigabytes of files to and from GitHub.

feature request

All 10 comments

This would help our on-prem self-hosted runners a lot. Maybe we can create a plugin-style system for different types of storage systems.

In my opinion, this certainly makes sense. It also has the advantage that you don't have to worry about GitHub's cache-size limits. It would not be very difficult to do, so I can work on this if necessary. What do you think? @dhadka Well, I understand that there are higher-priority issues at the moment, so I would like to complete them first.

@zen0wu There is a workaround made by @shonansurvivors that currently only works with S3 and S3-compatible services and systems.
https://github.com/shonansurvivors/actions-s3-cache

But I don't think that action will make things much faster than this action. Its only advantage over this action is probably that it avoids the cache-size limits.

@smorimoto Yeah, I was thinking of working on a generic "bring your own storage" Action during an upcoming company hackathon. My idea was to have plugins for the various storage providers (Azure, S3, MinIO, etc.) with some parameters to let you control the lifetime of the file. For example, if we had a time-to-live (TTL) parameter, we could essentially implement caching with a short TTL and the upload-artifact and download-artifact actions with a longer TTL. A scheduled job could then be set up to run daily or weekly to scan for and remove old / unused files. This would give users much more control over the content and also eliminate many of the current restrictions (size limits, sharing between branches and repos, etc.).
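
To make the TTL idea above concrete, here is a minimal Python sketch. It is not an existing API of actions/cache or of any storage provider: `StorageBackend`, `InMemoryBackend`, `put`, `get`, and `sweep` are hypothetical names, and the in-memory dictionary only stands in for a real provider such as Azure, S3, or MinIO. The `now` argument is an assumption added so the expiry logic can be exercised without waiting.

```python
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StoredObject:
    data: bytes
    expires_at: float  # Unix timestamp at which the object becomes stale


class StorageBackend(ABC):
    """Hypothetical plugin interface; each storage provider would implement it."""

    @abstractmethod
    def put(self, key: str, data: bytes, ttl_seconds: float,
            now: Optional[float] = None) -> None: ...

    @abstractmethod
    def get(self, key: str, now: Optional[float] = None) -> Optional[bytes]: ...

    @abstractmethod
    def sweep(self, now: Optional[float] = None) -> List[str]: ...


class InMemoryBackend(StorageBackend):
    """Stand-in backend (assumption): keeps everything in a dict."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, key, data, ttl_seconds, now=None):
        now = time.time() if now is None else now
        self._objects[key] = StoredObject(data, now + ttl_seconds)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        obj = self._objects.get(key)
        # Expired objects are treated as missing even before they are swept.
        if obj is None or obj.expires_at <= now:
            return None
        return obj.data

    def sweep(self, now=None):
        """Delete expired objects; this is what the scheduled job would call."""
        now = time.time() if now is None else now
        expired = sorted(k for k, o in self._objects.items() if o.expires_at <= now)
        for key in expired:
            del self._objects[key]
        return expired
```

Under this sketch, a cache entry would be a `put` with a short TTL (say, a few days) and an artifact a `put` with a longer one, while the daily or weekly scheduled job would simply call `sweep()`.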

@dhadka Oh! I think that's a pretty good idea. Everyone has probably thought about bringing the plugin concept to Actions, but no one has done it yet. It would be good for the community if a well-known Action like this one did it.

A similar concern was brought up before: https://github.com/actions/cache/issues/279

Would love to see either option done (local or a general repository-style API for setting your own cache backends) - storing cache on GitHub is convenient but the limitations can be a deal-breaker if you're trying to use the cache as a way to introduce persistent directories between action runs, particularly when there might be 7 days between cache pulls.

If we could use self-managed disks, that would be awesome! For our developers, we want the CI CD experience to be the same both in the cloud and self-hosted. Right now, caching would need our own solution on self-hosted.
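
As a rough illustration of what a "self-managed disk" cache could look like on a self-hosted runner, the following Python sketch archives a directory under a cache key and restores it later. It is only an assumption about one possible approach, not how actions/cache works: `save_cache`, `restore_cache`, and the `cache_root` directory layout are hypothetical, and there is no eviction, locking, or restore-key fallback.

```python
import hashlib
import tarfile
from pathlib import Path


def _archive_path(cache_root: Path, key: str) -> Path:
    # Hash the key so arbitrary key strings map to safe file names.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return cache_root / f"{digest}.tar.gz"


def save_cache(cache_root: Path, key: str, source_dir: Path) -> Path:
    """Archive source_dir under the given key on the runner's local disk."""
    cache_root.mkdir(parents=True, exist_ok=True)
    archive = _archive_path(cache_root, key)
    tmp = archive.with_name(archive.name + ".tmp")
    with tarfile.open(tmp, "w:gz") as tar:
        tar.add(source_dir, arcname=".")
    # Rename last so a reader never sees a half-written archive.
    tmp.replace(archive)
    return archive


def restore_cache(cache_root: Path, key: str, target_dir: Path) -> bool:
    """Extract the archive for key into target_dir; return False on a miss."""
    archive = _archive_path(cache_root, key)
    if not archive.exists():
        return False
    target_dir.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python provides it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target_dir, **kwargs)
    return True
```

A job would call `restore_cache` before the build and `save_cache` after it, with `cache_root` pointing at a disk that persists between runs on that runner.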

I would love the ability to use blob storage. I have a Rust repository and each matrix build has > 1GB of compiled files, so we can't complete a single build without causing cache eviction.

Would love to see this too. Video game development with Unity creates huge Library folders, even for small-scale games (ours is > 3 GB), and each platform needs a separate cache. I'll have to take a look at https://github.com/shonansurvivors/actions-s3-cache, because we basically can't use GitHub caching for now (I would like to cache LFS files as well to decrease bandwidth costs).
