Athens: [Proposal] Support Multi Proxy Clients

Created on 8 May 2019 · 9 Comments · Source: gomods/athens

In Go 1.13, GOPROXY will become a comma-separated list with specific logic for how the go command moves from one proxy to the next.
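To make the fallback logic concrete, here is a minimal sketch of how a client could walk a GOPROXY-style list. It relies on the Go 1.13 behavior of moving to the next proxy only on a 404 or 410 response. The names firstServing and statusFunc and the athens.example host are hypothetical, and the "direct" and "off" keywords are left out.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// statusFunc stands in for an HTTP GET against a proxy and returns the
// response status code. It is a hypothetical hook so the sketch needs no network.
type statusFunc func(proxy, modulePath string) int

// firstServing walks a comma-separated proxy list and returns the first proxy
// that answers 200. It moves on only after a 404 or 410; any other status
// stops the walk with an error.
func firstServing(goproxy, modulePath string, get statusFunc) (string, error) {
	for _, proxy := range strings.Split(goproxy, ",") {
		proxy = strings.TrimSpace(proxy)
		if proxy == "" {
			continue
		}
		switch status := get(proxy, modulePath); status {
		case http.StatusOK:
			return proxy, nil
		case http.StatusNotFound, http.StatusGone:
			continue // try the next proxy in the list
		default:
			return "", fmt.Errorf("%s returned status %d", proxy, status)
		}
	}
	return "", fmt.Errorf("no proxy serves %s", modulePath)
}

func main() {
	// Assumption: the first proxy has nothing in storage and answers 404.
	get := func(proxy, modulePath string) int {
		if proxy == "https://athens.example" {
			return http.StatusNotFound
		}
		return http.StatusOK
	}
	proxy, err := firstServing("https://athens.example,https://proxy.golang.org", "github.com/pkg/errors", get)
	fmt.Println(proxy, err)
}
```

This is why a proxy that answers 404 on a storage miss can be chained in front of another one: the client simply continues down the list.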

Athens currently assumes it is the only proxy: it always runs go mod download for modules that are not found in storage.

But with this new change in 1.13, we should give users the option to potentially chain different Go Proxies not just based on Import Paths (GlobalEndpoint/FilterFile), but by whether the Backend Storage is missing a module or not (or by combining both).

Therefore, I propose we add a new configuration called DownloadMode, which will determine how to respond to a client when a module is not found in storage. Here is how it looks in the config:

# env override: ATHENS_DOWNLOAD_MODE
DownloadMode = "..."

DownloadMode should accept the following values:

  1. sync (default): download the module synchronously and return the results to the client.
  2. async: return 404, but asynchronously store the module in the storage backend.
  3. redirect: return a 301 redirect to the client, using a new config variable, DownloadRedirectURL, as the base URL (a separate variable avoids a conflict with GlobalEndpoint).
  4. async_redirect: same as option 3, but also asynchronously store the module in the storage backend.
  5. none: return 404 if a module is not found and do nothing.
  6. custom: provide a more granular configuration of what to do when a module is not found in storage.
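As a rough illustration of the list above, the sketch below maps each mode to the response a client would see on a storage miss. The MissAction type and the onMiss function are hypothetical names, not part of Athens, and the 200 for sync assumes the synchronous download succeeds. The custom mode is left out because it delegates to per-path rules.

```go
package main

import (
	"fmt"
	"net/http"
)

// DownloadMode values mirror the names proposed in this issue.
type DownloadMode string

const (
	ModeSync          DownloadMode = "sync"
	ModeAsync         DownloadMode = "async"
	ModeRedirect      DownloadMode = "redirect"
	ModeAsyncRedirect DownloadMode = "async_redirect"
	ModeNone          DownloadMode = "none"
)

// MissAction is a hypothetical summary of how to answer a storage miss.
type MissAction struct {
	Status     int  // HTTP status returned to the client
	Background bool // whether the module is stored asynchronously
}

// onMiss maps a mode to an action. An empty mode falls back to sync,
// the proposed default.
func onMiss(mode DownloadMode) (MissAction, error) {
	switch mode {
	case "", ModeSync:
		// Assumes the synchronous download succeeds.
		return MissAction{Status: http.StatusOK}, nil
	case ModeAsync:
		return MissAction{Status: http.StatusNotFound, Background: true}, nil
	case ModeRedirect:
		return MissAction{Status: http.StatusMovedPermanently}, nil
	case ModeAsyncRedirect:
		return MissAction{Status: http.StatusMovedPermanently, Background: true}, nil
	case ModeNone:
		return MissAction{Status: http.StatusNotFound}, nil
	}
	return MissAction{}, fmt.Errorf("unknown download mode %q", mode)
}

func main() {
	modes := []DownloadMode{ModeSync, ModeAsync, ModeRedirect, ModeAsyncRedirect, ModeNone}
	for _, m := range modes {
		a, _ := onMiss(m)
		fmt.Printf("%-15s status=%d background=%t\n", m, a.Status, a.Background)
	}
}
```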

For option 6, a user might want certain import paths to be "sync", others to be "redirect", and maybe some to flat out return 404 through "none". I don't believe we should make it as complex as introducing versions into the mix, but we can if there's enough demand for it.

To write all of this logic, we need a developer-friendly format. The two places Athens has right now are the config file and the Filter File. But as I was prototyping the code, neither file seemed to offer a good experience.

  1. The config file is potentially a good place. However, this means we force people to use a config file instead of env vars just because they want some custom DownloadMode behavior. We also can't offer env vars for custom DownloadMode fields because env vars are just strings, not complex structures.

  2. The Filter File is potentially a better place. However, the Filter File is not concerned with the backend storage. Furthermore, its flat layout does not allow for complex nesting of options.

Therefore, I'm thinking something like HCL could be quite declarative. Take a look at the following example:

downloadRedirectURL = "https://proxy.golang.org"

mode = "redirect"

download "github.com/company-private/*" {
    mode = "sync"
}

download "github.com/company-public/*" {
    mode = "async_redirect"
}

The above says that the default mode is redirect. If the requested import path starts with github.com/company-private/, the mode switches to sync, and if it starts with github.com/company-public/, the mode switches to async_redirect.
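The lookup described above could be sketched as follows. The Rule type and resolveMode function are hypothetical, the rules are written out by hand rather than parsed from HCL, and treating a trailing "*" as "any suffix" is an assumption about how patterns would match.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a hypothetical in-memory form of one download block.
type Rule struct {
	Pattern string
	Mode    string
}

// matches treats a trailing "*" as "any suffix" and otherwise requires an
// exact match. This matching rule is an assumption, not settled behavior.
func matches(pattern, importPath string) bool {
	if strings.HasSuffix(pattern, "*") {
		return strings.HasPrefix(importPath, strings.TrimSuffix(pattern, "*"))
	}
	return pattern == importPath
}

// resolveMode returns the mode of the first matching rule, or the
// top-level default when no rule matches.
func resolveMode(defaultMode string, rules []Rule, importPath string) string {
	for _, r := range rules {
		if matches(r.Pattern, importPath) {
			return r.Mode
		}
	}
	return defaultMode
}

func main() {
	rules := []Rule{
		{Pattern: "github.com/company-private/*", Mode: "sync"},
		{Pattern: "github.com/company-public/*", Mode: "async_redirect"},
	}
	paths := []string{
		"github.com/company-private/billing",
		"github.com/company-public/sdk",
		"github.com/pkg/errors",
	}
	for _, p := range paths {
		fmt.Println(p, "->", resolveMode("redirect", rules, p))
	}
}
```

First-match-wins keeps the rule order meaningful; a longest-prefix rule would be another reasonable choice.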

If we ever wanted to include "versions" in the download file, we could do something like:

download "github.com/company-public/*" {
    mode = "async_redirect"
    version "<1.3.2" {
        mode = "none"
    }
}

Which is pretty self-explanatory.
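For illustration, a version "<1.3.2" block could be evaluated along these lines. This sketch handles only the "<X.Y.Z" form shown above and plain major.minor.patch versions; pre-release tags and other operators are out of scope, and all function names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion reads "v1.2.3" or "1.2.3" into three numbers.
// Pre-release and build suffixes are not handled in this sketch.
func parseVersion(s string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("want major.minor.patch, got %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("bad version %q: %v", s, err)
		}
		out[i] = n
	}
	return out, nil
}

// lessThan reports whether version a sorts before version b,
// comparing major, then minor, then patch numerically.
func lessThan(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// satisfies supports only the "<X.Y.Z" constraint form used in the example.
func satisfies(constraint, version string) (bool, error) {
	if !strings.HasPrefix(constraint, "<") {
		return false, fmt.Errorf("unsupported constraint %q", constraint)
	}
	limit, err := parseVersion(strings.TrimPrefix(constraint, "<"))
	if err != nil {
		return false, err
	}
	v, err := parseVersion(version)
	if err != nil {
		return false, err
	}
	return lessThan(v, limit), nil
}

func main() {
	for _, v := range []string{"v1.3.1", "v1.3.2", "v1.10.0"} {
		ok, err := satisfies("<1.3.2", v)
		fmt.Println(v, ok, err)
	}
}
```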

This type of format would let us, for now, keep the FilterFile (a global config that doesn't even check the storage) separate from the "DownloadFile". However, I believe we could just use HCL for filtering as well, so that users don't have to learn another format:

globalRedirectURL = "https://cool.proxy"

filter "github.com/malicious/repo" {
  mode = "exclude"
}

filter "github.com/vulnerable/version" {
  mode = "direct"
  version "<1.3.2" {
    mode = "exclude"
  }
}

downloadRedirectURL = "https://proxy.golang.org"

mode = "redirect"

download "github.com/company-private/*" {
    mode = "sync"
}

download "github.com/company-public/*" {
    mode = "async_redirect"
}

I'd like to submit a PR for this soon, but I thought I'd open the issue up for discussion first.

proposal

All 9 comments

@marwan-at-work I really like this idea! My comments are all related to the custom config.

First, could we make DownloadRedirectURL configurable for each download block?

Second, the good developer experience that HCL provides is going to be negated by the confusion between whether to use the filter file or the new HCL file, because the functionality is too similar. I think HCL is a natural fit for what you're trying to do here, and we could expand your syntax to cover the filter config options too. So I suggest that we put everything into one HCL file and call that the v2 format. Here's how I think we should introduce it:

  • Freeze the current v1 format
  • Make Athens parse the file in the new v2 format if someone sets ATHENS_FILTER_FILE or FilterFile to a file with extension .hcl
  • Require that ATHENS_DOWNLOAD_MODE or DownloadMode is not set if the filter file is in the v2 format (since it is set in the HCL file as you showed)
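A minimal sketch of the last two bullets, assuming the format is chosen purely by file extension. The function names filterFormat and checkConfig are hypothetical and do not come from the Athens codebase.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// filterFormat picks the filter file format from the extension:
// ".hcl" means the proposed v2 format, anything else stays v1.
func filterFormat(filterFile string) string {
	if strings.EqualFold(filepath.Ext(filterFile), ".hcl") {
		return "v2"
	}
	return "v1"
}

// checkConfig rejects a DownloadMode setting when the filter file is v2,
// because the HCL file would already carry the mode.
func checkConfig(filterFile, downloadMode string) error {
	if filterFormat(filterFile) == "v2" && downloadMode != "" {
		return errors.New("DownloadMode must not be set when the filter file uses the v2 (HCL) format")
	}
	return nil
}

func main() {
	fmt.Println(filterFormat("filter.conf"), checkConfig("filter.conf", "sync"))
	fmt.Println(filterFormat("athens.hcl"), checkConfig("athens.hcl", "sync"))
}
```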

Thoughts?

@arschles

First, could we make DownloadRedirectURL configurable for each download block?

I was thinking about that too, and I think we should make that work, and in case of its absence, just default to whatever the global is. But we should validate that at least one redirect url exists up the chain.
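The fallback and validation described here might look like the sketch below. DownloadBlock and its RedirectURL field are hypothetical names for a per-block override; the check only requires a URL for blocks whose mode actually redirects.

```go
package main

import (
	"errors"
	"fmt"
)

// DownloadBlock is a hypothetical in-memory form of one download block
// with an optional per-block redirect URL.
type DownloadBlock struct {
	Pattern     string
	Mode        string
	RedirectURL string
}

// needsRedirect reports whether a mode sends the client elsewhere.
func needsRedirect(mode string) bool {
	return mode == "redirect" || mode == "async_redirect"
}

// effectiveRedirectURL prefers the block's own URL and falls back
// to the global one.
func effectiveRedirectURL(global string, b DownloadBlock) string {
	if b.RedirectURL != "" {
		return b.RedirectURL
	}
	return global
}

// validate ensures every redirecting mode can find a URL up the chain.
func validate(globalURL, defaultMode string, blocks []DownloadBlock) error {
	if needsRedirect(defaultMode) && globalURL == "" {
		return errors.New("default mode redirects but no global redirect URL is set")
	}
	for _, b := range blocks {
		if needsRedirect(b.Mode) && effectiveRedirectURL(globalURL, b) == "" {
			return fmt.Errorf("download %q redirects but no redirect URL is set", b.Pattern)
		}
	}
	return nil
}

func main() {
	blocks := []DownloadBlock{
		{Pattern: "github.com/company-public/*", Mode: "async_redirect"},
	}
	fmt.Println(validate("", "sync", blocks))
	fmt.Println(validate("https://proxy.golang.org", "sync", blocks))
}
```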

Second, the good developer experience that HCL provides is going to be negated by the confusion between whether to use the filter file or the new HCL file

Agreed. The good thing is that we don't have much documentation on the Filter File (it's a bad thing, but in this case it's a good thing lol). So yeah, we could potentially freeze/phase out the Filter File over two releases and only write documentation for V2.

As for what the config should look like, I have a couple of ideas that I'll put in better words and will paste them here soon :v:

What do we want to happen if Athens gets an authentication error when it tries to access the upstream module?

@amnonbc in the context of this proposal, we redirect the client when the module is missing from Athens storage, so we never check the status of "upstream" (VCS). The client, being the go command, will then fail the build when the next proxy returns a 403.

However, AFAIK we return a 500 when go mod download gets an authentication error. That's because go mod download does not return proper error codes. I have an issue open for that: https://github.com/golang/go/issues/30134

That said, feel free to open an issue if you think Athens should have a workaround until the above issue is resolved, such as parsing the error string so that we can return 403 instead of 500.

@marwan-at-work

But we should validate that at least one redirect url exists up the chain.

Sounds good!

we could potentially freeze/phase out the Filter File over two releases and only write documentation for V2

Agreed on the freeze. We can put a big banner at the top of https://docs.gomods.io/configuration/filter/ and redirect to a page with the v2 syntax.

I have a couple of ideas that I'll put in better words and will paste them here soon :v:

Looking forward to it!

Filter file docs for reference: https://docs.gomods.io/configuration/filter/

any update?

@gensmusic PR is ready to be merged, awaiting one more maintainer approval.

@gensmusic this is now ready to be tested in the canary build :v:
