Go: cmd/go: permit marking a module as private in go.mod

Created on 30 Aug 2019  ·  22 Comments  ·  Source: golang/go

The new proxy system requires environment variables to be set up on each individual machine for private modules. This means that knowledge about the build is not kept in the repo, which is not ideal.

I believe a good feature would be the ability to mark a module as private in the go.mod rather than depending on each machine setting its environment variables correctly.

_Originally posted by @donatj in https://github.com/golang/go/issues/33980#issuecomment-526675684_

FeatureRequest NeedsInvestigation modules

All 22 comments

Definitely agree. Telling everybody at your company to "set the GOPRIVATE=..." seems impractical, and having this information lying in the go.mod file, and thus checked in, would be a more straightforward solution.

That would make it possible to pull a repo from a git server and build it without configuring your environment.

Although I wonder how to solve the syntax-problem.
The modules-syntax seems to be fairly simple and I cannot think of an intuitive way to mark modules as private.
For example:

module myprivate.git/module

require (
    myprivate.git/dependency // private
    myprivate.git/another // private indirect
)

Feels clunky (and embedding machine-information in comments is not a nice solution imho)
Also, could you set the main module to private, too (is there a need for it)? If it is coming from a private repo, should it be noted as such?
e.g. module myprivate.git/module // private?

It feels like we got rid of GOPATH only to replace it by a convoluted series of new GOPATHs (i.e. GOPRIVATE, GOPROXY, GONOPROXY, GOSUMDB).

CC @bcmills @jayconrod

(I could've sworn I'd seen this issue before, but I can't find a duplicate right now...)

CC @hyangah @marwan-at-work @thepudds @rogpeppe

The major problem with indicating private repos in the go.mod file is that “private” is relative to the proxy.

For example: if your company is running its own (authenticated) corp module proxy, then it is entirely reasonable for the go command to be able to fetch your private-to-corp modules from that proxy when it is in use. If you switch to some other proxy — regardless of whether it is proxy.golang.org or something else — then those modules become “private” (because the proxy doesn't have access to them).

So “private” is fundamentally relative to the proxy, not relative to the module dependency.

I remember the issue very well, but I can't seem to find it either :/

However, I do sympathize with the issue here, because you can't force all of your team members to set up the correct configuration and environment variables on their machines.

And since "private" is relative to the proxy, maybe it's worth considering adding the GOPROXY variable to the go.mod file itself (as well as the GONOSUMDB and friends)

Or maybe just be able to set any of the go env options in the go.mod file itself. However, a non-empty value of the corresponding environment variable could take precedence.

This way, you don't need to set a "private directive" in the go.mod file, but still resolve the issue mentioned above.

That said, one can simply have a .env file in their repo and indicate in the README that those environment variables must be set before running a go command, so I'm also happy to not add that kind of complexity to go.mod.
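A minimal sketch of the precedence idea described above: a non-empty environment variable wins, then a value checked in with the repository, then a built-in default. This is not how cmd/go behaves today. The function name resolveSetting and the map of repository settings are hypothetical, introduced only for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// resolveSetting returns the effective value of a setting such as GOPROXY.
// Assumed precedence (taken from the comment above, not from cmd/go):
//  1. a non-empty environment variable,
//  2. a value checked in with the repository (hypothetical),
//  3. the built-in default.
func resolveSetting(name string, getenv func(string) string, repo map[string]string, builtin string) string {
	if v := getenv(name); v != "" {
		return v
	}
	if v := repo[name]; v != "" {
		return v
	}
	return builtin
}

func main() {
	// Hypothetical settings that a go.mod-like file might carry.
	repo := map[string]string{"GOPRIVATE": "git.corp.example.com"}

	fmt.Println(resolveSetting("GOPRIVATE", os.Getenv, repo, ""))
	fmt.Println(resolveSetting("GOPROXY", os.Getenv, repo, "https://proxy.golang.org,direct"))
}
```

Passing the lookup function as a parameter keeps the precedence logic separate from the real process environment, which also makes it easy to exercise with a fake environment.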

(I could've sworn I'd seen this issue before, but I can't find a duplicate right now...)

I think it was discussed as part of the larger #25530 discussion, such as in https://github.com/golang/go/issues/25530#issuecomment-470761740 and some related comments there. Some of that discussion, I think, then triggered additions to the proposal document, such as:

One possibility to further reduce exposure of private module path text is to provide additional ways to set $GONOSUMDB, although it is not clear what those should be. A top-level module's source code repository is an attractive place to want to store configuration such as $GONOSUMDB and $GOPROXY, but then that configuration changes depending on which version of the repo is checked out, which would cause interesting behavior when testing old versions, whether by hand or using tools like git bisect.

I don't recall a specific issue aside from #25530.

So “private” is fundamentally relative to the proxy, not relative to the module dependency.

That is true.
However, from your statement, it seems like running your own proxy will be "expected"?

For example, at a company with ~50 people, we have our own (private) Gitlab instance running. However, we are only ~5 devs actually using Go.
Setting up and maintaining a proxy for 5 people seems like overkill.
(Obviously, coordinating 5 people to set the same environment variable is doable, but is it nice?)

Normally, public projects will not depend on private repos.
This means that private projects will most likely live in the same environment as their dependencies.

That said, setting GOPROXY in the modules file would be needed for this to work.
Having a private proxy indicated in the go.mod file would mean no env-config would be needed, and "private" repos would not need to be declared.
Having no proxy (or a public one) set in the go.mod file would mean one could use "private" within the go.mod file to indicate that go should talk directly to the specified git server for these repos.

@tommyknows: I think it's expected that some organizations will want to, but certainly not everybody. However, keep in mind that (as touched on in https://github.com/golang/go/issues/25530#issuecomment-470761740) the go.mod file is immutable in your module's version control. If your company grows to the point where it does make sense for you to run your own proxy, you probably won't want it to be bypassed if someone happens to depend on an old module version from before the proxy was set up.

@tommyknows, if you configure a proxy in your go.mod file, and I write some other module that depends on yours, and we have annotated private modules separately, what should happen?

Moreover: suppose that I have five developers but fifty repos: now adding the environment variable is actually _less_ work overall than annotating all of the go.mod files individually.

At the other extreme, suppose that I have fifty developers and only one repo: now I don't need to annotate dependencies at all, because there's only one module containing private things and it doesn't depend on any other private dependencies.

The cost/benefit tradeoff you're describing is only favorable to the go.mod annotation in between those two extremes. Do we have solid evidence that that's where most developers fall? (Absent evidence, we should arguably bias toward not adding new annotations.)

You are right, definitely agree with that point.

What I don't like is that one needs to set an environment variable to not expose "private" information.

imho, the "default" value of GOPROXY (variable is not set / empty) should be to not use a proxy at all. Opt-In instead of Opt-out.

A goal could be to use information in go.mod (or another file checked into VCS) to mitigate the impact of "whoops, I didn't configure any env vars myself", or perhaps even "whoops, I misconfigured my env vars". While I'm not sure it would be possible to _eliminate_ the impact of those mistakes in _all_ cases, it might be worth thinking through what mistakes could be caught in what workflows.

The major problem with indicating private repos in the go.mod file is that “private” is relative to the proxy.

I think the real goal would be to avoid accidentally interacting with _public_ proxies and the public sum.golang.org. The most important public proxy to help people avoid accidentally hitting after a mistake would be the one that is used in the absence of any configuration -- proxy.golang.org. In other words, one class of mistake is forgetting to make _any_ configuration, and in that case, the public infrastructure would be in use (and not any private proxy, given that using a private proxy requires _some_ configuration). So maybe that could simplify the problem some?

And since "private" is relative to the proxy, maybe it's worth considering adding the GOPROXY variable to the go.mod file itself (as well as the GONOSUMDB and friends)

Adding the GOPROXY variable (or equivalent content) to go.mod might be problematic, but also maybe not needed if the focus is on avoiding hitting _public_ infrastructure in the face of a mistake.

You could imagine a go.mod containing information along the lines of:

no-public-infra-for-modules-matching-patterns=*.secret.git.com,*.something.else.com

(with a deliberately bad name chosen to emphasize I am not suggesting an actual syntax or actual name).

That could help some mistakes, for example if someone does a git clone of a repo that has a go.mod with that info.

However, it's not immediately obvious how go.mod-based info would help if someone does something like the following (without having first properly configured privacy env vars):

$ cd $(mktemp -d)

$ go mod init newmodule

$ cat > main.go
package main
import _ "secret.git.com/double/secret/import/path"
func main() {}

$ go build

You could make a non-default / user-specified privacy pattern a requirement for every go.mod before cmd/go will do anything else, but (a) that might not be popular, and (b) that still doesn't mean people would set it properly.

In summary, I am not sure a bulletproof go.mod-based solution is possible... but maybe the problem could be recast to focus on eliminating certain classes of mistakes (while not preventing other classes of mistakes)? Not sure.

Continuing on the "things that might not be popular" brainstorming tack:

A heavier-handed approach could be to refuse to access the network until the user has taken some specific step regarding privacy settings (either in the current environmental settings, or based on go.mod).

For example, if you are starting a new module (and hence no go.mod-based privacy settings) and have no privacy settings in your user's environment, it could be an error to try to access the network, with some message along the lines of:

The go command cannot access the network until you have set a privacy pattern, 
or indicated you are opting out of privacy patterns. For example, you could:

 1. Set a privacy pattern. See 'go help privacy-patterns'.

 2. Opt out of privacy patterns for this module by running
    'go mod edit -no-privacy-patterns-for-me'.

 3. Opt out of privacy patterns for this user by running
    'go env -w no-privacy-patterns-for-me=true'.

See 'go help some-topic' for details and additional options.

Or maybe it does not key off of privacy patterns... but the general comment is that it could force the user to take _some_ action to eliminate the "whoops, I forgot to configure anything" type of error.
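The gate described above could be sketched roughly as follows. Everything here is hypothetical: the privacyState type, its fields, and checkNetworkAllowed are illustration-only names, and the go command has no such check.

```go
package main

import (
	"errors"
	"fmt"
)

// privacyState collects the three inputs from the brainstormed proposal.
// All fields are hypothetical; none corresponds to an existing cmd/go setting,
// except that EnvPattern could plausibly be fed from GOPRIVATE.
type privacyState struct {
	EnvPattern string // privacy pattern from the user's environment
	ModPattern string // privacy pattern recorded in go.mod (hypothetical)
	OptedOut   bool   // user explicitly declined privacy patterns
}

var errNoPrivacyDecision = errors.New(
	"cannot access the network until a privacy pattern is set or explicitly declined")

// checkNetworkAllowed permits network access only after the user has made
// some explicit decision about privacy, in any of the three places.
func checkNetworkAllowed(s privacyState) error {
	if s.EnvPattern != "" || s.ModPattern != "" || s.OptedOut {
		return nil
	}
	return errNoPrivacyDecision
}

func main() {
	// A fresh module with no configuration anywhere is refused.
	fmt.Println(checkNetworkAllowed(privacyState{}))

	// Any explicit decision unblocks the network.
	fmt.Println(checkNetworkAllowed(privacyState{EnvPattern: "*.corp.example.com"}))
}
```

The point of the sketch is only that "no decision" is distinguishable from "decided not to use patterns", which is what removes the "whoops, I forgot to configure anything" case.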

I like the idea of disabling the default proxy value with a flag in go.mod, so that a proxy can only be used if explicitly set with an env var.

Or as a more flexible/complicated solution we can have tags on dependencies and only resolve such with appropriate proxies.

module myprivate.git/module

require (
    myprivate.git/dependency // acme
    myprivate.git/another // acme
)

GOPROXY=https://proxy.golang.org,acme+https://athens.acme.com,direct
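To make the tagged-proxy idea concrete, here is a rough sketch of how such a list might be parsed and matched against a dependency's tag. The tag+URL syntax is the commenter's suggestion, not something GOPROXY supports. The names proxyEntry, parseTaggedProxies, and proxiesFor are hypothetical, and the rule that keywords such as direct apply to every dependency is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// proxyEntry is one element of the hypothetical tagged GOPROXY list.
type proxyEntry struct {
	Tag string // empty for untagged entries
	URL string // proxy URL, or the keyword "direct" / "off"
}

// parseTaggedProxies splits a comma-separated list where an entry may be
// written as tag+scheme://host (syntax assumed from the comment above).
func parseTaggedProxies(list string) []proxyEntry {
	var entries []proxyEntry
	for _, item := range strings.Split(list, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		entry := proxyEntry{URL: item}
		if scheme, rest, ok := strings.Cut(item, "://"); ok {
			// A "+" in the scheme part separates the tag from the real scheme.
			if tag, realScheme, tagged := strings.Cut(scheme, "+"); tagged {
				entry = proxyEntry{Tag: tag, URL: realScheme + "://" + rest}
			}
		}
		entries = append(entries, entry)
	}
	return entries
}

// proxiesFor returns the entries usable for a dependency carrying depTag.
// Assumption: keywords ("direct", "off") apply regardless of tag.
func proxiesFor(entries []proxyEntry, depTag string) []string {
	var urls []string
	for _, e := range entries {
		isKeyword := e.URL == "direct" || e.URL == "off"
		if isKeyword || e.Tag == depTag {
			urls = append(urls, e.URL)
		}
	}
	return urls
}

func main() {
	entries := parseTaggedProxies("https://proxy.golang.org,acme+https://athens.acme.com,direct")
	fmt.Println("untagged deps:", proxiesFor(entries, ""))
	fmt.Println("acme deps:    ", proxiesFor(entries, "acme"))
}
```

Under these assumptions, a dependency tagged acme would never be requested from proxy.golang.org, and an untagged dependency would never be requested from the acme proxy.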

imho, the "default" value of GOPROXY (variable is not set / empty) should be to not use a proxy at all. Opt-In instead of Opt-out.

Couldn't agree more with this point.

IMHO the default shouldn't be "expose info to a third-party company/public proxy", even if that info is not much more than import paths.

In my opinion, someone who knows what they are doing would easily set an env var in order to get the benefits of the public proxy. On the contrary, someone who just clones the project and doesn't know about the proxies, or is just using a new computer and forgot to set the env vars, would automatically leak info.

Not sure about the solution. I agree with @thepudds & @bcmills that the go.mod solution might not suit all needs, and it feels complicated.

A couple of ideas I can think of:

A. GOPROXY=direct by default. Opt-in instead of opt-out, as @tommyknows said.
B. Pre-parse the go.mod file and ask for confirmation when it contains hosts that are not known public ones (like gitlab.mydomain.com, mydomain.com, etc.) and you don't have GONOPROXY and friends set.
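Idea B could be sketched along these lines. This is only an illustration: the wellKnownHosts list is made up and deliberately incomplete, unknownHosts is a hypothetical helper, and the go.mod scanning is a simplified line-based pass rather than a real go.mod parser.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// wellKnownHosts is an illustrative, deliberately incomplete allowlist.
var wellKnownHosts = map[string]bool{
	"github.com": true,
	"golang.org": true,
	"gopkg.in":   true,
}

// unknownHosts scans go.mod text and returns the hosts of required modules
// that are not in wellKnownHosts, in order of first appearance.
func unknownHosts(goMod string) []string {
	seen := map[string]bool{}
	var hosts []string
	inBlock := false
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if i := strings.Index(line, "//"); i >= 0 {
			line = strings.TrimSpace(line[:i]) // drop trailing comments
		}
		switch {
		case line == "require (":
			inBlock = true
			continue
		case line == ")":
			inBlock = false
			continue
		case strings.HasPrefix(line, "require "):
			line = strings.TrimSpace(strings.TrimPrefix(line, "require "))
		case !inBlock:
			continue // not a require line
		}
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		host, _, _ := strings.Cut(fields[0], "/")
		if !wellKnownHosts[host] && !seen[host] {
			seen[host] = true
			hosts = append(hosts, host)
		}
	}
	return hosts
}

func main() {
	goMod := `module myprivate.git/module

require (
	myprivate.git/dependency v1.0.0
	github.com/pkg/errors v0.9.1
)
`
	hosts := unknownHosts(goMod)
	unconfigured := os.Getenv("GOPRIVATE") == "" && os.Getenv("GONOPROXY") == ""
	if len(hosts) > 0 && unconfigured {
		fmt.Printf("possibly private hosts %v and no GOPRIVATE/GONOPROXY set; continue?\n", hosts)
	}
}
```

Any allowlist of "public" hosts is a heuristic, so a check like this could only prompt for confirmation, not decide on the user's behalf.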

Am I correct to assume this won’t be fixed until 1.14? I’m not sure I like the idea of having to skip 1.13 to avoid simple user error leaking information.

(Also, can someone please remember to update the App Engine docs for Cloud Build and regular App Engine deploys to avoid leaking private repos?)

@zaddok, Go 1.13 has been released, and this issue was filed well after the freeze window. The proposed changes will not happen in 1.13, but don't assume they will necessarily happen in 1.14 either.

Here's how:
in ~/.gitconfig

[url "ssh://[email protected]/"]
        insteadOf = https://mydomain.tld/

then

go env -w GOPROXY=direct       
go env -w GOSUMDB=off
go env -w GOPRIVATE=mydomain.tld

Just one go env -w GOPRIVATE=mydomain.tld is enough. The go command will then treat module paths prefixed with mydomain.tld as if GOPROXY=direct and GOSUMDB=off were set for them.

Just one go env -w GOPRIVATE=mydomain.tld is enough

The problem is that every developer with access to the code that has private dependencies needs to do this. If anyone forgets, the build will break on their machine. I'm personally also concerned that private module names will be sent to public infrastructure.

I'd also argue that, for those of us who have private repos on GitHub, marking the entire github domain as private isn't a great solution.

I'd also argue that, for those of us who have private repos on GitHub, marking the entire github domain as private isn't a great solution.

You can mark only your GitHub name or repo explicitly.

Like go env -w GOPRIVATE=github.com/donatj or go env -w GOPRIVATE=github.com/donatj/some_repo.

Details here: https://golang.org/cmd/go/#hdr-Module_configuration_for_non_public_modules. Pay attention to:

causes the go command to treat as private any module with a path prefix matching either pattern, including git.corp.example.com/xyzzy, rsc.io/private, and rsc.io/private/quux.
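A simplified sketch of the prefix matching the quoted documentation describes, using the example value GOPRIVATE=*.corp.example.com,rsc.io/private from that documentation. The function matchesPrivate is a hypothetical stand-in written for this illustration. The real matching lives inside the go command and may differ in details.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesPrivate reports whether modulePath has a prefix matching any glob in
// the comma-separated patterns list. A glob with N path elements is compared
// against the first N elements of the module path.
func matchesPrivate(patterns, modulePath string) bool {
	elems := strings.Split(modulePath, "/")
	for _, glob := range strings.Split(patterns, ",") {
		if glob == "" {
			continue
		}
		n := strings.Count(glob, "/") + 1
		if len(elems) < n {
			continue // module path is shorter than the pattern
		}
		prefix := strings.Join(elems[:n], "/")
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

func main() {
	patterns := "*.corp.example.com,rsc.io/private"
	for _, p := range []string{
		"git.corp.example.com/xyzzy",
		"rsc.io/private",
		"rsc.io/private/quux",
		"rsc.io/public",
	} {
		fmt.Println(p, matchesPrivate(patterns, p))
	}
}
```

The same logic is why a pattern such as github.com/donatj covers every repository under that account without marking all of github.com as private.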

The major problem with indicating private repos in the go.mod file is that “private” is relative to the proxy.

For example: if your company is running its own (authenticated) corp module proxy, then it is entirely reasonable for the go command to be able to fetch your private-to-corp modules from that proxy when it is in use. If you switch to some other proxy — regardless of whether it is proxy.golang.org or something else — then those modules become “private” (because the proxy doesn't have access to them).

So “private” is fundamentally relative to the proxy, not relative to the module dependency.

In that case, maybe there is a need for something like .gorc, just like npm's .npmrc, to specify all this?
