Go: proposal: cmd/go: automatic and partial vendoring in module mode

Created on 14 Feb 2019 · 77 Comments · Source: golang/go

This proposal overlaps with (and hopefully unifies) several existing issues, linked in the text below.

I'd like to implement it soon, in the 1.14 cycle (originally targeted at 1.13), so if you have feedback please do respond quickly. 🙂

Problem summary

Users want a durable, local view of their source code that works with existing diff tools and does not require per-user configuration in cloned repositories.

  • Relying on module proxies does not necessarily satisfy delivery contracts.
  • Saved module caches do not interoperate well with version-control and code-review tools.
  • -mod=vendor requires configuration per user (GOFLAGS) or per invocation, and makes it too easy to ship code that produces a different build in vendored mode than in the normal module mode.

Proposal

Under this proposal, the source code for the packages listed in vendor/modules.txt — and the go.mod files for the modules listed in vendor/modules.txt, if any — will be drawn from the vendor directory automatically (#27227).
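To make the role of vendor/modules.txt concrete, here is a minimal sketch of reading the file. It assumes the simple layout written by go mod vendor in Go 1.11 (a "# module version" header followed by one package path per line); the type and function names are hypothetical and this is not the go command's own parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// vendoredModule is one "# path version" stanza (hypothetical type).
type vendoredModule struct {
	Path     string
	Version  string
	Packages []string
}

// parseModulesTxt reads "# module version" headers and the package paths
// listed under them. Headers it does not understand are skipped, and any
// other "#..." annotation lines are ignored.
func parseModulesTxt(text string) []vendoredModule {
	var mods []vendoredModule
	cur := -1 // index of the module currently collecting packages
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "":
			continue
		case strings.HasPrefix(line, "# "):
			fields := strings.Fields(line[2:])
			if len(fields) < 2 {
				cur = -1 // e.g. a bare pattern line; not a module header
				continue
			}
			mods = append(mods, vendoredModule{Path: fields[0], Version: fields[1]})
			cur = len(mods) - 1
		case strings.HasPrefix(line, "#"):
			continue // other annotations are out of scope for this sketch
		case cur >= 0:
			mods[cur].Packages = append(mods[cur].Packages, line)
		}
	}
	return mods
}

func main() {
	const sample = `# example.com/a v1.2.0
example.com/a
example.com/a/sub
# example.com/b v0.3.1
example.com/b
`
	for _, m := range parseModulesTxt(sample) {
		fmt.Printf("%s@%s provides %d package(s)\n", m.Path, m.Version, len(m.Packages))
	}
}
```

The module paths in the sample are placeholders. Under the proposal, the packages listed this way are the ones the go command would load from vendor/ instead of the module cache.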

If a replace directive in the main module specifies a module path, the module source code will be vendored under the path that _provides_ the replacement, not the path being replaced. That preserves the 1:1 correspondence between import paths and filesystem directories, while allowing replacement targets to alias other modules (#26904). If a replace directive specifies a file path, then either that path must be outside the vendor directory or the vendor/modules.txt file must not exist (#29169).

Package patterns such as all and example.com/... will match only the packages that are present in the vendor directory, not unvendored packages from the same module. During the build, if additional packages from the vendored modules are needed in order to satisfy an import, the source for those packages will be fetched (from the module cache, if available) and added to the vendor directory. (Packages from outside the already-vendored modules will not be vendored automatically.)

Any time the go.mod file is written, if a module path found in vendor/modules.txt has a different version than that found in the build list, the already-vendored packages and go.mod file from the previous version will be deleted, and updated versions of those packages will be written in their place (#29058). Transitive imports of those packages will be resolved, and may populate additional packages in other already-vendored modules.

If go get removes a module from the build list entirely, its package source and go.mod file will be removed, but an entry for the module (with version none) will remain in vendor/modules.txt. That way, if a future operation (such as a go get or go build) adds the module to the build list again, it will remain vendored as before.
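The update rules in the two paragraphs above can be modelled as a comparison between the versions recorded in vendor/modules.txt and the current build list. The following is a rough sketch under that reading; the plan type and planSync function are hypothetical names, and the maps stand in for data the go command would compute itself.

```go
package main

import (
	"fmt"
	"sort"
)

// plan lists what a hypothetical sync step would do to the vendor directory.
type plan struct {
	Revendor  []string // vendored version differs from the build list
	Tombstone []string // gone from the build list; kept with version "none"
}

// planSync compares vendored module versions against the build list.
// Both maps go from module path to version.
func planSync(vendored, buildList map[string]string) plan {
	var p plan
	for path, have := range vendored {
		want, ok := buildList[path]
		switch {
		case !ok:
			if have != "none" {
				p.Tombstone = append(p.Tombstone, path)
			}
		case want != have:
			// Covers both version skew and a "none" entry whose module
			// has returned to the build list.
			p.Revendor = append(p.Revendor, path)
		}
	}
	sort.Strings(p.Revendor)
	sort.Strings(p.Tombstone)
	return p
}

func main() {
	vendored := map[string]string{
		"example.com/a": "v1.0.0",
		"example.com/b": "v1.1.0",
		"example.com/c": "v2.0.0",
		"example.com/d": "none",
	}
	buildList := map[string]string{
		"example.com/a": "v1.0.0",
		"example.com/b": "v1.2.0",
		"example.com/d": "v0.5.0",
	}
	p := planSync(vendored, buildList)
	fmt.Println("re-vendor:", p.Revendor)
	fmt.Println("mark as none:", p.Tombstone)
}
```

Modules that appear only in the build list are deliberately left out of the plan, which matches the statement that packages from outside the already-vendored modules are not vendored automatically.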

When go mod tidy is run, it will add or remove packages from the vendor directory so that it continues to contain only the subset of packages found in the transitive import graph. It will also remove go.mod files and entries in vendor/modules.txt for modules that are no longer present in the build list.

To encourage the _minimal_ use of vendor directories, the go mod vendor subcommand will accept an optional list of packages or modules. go mod vendor <module> will update the vendor directory to contain the go.mod file for <module> and source code for its packages that appear in the transitive import graph of the main module. (Note that, since the criterion for inclusion of a package is its existence in the import graph, vendoring in an additional module should not affect the contents of any previously-vendored modules.)

go mod vendor <pattern> for an arbitrary module pattern will add # <pattern> to vendor/modules.txt, and vendor in the go.mod files (and any packages found in the import graph) for modules matching <pattern>, adding individual comments to vendor/modules.txt for those modules.

Note in particular that go mod vendor all will copy in go.mod files for all of the module dependencies in the module graph (and add entries in vendor/modules.txt for those modules). That ensures that after go mod vendor all, go list can produce accurate results without making any further network requests (see also #19234 and #29772).

The go mod vendor subcommand will accept a new flag, -d. go mod vendor -d <pattern> will remove all previously-vendored modules matching <pattern> from the vendor directory (and from vendor/modules.txt), as well as any previously-stored patterns matching those modules (including <pattern> itself, if present).

go mod vendor, without further arguments, is equivalent to go mod vendor all. go mod vendor -d is equivalent to go mod vendor -d all. If go mod vendor -d causes vendor/modules.txt to become empty, it will also remove the entire vendor directory.
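The pattern bookkeeping described above can be sketched as a small state machine. This is only an illustration of the proposed semantics, not an implementation: vendorState, add, and remove are hypothetical names, and matchModule handles only "all", exact module paths, and a trailing "/..." wildcard, which is an assumption narrower than the go command's real pattern matching.

```go
package main

import (
	"fmt"
	"strings"
)

// matchModule reports whether a module path matches a pattern.
func matchModule(pattern, path string) bool {
	if pattern == "all" {
		return true
	}
	if strings.HasSuffix(pattern, "/...") {
		prefix := strings.TrimSuffix(pattern, "/...")
		return path == prefix || strings.HasPrefix(path, prefix+"/")
	}
	return pattern == path
}

func contains(list []string, s string) bool {
	for _, x := range list {
		if x == s {
			return true
		}
	}
	return false
}

// vendorState models the two kinds of entries the proposal keeps in
// vendor/modules.txt: stored patterns and individual modules.
type vendorState struct {
	Patterns []string
	Modules  []string
}

// add models "go mod vendor <pattern>": record the pattern and vendor
// every module in the build list that matches it.
func (s *vendorState) add(pattern string, buildList []string) {
	if !contains(s.Patterns, pattern) {
		s.Patterns = append(s.Patterns, pattern)
	}
	for _, m := range buildList {
		if matchModule(pattern, m) && !contains(s.Modules, m) {
			s.Modules = append(s.Modules, m)
		}
	}
}

// remove models "go mod vendor -d <pattern>": drop matching modules, the
// pattern itself, and any stored pattern that matches a removed module.
func (s *vendorState) remove(pattern string) {
	var removed, keptMods, keptPats []string
	for _, m := range s.Modules {
		if matchModule(pattern, m) {
			removed = append(removed, m)
		} else {
			keptMods = append(keptMods, m)
		}
	}
	for _, p := range s.Patterns {
		drop := p == pattern
		for _, m := range removed {
			if matchModule(p, m) {
				drop = true
			}
		}
		if !drop {
			keptPats = append(keptPats, p)
		}
	}
	s.Modules, s.Patterns = keptMods, keptPats
}

func main() {
	buildList := []string{"golang.org/x/net", "golang.org/x/text", "example.com/private/util"}
	s := &vendorState{}
	s.add("golang.org/x/...", buildList)
	s.add("example.com/private/util", buildList)
	s.remove("golang.org/x/net")
	fmt.Println("patterns:", s.Patterns)
	fmt.Println("modules:", s.Modules)
}
```

One consequence visible in this model: deleting a single module also drops any broader stored pattern that matched it, since otherwise that pattern would immediately vendor the module again. An empty state after remove corresponds to the case where the vendor directory itself is deleted.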


FrozenDueToAge Proposal early-in-cycle modules

All 77 comments

(CC @jayconrod @rasky @JeremyLoy @theckman)

To encourage the _minimal_ use of vendor directories, ...

Why are partial vendor folders something we want to encourage? Most use cases listed here and in the linked issues would require all dependencies vendored at all times.

Also, can you clarify if go get or go mod tidy would ever add new modules to the vendor folder, or if running go mod vendor would still be required after every new dependency is added in order to avoid a partial vendor folder?

Why are partial vendor folders something we want to encourage? Most use cases listed here and in the linked issues would require all dependencies vendored at all times.

Some dependencies are more robust than others. For example, you might trust github.com to be generally available, but want to vendor in dependencies that happen to be hosted using bzr or svn so that you don't have to install those tools on every machine that will build your module.

Also, can you clarify if go get or go mod tidy would ever add new modules to the vendor folder, or if running go mod vendor would still be required after every new dependency is added in order to avoid a partial vendor folder?

go get and go mod tidy would not add dependencies to vendor automatically.

We could perhaps make go mod vendor (without arguments) set some flag in modules.txt to indicate that all additional modules should be vendored.

More generally, though, the main goal of automatic vendor updates is to prevent version skew. Copying in newly-added modules does not further that goal, since there are no out-of-date contents in the first place.

I would suspect the most common use case might be vendoring 100% of dependencies?

If so, and if vendoring in 1.13 is going to be able to track updates via go get and go mod tidy in _some cases_, it would seem that once you have signaled you want automated tracking it likely should be the _default_ behavior at that point to be 100% complete in any automated tracking, rather than defaulting to partial tracking? (For example, track all updates after a go mod vendor with no args, as you suggested two comments back)?

It definitely makes sense to support partial ones. I just suspect (and might be wrong!) that 90% of users opting into vendoring really mean to vendor everything, and a reasonable chunk of that 90% would be surprised by it behaving otherwise.

In the presence of a reliable proxy, I can't think of any reasonable cases for a partial vendor directory, and lots of possible confusion.
I would personally argue we go in the other direction, as in if you try to build in vendor mode it is not allowed to see anything outside the current module (except the stdlib)

@ianthehat, one use-case for vendoring, given proxies, is to vendor in private code to which the proxies do not have access.

For example, a contract-based startup might want to vendor in their proprietary utility modules before delivering the code to their customers.

@bcmills Could you comment on the interplay with -mod=readonly, and/or options to disable automatic downloads for people who would prefer to fail if vendor is missing something?

@bcmills you can achieve the same effect by copying to an internal package and rewriting the import paths, which would be more honest, and also allow for local modifications (something else that those kinds of contractors often need to do as well). If you don't want to rewrite the import paths, you could check it in as a sub-module and use a replace directive (you probably have full control of the main go.mods for that kind of work)
Or you could add a directory with the zip and mod files and use it as a file proxy (which is something it might be worth looking into as a better version of vendoring)
I don't think making the normal use much worse for such extreme edge cases would be the right choice.

I like removing complexity, flags, and per-user (or per-project) configuration when using vendored mode. I think that automatically detecting the vendor folder (and assuming you are in vendored mode when one is present) is a great idea.

I sometimes mix vendored and non-vendored projects, and it would be great for switching between them to be as transparent as possible.

I agree, though, with some of the folks above: IMHO supporting partial vendoring would be confusing and would add complexity.

For example, in our use case, we use non-vendored mode for our main projects and add a GOPROXY for public libraries, but we don't want to cache our private libraries there (for security, and because the cache server and source server are on the same local network, so it adds no benefit for us). https://github.com/golang/go/issues/26334 would be enough for this.

Vendored mode, on the other hand, is great for distributing self-contained/small apps/tools.

@ianthehat

I would personally argue we go in the other direction, as in if you try to build in vendor mode it is not allowed to see anything outside the current module (except the stdlib)

Part of the point of this proposal is to avoid the need for a distinct “vendor mode”. Modules are integrated into the normal go workflow, and if we're serious about supporting vendoring, then I would argue that vendoring should be integrated too.

Or you could add a directory with the zip and mod files and use it as a file proxy (which is something it might be worth looking into as a better version of vendoring)

We've considered that, but it really doesn't work well with version control systems: the diffs are incomprehensible and the blobs can end up consuming a lot more space than they ought to (depending on the encoding).

Re partial vendoring: given module proxies, the major use-case for vendoring is for modules that are not available via the public proxies. (Recall that the word “vendor” _literally means_ “one who sells”.)

Module mode substantially reduces the need to duplicate code: you no longer have to copy all of your dependencies into your own repository, and you especially don't need to do that for stable, publicly-available, open-source dependencies. It is important to me that we make it easy to duplicate the _minimum_ amount of code necessary for each use-case: minimal duplication shouldn't be an “extreme edge [case]”, it should be the default mode of operation.

It's not realistic to expect folks to manually apply replace directives for partial vendoring, or to rewrite import paths. It's certainly possible, but it's extremely tedious (see #30241 and #27542). It isn't, and shouldn't be, a default mode of operation. If that were the only alternative to vendoring the full tree, folks wouldn't do it: instead, they'll fall back to duplicating all of the dependencies all over again.

The point of vendoring in module mode is not to provide an _alternative_ to using modules. It is to provide a _complementary_ feature set for the cases that modules cannot address well: namely, the distribution of proprietary code.

That said, let's think about that sticky-pattern problem. I don't buy the “full vendoring as a default” argument, but there is a more general case that really ought to work.

Suppose that I run go mod vendor golang.org/x/.... I should reasonably expect any further dependencies matching golang.org/x/... to be vendored.

If we support that, then we can view go mod vendor without arguments as equivalent to go mod vendor all, and that will provide sticky full-vendoring.

So how about this alternative. For a given module pattern,

  • go mod vendor <pattern>

    • adds # <pattern> to vendor/modules.txt, and

    • vendors in the go.mod files (and any packages found in the import graph) for modules matching <pattern>, adding individual comments to vendor/modules.txt for those modules.

  • go mod vendor -d <pattern> removes from vendor/modules.txt:

    • <pattern> itself, if present;

    • all modules matching <pattern>;

    • and finally, all further patterns that match the removed modules.

And then go mod vendor is defined to be equivalent to go mod vendor all.

@thepudds

Could you comment on the interplay with -mod=readonly, and/or options to disable automatic downloads for people who would prefer to fail if vendor is missing something?

Under this proposal, -mod=readonly would continue to disable updates to the go.mod file, but any imports already listed in vendor/modules.txt that are found during a go build would be copied into the vendor directory.

-mod=vendor would continue to exist, and would mean “do not resolve imports that are not found in either GOROOT or vendor”. However, since we would now vendor in go.mod files as well, go -mod=vendor would produce more accurate results from subcommands like list, mod why, and mod graph that examine the structure of the module graph.

So how about this alternative. For a given module pattern,

  • go mod vendor <pattern>

    • adds # <pattern> to vendor/modules.txt, and
    • vendors in the go.mod files (and any packages found in the import graph) for modules matching <pattern>, adding individual comments to vendor/modules.txt for those modules.
  • go mod vendor -d <pattern> removes from vendor/modules.txt:

    • <pattern> itself, if present;
    • all modules matching <pattern>;
    • and finally, all further patterns that match the removed modules.

And then go mod vendor is defined to be equivalent to go mod vendor all.

I think this works very well for me, thanks. I couldn't reason through all the cases you listed in your original post (I'll try to go through them over the weekend), but surely this command line API looks good and the sticky mode is really good.

Is there really a need to introduce a third metadata file (vendor/modules.txt), after go.mod and go.sum? Did you think of adding a vendor command to go.mod?

@bcmills In addition to the proposed new behavior described above, is the thinking that this would also land in 1.13:

  • #27348 "cmd/go: allow verifying vendored code"

If so, under the latest proposal, is this an example of what a module author could do if they want to fail if vendor is incomplete:

  1. in the author's own builds or in their CI, they could run with -mod=vendor to fail if vendor is incomplete
  2. for consumers, the author does not have control over what consumers do (and relying on a README stating "please set -mod=vendor" is not a desired solution). However, if the author runs go mod vendor (no args), that provides a complete vendor directory on an on-going basis based on the proposed automatic tracking behavior, and in addition the author could run go mod verify -vendor (or go mod vendor -verify or whatever incantation) to verify that vendor is both correct and complete? And if go mod verify -vendor is successful (say, prior to releasing a new version of a module), the author would have confidence that a consumer would never automatically download new code to populate vendor (even if the consumer is not running with -mod=vendor or -mod=readonly)?

Is there really a need to introduce a third metadata file (vendor/modules.txt), after go.mod and go.sum? Did you think of adding a vendor command to go.mod?

I hadn't really considered it: I think @rsc added vendor/modules.txt in 1.11, and given that it's already there I figured we could keep using it.

I suppose that we could record the patterns in go.mod instead, but I have a mild aesthetic preference for keeping them in modules.txt. I'm certainly open to arguments to the contrary, though. 🙂

Updated the proposal to incorporate sticky patterns (https://github.com/golang/go/issues/30240#issuecomment-464071411).

We've considered that, but it really doesn't work well with version control systems: the diffs are incomprehensible and the blobs can end up consuming a lot more space than they ought to (depending on the encoding).

If your use case is because the code cannot live in a public proxy, why do you care about the diff? You would not see the diff if it was in the public proxy. It's also trivial to fix: use a non-compressed text archive. This also fixes the space issue.

Part of the point of this proposal is to avoid the need for a distinct “vendor mode”. Modules are integrated into the normal go workflow, and if we're serious about supporting vendoring, then I would argue that vendoring should be integrated too.

I think we ought to start by enumerating the actual problems we are hoping to solve with vendoring, and checking it is the right solution to those problems. Vendoring comes with a lot of serious problems, it needs to be worth the cost.

If your use case is because the code cannot live in a public proxy, why do you care about the diff? You would not see the diff if it was in the public proxy.

For a start, if you're vendoring the code because it is proprietary, you want to be sure that you are shipping only what was actually promised to the customer.

(In contrast, if the module is already publicly available, you probably don't care which parts you're re-publishing in your vendor directory.)

It's also trivial to fix, use a non compressed text archive.

That is essentially what the vendor directory is: it just happens to be a text archive format that can also be consumed by pre-module versions of the go command.

I think we ought to start by enumerating the actual problems we are hoping to solve with vendoring, and checking it is the right solution to those problems.

https://golang.org/wiki/ExperienceReports#vendoring is a good place to start.

Specific use-cases I'm aware of are:

  • Shipping proprietary dependencies to customers under contract.
  • Providing reproducible builds for Go releases that predate module support.
  • Unpacking source code for temporary auditing (such as using grep -r) or ephemeral debugging (such as inserting println statements).

One interesting use-case that this proposal does not address is:

  • Preparing long-term or upstream patches for dependencies (#27542, #28176).

One use-case is sometimes mentioned, but arguably better served by a module proxy (or, equivalently, a saved module cache):

  • Providing hermetic builds for CI systems.

The latter use-case is not a significant factor in this proposal, although it may be addressed by this proposal incidentally.

  • Shipping proprietary dependencies to customers under contract.

This is the interesting one, and I still think vendoring is totally the wrong choice for this.
We redistribute modules using a proxy; this is just a custom proxy, so let's do it that way. Vendoring is a horribly confusing way, for the user, to distribute a module.

  • Providing reproducible builds for Go releases that predate module support.

I don't think this is an interesting case. You are talking about adding features to the go command; to use them you inherently must have a go command that supports modules, and that's our story for reproducible builds. I don't think it is unreasonable to say that if you want reproducible builds you should upgrade to modules.

  • Unpacking source code for temporary auditing (such as using grep -r) or ephemeral debugging (such as inserting println statements).

This is exactly what the replace directive is supposed to be for. If it does not work well, let's fix it, not suggest vendoring as an alternative.

Suppose that I run go mod vendor golang.org/x/.... I should reasonably expect any further dependencies matching golang.org/x/... to be vendored.

@bcmills To make sure I understand correctly, that would include any future packages that you obtain via go get ...? If so, I don't agree that you should expect it, because you're requiring people to do it manually in the first place. We're training them not to expect it.

Another vendoring use case:

  • Providing reproducible builds for Go software without the need for a highly available proxy, or automation to synchronize Module caches between CI systems and developer workstations.

If your use case is because the code cannot live in a public proxy, why do you care about the diff? You would not see the diff if it was in the public proxy.

For a start, if you're vendoring the code because it is proprietary, you want to be sure that you are shipping only what was actually promised to the customer.

Which is true for any module you ship, and you would have more confidence in this if you are using modules as the distribution mechanism because you know the version and checksum in the mod file are correct, which is much better than trying to diff the source.

(In contrast, if the module is already publicly available, you probably don't care which parts you're re-publishing in your vendor directory.)

It's also trivial to fix, use a non compressed text archive.

That is essentially what the vendor directory is: it just happens to be a text archive format that can also be consumed by pre-module versions of the go command.

Except it is not: the vendor directory is real code that you can modify, with no way to tell whether it has been modified, and it breaks assumptions about import paths. It causes many problems for tools, and much confusion for users.

A local module proxy, however, has none of these issues. It will not appear as code in your editor, and unlike vendor directories it would be totally safe to use the module proxy in all your dependencies as an alternative source of modules. You can also still control the version across multiple dependencies, unlike with vendoring.

@theckman, note that a filesystem tree is a valid module proxy.

That means that “without the need for a highly available proxy” is equivalent to “without the need for a reliable filesystem”, and if your CI system does not have a reliable filesystem you're probably not going to have a good time with go build anyway.

(But that's mostly irrelevant, because we plan to have highly available public proxies anyway: see https://blog.golang.org/modules2019.)
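A filesystem tree can serve as a module proxy because the GOPROXY protocol is a fixed set of file paths per module version. The sketch below lists those paths for one module, following the layout documented in go help goproxy; the helper names and the root directory are hypothetical, and nothing here reads or writes the filesystem.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// escapePath applies the proxy case-encoding: each uppercase letter is
// replaced by '!' followed by its lowercase form.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// proxyFiles returns the files a file-based proxy rooted at root would
// hold for module@version: the version list, then .info, .mod, and .zip.
func proxyFiles(root, module, version string) []string {
	dir := path.Join(root, escapePath(module), "@v")
	v := escapePath(version)
	return []string{
		path.Join(dir, "list"),
		path.Join(dir, v+".info"),
		path.Join(dir, v+".mod"),
		path.Join(dir, v+".zip"),
	}
}

func main() {
	for _, f := range proxyFiles("/srv/goproxy", "github.com/BurntSushi/toml", "v0.3.1") {
		fmt.Println(f)
	}
}
```

A tree laid out this way can be used by pointing GOPROXY at a file:// URL for the root. Each version is stored as its own .zip, which is the property the discussion contrasts with a vendor directory that holds a single unpacked copy.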

To make sure I understand correctly, that would include any future packages that you obtain via go get ...?

Yes, or any future packages that you obtain by adding an import of a package and letting go build resolve that import.

Basically, under this proposal (with “sticky vendoring”), go mod vendor <pattern> tells the go command to always keep the vendor directory up-to-date for everything matching <pattern>.

@ianthehat

you are talking about adding features to the go command, to use them you inherently must have a go command that supports modules, and that's our story for reproducible builds.

I'm talking about adding features to _manage_ the vendor directory. Previous versions of the go command back to 1.6 can _consume_ the vendor directory, and (unless I have made a serious mistake, which is possible) the directory tree created under this proposal remains compatible with them.

That means that “without the need for a highly available proxy” is equivalent to “without the need for a reliable filesystem”, and if your CI system does not have a reliable filesystem you're probably not going to have a good time with go build anyway.

@bcmills if I build inside of containers, I will intentionally not have a reliable filesystem within CI. It'll be completely blown away after the resulting binary is yanked out of the container.

(But that's mostly irrelevant, because we plan to have highly available public proxies anyway: see https://blog.golang.org/modules2019.)

Google, much like NPM, has not been 100% immune to service-interrupting issues. It's the nature of the business, and so I don't feel your comment is a valid counter-point. People will deploy bad code or configuration, systems will fail, etc.

  • Unpacking source code for temporary auditing (such as using grep -r) or ephemeral debugging (such as inserting println statements).

This is exactly what the replace directive is supposed to be for. If it does not work well, let's fix it, not suggest vendoring as an alternative.

If you like, you can think of vendor/modules.txt as defining a set of replace directives, along with a set of rules to update those replace directives based on future changes to the build list.

That's actually where I started this design: with the question, “what would it look like if we unified vendoring and replacements?”

The interesting part is the automatic-update step, since we want the auditable form of the code to stay in sync with the actual modules in use. vendor is what you get if you produce the source code from the modules; replace is what you get if you produce the modules from the source code.

@theckman

Google, much like NPM, has not been 100% immune to service-interrupting issues. It's the nature of the business, and so I don't feel your comment is a valid counter-point. People will deploy bad code or configuration, systems will fail, etc.

I would be very surprised if Google were the only organization maintaining a public module proxy. If Google's proxy goes down, you can set GOPROXY to point to someplace else and keep on working.

@ianthehat

the vendor directory is real code that you can modify, have no idea if it has been modified

That's #27348, and I expect we'll address it by 1.14 (if not yet in 1.13).

@bcmills Now that introduces further complexity and additional trust issues around that proxy and whether they are safe for use, further convincing me that my call-out for use case is valid. If the main _official_ proxy goes down, I'm confident the overwhelming majority of users are not going to spend the development cycles to identify a trusted alternative proxy, and to then update all of their CI configurations to use it.

@theckman, the issues of proxy trust are certainly valid, and they will be addressed in-depth in other proposals for Go 1.13. Stay tuned!

However, those details are a bit off-topic for this proposal.

At any rate: while I don't agree that reproducible builds are a compelling use-case for vendoring, the current proposal supports that use-case anyway.

We can (and, I hope, will) come to agreement on a design even if our underlying reasons for it differ.

  • Shipping proprietary dependencies to customers under contract.

This is the interesting one, and I still think vendoring is totally the wrong choice for this.
We redistribute modules using a proxy; this is just a custom proxy, so let's do it that way. Vendoring is a horribly confusing way, for the user, to distribute a module.

I think we have different users. With vendoring support, my customers don't even need to be aware what vendoring is. I provide a git repository, they clone it and run "go build"; everything works, and everything is self-contained (shipping all the dependencies in source code format is part of many of my contracts, and vendoring allows me to honor this clause very easily and effortlessly).

You seem to suggest that your customers/users are happy with setting up "custom proxies" (whatever that means) in addition to cloning the project for the purpose of building it, possibly using additional documentation that you have provided them with. I'm happy it works for you, and I can assure you that you will not find me passionately arguing against custom proxies in GitHub issues just because I don't use them.

If you don't have a use case for vendoring, please rest assured that there are many people who do, and there have been multiple discussions going on on GitHub issues and directly with Google to try to have proper vendoring support with modules as well. If you feel so strongly, please open a proposal to remove vendoring from the Go toolchain, but I think it's better not to hijack this issue by arguing on the actual merits of vendoring.

@theckman @bcmills As far as I can see, you might be agreeing with each other: you both seem to hold that a filesystem-based module cache checked into VCS is not a valid substitute for a traditional vendor directory.

To evaluate the proposal here, it is valuable to spell out use cases serviced by a traditional vendor directory and evaluate those against future proposed behavior, and you can describe and compare behaviors of traditional vendoring, etc.

But at least for me, if you wanted to try to put it in a single sentence, the most important part of evaluating whether or not a future solution is a valid substitute for a traditional vendor directory might be:

_Allow tracking third-party code in VCS in a similar manner to how you track your own code._

(Different people might want that for different reasons. For example, some people might think some set of the following: VCS is the ground truth for your source code and should be the ground truth for other code you use or ship even if someone else wrote it; greater trust in your VCS than any other external system; VCS is more likely to survive over longer timescales or migrate forward in the face of change; the desire to use standard VCS tools (diff, tagging, blame, bisect, etc.) on third party code in a similar manner to how you use them with your own code; building when disconnected from the public Internet; supporting reproducible builds; internal policy; compliance; auditing; etc.).

My personal view is that a file-based proxy or module cache checked into VCS does not meet that threshold. One example: a file-based proxy, I think, holds N copies of different versions of a dependency under directories like v1.2.3, v1.2.4, etc., which is not how you track your own code in VCS. That then has different implications for how you interact with that dependency code.

But I also suspect different people would boil it down to a different single sentence, and perhaps the topic is too nuanced to try to capture in a single sentence.

I think we have different users. With vendoring support, my customers don't even need to be aware what vendoring is. I provide a git repository, they clone it and run "go build"; everything works, and everything is self-contained (shipping all the dependencies in source code format is part of many of my contracts, and vendoring allows me to honor this clause very easily and effortlessly).

You seem to suggest that your customers/users are happy with setting up "custom proxies" (whatever that means) in addition to cloning the project for the purpose of building it, possibly using additional documentation that you have provided them with. I'm happy it works for you, and I can assure you that you will not find me passionately arguing against custom proxies in GitHub issues just because I don't use them.

Sorry, I was not clear.
I am suggesting that checking in a proxy directory that the go command knows how to automatically use would be a superior solution to checking in a vendor directory if this was the only concern. It meets all the requirements for a stand alone build in a way that unifies with modules cleanly, and has none of the drawbacks of a vendor directory.

@bcmills I agree with your problem summary and proposed solution. This is a great proposal. I went back and forth on the module pattern. I think it is more than aesthetics to put the patterns in modules.txt, as it pertains only to the act of vendoring, and not to module or go versions. So I fully agree with your proposal on that as well.

Thank you. This will make modules even easier to use for myself and my team.

Change https://golang.org/cl/162989 mentions this issue: go/analysis: allow overriding V flag without code patches

This proposal has an interesting interaction with #27852. If we start storing go.mod files in vendor directories, then nothing in the vendor directory will be included in the module cache.

That's probably fine, because building within the module cache doesn't work in general anyway (modules may have replace directives pointing to files outside of the module), but — that being the case — we could trim out a lot of code bloat in the module cache by explicitly excluding the vendor directories associated with cached modules.

This all makes sense for "package main" type modules, but what happens if you run go mod vendor under a library package with this proposal?

@driusan, why would it need to be any different for modules containing libraries?

(Note that in module mode, we only ever use the vendor directory of the main module. Packages vendored by dependencies are not searched and not relevant to the build: that way, the main module has complete control over which dependency versions are in use, and there is only one non-test copy of the source code in use for each package import path.)

@bcmills Your note is why I think it should be different (and should probably result in an error). If it behaves the same way regardless of whether it's a main module or a library module, it would result in spurious vendor directories that are never used (under any circumstances) when run under a library package, which is probably not what the user intended.

The vendor directories would still be used when working within that library module itself, and when compiling the program in GOPATH mode (for example, using an older version of the Go toolchain).

The use-cases for vendoring in library modules are a bit less compelling than for modules that include main packages, but as far as I can tell that difference mostly only affects the decision about whether to use a vendor directory at all — not what it should do if present.

(Note that the inclusion of go.mod files under this proposal will tend to exclude the vast majority of vendor directory contents anyway.)

Change https://golang.org/cl/165378 mentions this issue: all: add -mod=vendor to GOFLAGS in tests that execute 'go' commands within std or cmd

@bcmills
My experience report:
I vendor all the dependencies, I do not modify the vendor folder (I do not edit or patch sources there). My build is offline and fails if I have a missing dependency. No magic, I'm happy.

My expectations from new tools provided by Go itself:

  1. Keep the old behavior of the vendor folder (which we had for years, literally);
  2. Work outside of GOPATH (which simplifies development).

Please provide a simple and reliable solution which satisfies the community's expectations.

I've read both big threads and I see that there are basically two use cases:

  1. Vendor all the dependencies.
  2. Vendor none.

And one potential use case:

  1. Vendor some.

I haven't seen experience reports with this one (maybe I missed one, these threads are super big). But this potential case smells like nobody likes complexity like setting up a proxy with access to private repositories.

Thanks!

@selslack, thanks for the report, but it would help even more if you could share the _why_ of your current workflow instead of the _what_. (In particular: why, if at all, would you prefer to vendor packages rather than modules? Why, if at all, would you prefer to vendor source files rather than zipfiles?)

We know how things work today, but we also know that for at least some use-cases there will be better alternatives in module mode. I want to optimize the vendor functionality for the subset of use-cases that don't have a better alternative.

@bcmills sorry for responding a month late, and thank you for writing this proposal!

So let's walk through a scenario, and please correct me if I get anything wrong:

  1. I have a preexisting go project.
  2. I have a need to add a dependency to a project, and I wish to vendor it
  3. I run go mod vendor _pkgname_, which downloads the source code into the vendor directory
  4. I update my project to use the new dependency
  5. go build _mainpkgname_ works

This is a slightly different scenario than what I was originally envisioning, which was making go get vendor aware.

It's not bad, still just one command: go mod vendor _pkgname_.

Because partial vendor support is something mentioned above, I think this is an adequate compromise. It does feel a bit odd though. It definitely doesn't feel as nice as go get effortlessly switching between go path and module mode.

It's also not _quite_ what the title describes: "automatic vendoring in module mode". It's not really automatic if it's using a different command.

Is there a specification documented for vendor/modules.txt?

I want to update my own vendor tool because we seem to be going round in circles here and making vendor'ing way more complicated than it needs to be.

Hi @JeremyLoy

This is a slightly different scenario than what I was originally envisioning, which was making go get vendor aware.

It's not bad, still just one command: go mod vendor _pkgname_.

As far as I understand the current proposal, the common case is that you would do a one-time operation of go mod vendor (which is a synonym for go mod vendor all).

Once you've done that, you've signaled your desire to use vendoring, and at that point the vendor directory will automatically track any subsequent go get foo@<version> and go get bar, and will also be updated if you add a new import path to your code for a previously unused module and then do something like go build. In other words, once you do the one-time operation of go mod vendor for one of your projects, you would not need to separately do go mod vendor foo and go mod vendor bar. Was that part of your concern? Sorry if I have misunderstood the concern.

@thepudds that may in fact be the case. My initial impression from reading the proposal was as you described, but all of the follow up discussion in this thread regarding support for partial vendoring is where I am confused.

I personally don't see a need to support partial vendoring, at least for initial release. It just complicates the issue. Simply making go get vendor aware for this first pass doesn't exclude partial vendoring from a future release.

@bcmills what happens if I have a Module I depend on named all defined in my go.mod, and I only want to vendor that? How do I only vendor all and not github.com/theckman/example too?

@JeremyLoy

@thepudds that may in fact be the case. My initial impression from reading the proposal was as you described, but all of the follow up discussion in this thread regarding support for partial vendoring is where I am confused.

The history of the conversation here is slightly confusing. The initial proposal was a bit different, so the first 18 or so comments above were reacting to that initial proposal. The proposal was then updated at the time of https://github.com/golang/go/issues/30240#issuecomment-464071411 in a way that largely addressed many of the initial concerns in those first 18 or so comments. I think the change in the proposal at that time also addressed what I think was the primary concern you were expressing in https://github.com/golang/go/issues/30240#issuecomment-474541877. I understand that you also expressed concern that you might not need partial vendoring, but I think what might have been your primary concern around the common case of go mod vendor automatically tracking future go get foo and similar commands is part of the current proposal (without the need to also do go mod vendor foo).

In other words, if you read the proposal as it stands now in the first comment https://github.com/golang/go/issues/30240#issue-410509265 and mostly like what it currently describes, then that is a good sign. As far as I am aware, the proposal as it stands now in the first comment is a complete description of the current proposal.

@nomad-software, there is no current formal specification for modules.txt, and I don't intend to provide one: as demonstrated in this proposal, the format may be subject to change (although it should remain broadly compatible in the face of any changes).

The programmatic entry points for tools to interact with vendor in module mode are go list and go mod vendor. Anything beyond that should follow the proposal process.

@theckman

what happens if I have a Module I depend on named all defined in my go.mod, and I only want to vendor that?

Module paths without a dot in the first component are in general reserved for the standard library: it seems exceedingly unlikely that there will ever be a module with a literal module path of all that you can go get or go mod vendor.
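For readers who want to see the rule concretely, the check can be sketched in a few lines. The function name firstElemHasDot is invented for this example, and this is only an approximation of the convention described above, not the go command's actual validation code.

```go
package main

import (
	"fmt"
	"strings"
)

// firstElemHasDot reports whether the first slash-separated element of a
// path contains a dot, as in "example.com/mod". Paths that fail this check
// fall in the space that is in general reserved for the standard library.
func firstElemHasDot(path string) bool {
	first := path
	if i := strings.Index(path, "/"); i >= 0 {
		first = path[:i]
	}
	return strings.Contains(first, ".")
}

func main() {
	for _, p := range []string{"all", "net/http", "github.com/theckman/example"} {
		fmt.Printf("%-30s dot in first element: %v\n", p, firstElemHasDot(p))
	}
}
```

Under this sketch, all and net/http have no dot in the first element, while github.com/theckman/example does, so a pattern named all is very unlikely to be confused with a fetchable module path.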

That ensures that after go mod vendor all, go list can produce accurate results without making any further network requests (see also #19234 and #29772).

It occurs to me that this might not actually remove the need for network access — at least not without significant changes to module-mode loading. It is possible that some dependency found in the module graph (go list -m) is only reached through an earlier-than-selected version of some other dependency, and since the vendor directory would contain only the most recent version of the go.mod file, that part of the module graph could be missed if we only consult the vendor directory.

I suspect that go list without -m would be fine, but go list -m in particular might still need network access.
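To make the concern concrete, here is a small toy model, under simplifying assumptions: versions are plain integers instead of semver strings, the module names main, A, B and C are made up, and buildList only imitates the idea of taking the highest version of every module reachable in the requirement graph. It is not the go command's real loading code.

```go
package main

import (
	"fmt"
	"sort"
)

// modVer identifies one version of a module (integers stand in for semver).
type modVer struct {
	path string
	ver  int
}

// buildList walks every requirement reachable from root, including the
// requirements of versions that are not selected in the end, and keeps the
// highest version seen for each module path.
func buildList(root modVer, reqs map[modVer][]modVer) map[string]int {
	selected := map[string]int{}
	seen := map[modVer]bool{}
	var visit func(m modVer)
	visit = func(m modVer) {
		if seen[m] {
			return
		}
		seen[m] = true
		if v, ok := selected[m.path]; !ok || m.ver > v {
			selected[m.path] = m.ver
		}
		for _, r := range reqs[m] {
			visit(r)
		}
	}
	visit(root)
	return selected
}

// names renders a build list as sorted "path@vN" strings.
func names(sel map[string]int) []string {
	out := make([]string, 0, len(sel))
	for p, v := range sel {
		out = append(out, fmt.Sprintf("%s@v%d", p, v))
	}
	sort.Strings(out)
	return out
}

func main() {
	root := modVer{"main", 0}
	// C is required only by B@v1, but B@v2 is the selected version of B.
	reqs := map[modVer][]modVer{
		root:     {{"A", 1}, {"B", 2}},
		{"A", 1}: {{"B", 1}},
		{"B", 1}: {{"C", 1}},
		{"B", 2}: {},
	}
	full := buildList(root, reqs)

	// Simulate a vendor directory that holds go.mod data only for the
	// selected version of each module.
	vendored := map[modVer][]modVer{}
	for m, r := range reqs {
		if full[m.path] == m.ver {
			vendored[m] = r
		}
	}
	partial := buildList(root, vendored)

	fmt.Println("full graph:   ", names(full))
	fmt.Println("vendored only:", names(partial))
}
```

In this toy graph, C shows up in the first list but not in the second, which is the kind of gap that could force go list -m back onto the network.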

@bcmills would it be reasonable to populate the older go.mod files as well in the vendor directory during a go mod vendor under this proposal?

This might not be the proper analogy, but in other words, would it be reasonable to place in vendor the go.mod files you would end up with if you manually did something like GOPATH=$(mktemp -d) go mod download in 1.12 today (or similar if that is not correct in 1.12)?

@thepudds We would need to put the go.mod files at some path that contains the complete (presumably canonical) version as a path component: otherwise, they might overlap with the go.mod files vendored for other versions of the same module. The module cache, not the vendor directory, is where we put per-version files today, and adding a similar facility to the vendor directory would be a significant overlap.

I suppose that goes to @ianthehat's broader point about the overlap between the module cache and the vendor directory.

We would need to put the go.mod files at some path that contains the complete (presumably canonical) version as a path component

Yes, agreed.

I suppose that goes to @ianthehat's broader point about the overlap between the module cache and the vendor directory.

The glass-half-full way to look at that might be that the logic to track go.mod files in a version-aware manner would not need to be invented from scratch for vendor ;-)

Or you could add a directory with the zip and mod files and use it as a file proxy (which is something it might be worth looking into as a better version of vendoring)

That's issue #31302

We've considered that, but it really doesn't work well with version control systems: the diffs are incomprehensible and the blobs can end up consuming a lot more space than they ought to (depending on the encoding).

That's only the case if you want to keep everything in a single VCS repository. In a large organization, where curating external code is done by many people/teams, you want to split the vendoring in separate repositories, that produce read-only modules, that are then consumed by all the organization projects.

The only reason all this stuff is in single huge vendor directories right now is that there was no robust way to share curating results in Go. Now we have one, that's modules + goproxy (#31304)

what happens if I have a Module I depend on named all defined in my go.mod, and I only want to vendor that?

Module paths without a dot in the first component are in general reserved for the standard library: it seems exceedingly unlikely that there will ever be a module with a literal module path of all that you can go get or go mod vendor.

IIRC modfile.Parse will error out if there is not at least one dot in the module name (for the branch exposed in github.com/rogpeppe/go-internal/)

@bcmills an additional experience report for me, but it's easier to quote @selslack

I vendor all the dependencies, I do not modify the vendor folder (I do not edit or patch sources there). My build is offline and fails if I have a missing dependency. No magic, I'm happy.

This is exactly it for me, and why so far I've avoided using go modules and kept my dep+vendor workflow.

Why? The target environment for a project has no internet access, and it is a requirement for us to be able to build it in that environment. Vendoring has made accomplishing this incredibly simple and I never have to worry about whether my project will build correctly.

Sure, there are other ways of solving this problem as mentioned above (caches, etc), but vendoring is so incredibly easy to use/understand that I don't see why I would bother.

A side effect of committing all my dependencies to a vendor git repo is that it makes it really easy to audit any incoming changes to dependencies and watch for unexpected changes. I admit a filesystem cache as mentioned above could accomplish this same goal.

I'm running out of time in the 1.13 cycle. I still want to make this happen, but unfortunately it's going to slip to 1.14.

@bcmills: What do you think about an alternative solution, where you'd flip a switch in go.mod to turn on "auto vendor" mode. When a module is in "auto vendor" mode, the following things would happen:

  • All changes to the dependencies (go get ...) would automatically be vendored to /vendor/...
  • All respective commands (go build/go run/go test) would always run with -mod=vendor

I feel like this would be the most sane solution for me, and pretty much comparable to dep or glide workflow which worked a treat for us for a long time.

Edit:
To be more specific about my comment above: I'd prefer it, if having a /vendor directory would be sufficient enough to signal the Go tools that I want 100% of my dependencies vendored all the time and that all tools should run in -mod=vendor mode. But I understand, that this approach is not really something the Go team considers, so maybe having a setting in go.mod is.

@bcmills I want to add a recent story of how proper vendoring saved us a lot of time: https://success.docker.com/article/docker-hub-user-notification.

As a part of security review after receiving this notification -- we performed an audit of all the dependencies in Java, NPM, etc.

Auditing our Go code took exactly 0 seconds, because we have all the dependencies committed and we don't go online during build process at all.

@bcmills: What do you think about an alternative solution, where you'd flip a switch in go.mod to turn on "auto vendor" mode. When a module is in "auto vendor" mode, the following things would happen:

  • All changes to the dependencies (go get ...) would automatically be vendored to /vendor/...
  • All respective commands (go build/go run/go test) would always run with -mod=vendor

I feel like this would be the most sane solution for me, and pretty much comparable to dep or glide workflow which worked a treat for us for a long time.

Edit:
To be more specific about my comment above: I'd prefer it, if having a /vendor directory would be sufficient enough to signal the Go tools that I want 100% of my dependencies vendored all the time and that all tools should run in -mod=vendor mode. But I understand, that this approach is not really something the Go team considers, so maybe having a setting in go.mod is.

I agree.

I think we should consider actually removing the -mod=vendor flag, and moving it to a per-project configuration of some sort.
With that flag as it currently is, we're not going to solve the problem outlined in this issue's original posting: configuration per user via GOFLAGS. It would still be required for some projects as long as the meaning of -mod=vendor depends on what you are trying to do in a project (current task) rather than on which project you are doing it in (global task).
To be more specific, in some of the projects I'm working on I'd still want -mod=vendor enabled all the time, to force go to never load anything from the network without an explicit go mod vendor invocation.

Following a good discussion at GopherCon with @ChrisHines, we concluded that a key reason for needing vendor today (in his situation at least) is touched on by the second bullet point in @bcmills' description:

Saved module caches do not interoperate well with version-control and code-review tools.

Put another way: vendor is used because there isn't a better alternative to reviewing dependency changes alongside (and as part of the same process as) changes to one's own code.

We further concluded that all other aspects of his requirements (including reproducible builds, self-contained CI runs etc) could be satisfied by alternative means, not least, for example, an approach similar to https://github.com/golang/go/issues/27618. None of these alternatives are currently as polished/easy as the vendor workflow, but they would do the job (and could become more polished).

Back to the point on reviewing dependency changes. This point is obviously critical. Not only for those people who prefer the vendor flow because they can easily solve this problem as part of their existing work flow, but for everyone who uses Go modules. We can't, today, point to a tool that helps us achieve this.

That said, to avoid the problems of only having parts of modules "vendored", I think this points towards a solution where entire modules are "vendored" with a directory structure similar (identical?) to that found under $GOPATH/pkg/mod. Whether it's all modules or some I defer to others. This keeps modules intact (important in keeping the solution simple for tools etc) and retains the current benefits of being able to review dependency changes alongside one's own changes. Whether this is achieved by implicit replace directives or any other means, I again defer.

Apologies if I'm late to the party on all of this: I just wanted to stress/highlight that this point on code review has a life well beyond this issue that is of much wider interest.

On the back of my last comment I've just raised https://github.com/golang/go/issues/33466

I think this points towards a solution where entire modules are "vendored" with a directory structure similar (identical?) to that found under $GOPATH/pkg/mod

Just to slightly row back on this point: whether we need the _entire_ module to be "vendored" is actually something I defer to Bryan (and Ian) on. If cmd/go can make things work in a partial way then I don't actually see a reason to "vendor" the entire module (indeed, given the code review point there is good reason not to), because go/packages et al will just "work" as long as cmd/go "works".

I previously wrote "entire module" because I read (perhaps incorrectly) that this had become necessary. But Bryan/Ian are the authorities on that point, so restating that point for clarity.

I am withdrawing this proposal in favor of #33848. My reasoning is as follows:

Should vendored dependencies be updated automatically?

Here I have proposed that go commands should update and/or add to the contents of the vendor directory automatically.

However, in the time since then, we have observed that users are often confused by the implicitness of updates to the go.mod file. Given that, we probably _should not_ overwrite existing contents in the vendor directory without explicit user intervention — that style of automatic vendoring would add yet another layer of substantial changes driven by the same implicit mechanism, and while diffing and reverting changes in the go.mod file is relatively easy, diffing and reverting unexpected changes in the vendor directory is not.

I now believe that we should _not_ make such updates automatically.

Should we allow vendoring of only a subset of packages?

Here I have proposed that go mod vendor should accept patterns to allow users to vendor only a subset of modules.

I still think that's a good idea in concept, particularly for repositories that contain multiple interdependent modules and replace directives, but it adds enough complexity that it should be considered separately from — and presumably _after_ — changes to automatically use and/or maintain the vendor directory.

It would be great to vendor a single module. We have a library that has a directory with .yaml files (OpenAPI types common to multiple services). Other projects use these files by including them in their own API specifications. Currently we vendor all dependencies, and can invoke a command from a makefile that generates Go code from the service API spec and the types from the lib (it can reference them as ./vendor/someRepo/file.yaml).
