Dep: Support for private/enterprise patterns

Created on 5 Mar 2017  ·  46 Comments  ·  Source: golang/dep

Today, dep's primary target is Go's public, FLOSS ecosystem. The public ecosystem is the necessary shared baseline use case that must be satisfied.

Necessary...but not sufficient. Supporting private ecosystems and workflows not only makes things easier for Go users, but it can make it easier to convert private code to open source down the line.

There are a number of things enterprise organizations, or really any private organizations, may want:

  • Private code hosting (related: #174, #175, #263)
  • Policies to ban/bless projects, in their entirety or just a subset of their versions - #285
  • Seamless integration with monorepos
  • Integration with higher-order build systems (bazel, pants, etc.)

It doesn't seem like these changes should be a blocker for initially merging dep into the go toolchain, though they should probably be in good shape by the time the go toolchain ships a version with dep in it.

Epic before-toolchain-release

All 46 comments

/cc @pbennett this would be a great place for you to summarize some of the requirements you have that we've discussed :)

Apologies if this is covered elsewhere; I've had a look around and I couldn't find it.

The two big use cases I have, as at least a semi-enterprise user, that often seem to be poorly covered for me are:

  1. Being able to internally mirror repos and pull from those, even though we still use the "real" URL in the source itself.
  2. Being able to put a patch on top of an external repo for whatever reason (security, integration, etc.) and use this patched repo in preference to the "real" one.

Given the way Go works, I think it would suffice for both these use cases to be able to specify an "alias" or something for a package, i.e., "I want github.com/go/AwesomePackage but please pull source from internalgit.enterprise.org/gits/go/AwesomePackage" or something like that. That would be really useful for me.

I tried to read the source code to see if this is already there, but I don't think it is? I see "overrides" in the manifest but that appears to be related to relaxing version constraints.

@thejerf it is, but no worries :) Have a look around here - https://github.com/golang/dep/issues/174#issuecomment-275805325. The source field in the manifest will allow you to individually specify an alternate location from which to pull a given dep. Roughly, an alias.

Doing them as more than one-offs is trickier, as described in the linked comment.
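As a rough sketch, borrowing the hypothetical internal hostname from your example (and shown in the Gopkg.toml form the manifest eventually took), a per-dependency source override looks like this:

```toml
# Gopkg.toml — keep importing github.com/go/AwesomePackage in source files,
# but fetch the actual code from an internal mirror.
[[constraint]]
  name = "github.com/go/AwesomePackage"
  source = "https://internalgit.enterprise.org/gits/go/AwesomePackage"
```

The import paths in your .go files stay unchanged; only the fetch location is redirected.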

#148 is one of the main issues I've been running up against using dep in my current project.

@lancehit #148 is one of those things that we may end up needing to devise an alternate approach for, rather than trying to support it more directly. Not sure yet, but it's possible.

Take this all with a grain of salt, I'm still wrapping my head around the Go import model.

We have an SVN monorepo and are looking at how we should structure Go code within it and how imports will work.

I'm curious what support for a monorepo would look like? Could you elaborate?

For example if we have this example monorepo structure:

trunk/
    pkg1/
    project/
        pkg2/

How do we get to the desired GOPATH structure:

$GOPATH/
    src/
        company/
            pkg1/
            pkg2/

I've already fought with go get and subversion, and it seems more hassle than it's worth. Besides, we want clean import paths such as company/pkg1, not svn.company.org:8000/trunk/ (which I realize is not a valid import path). Perhaps vanity imports or canonical import paths are of some use, but I haven't spent enough time investigating.

Go doesn't really care how the code got into the $GOPATH so my current thought process is to checkout each project/pkg into $GOPATH/src/company as needed. This is obviously slightly annoying though. Each package that has internal dependencies will have to know which packages to check out.

I guess it would be nice to have the manifest reference a local relative path in the monorepo with a different import name. Something like

{
    "dependencies": {
        "company/pkg1": {
            "source": "../../path/inside/monorepo/pkg1"
        }
    }
}

@sdboyer hey hey, this isn't exactly about go, but I bet you'd find some of the conversation in https://github.com/rust-lang/rust-roadmap/issues/12 very helpful. We're currently investigating this stuff in rust-land too.

Good luck!

@steveklabnik oh man, there's a _wealth_ of useful information there - thanks!

Grr, GitHub mobile moved the button into where the text field was just as I was tapping, sorry :)

@aajtodd If you have a monorepo, you will do best if you have a single directory that can be checked out into GOPATH. For example, svn.company.com:8000/trunk/go -> $GOPATH/src/go.company.com. It is greatly confusing when the directory structure in the repo does not match the package structure. (At that point, you don't have a monorepo, you have a bunch of separate repos that just happen to be stored in the same place.)

@quentinmit That's what I was leaning towards. And yes you are correct that it is currently more a bunch of separate repos/projects stored in one place. Not my choice, such is life.

To work properly inside an enterprise environment, the system needs to allow the insertion of an enterprise repository for internally hosted as well as proxied content. It should be easy for a developer in this environment to direct all outbound requests to their internal repository, similar to the Maven mirrorOf directive. This is mentioned to contrast other ecosystems like bower and npm, where the metadata contains absolute URLs and the client follows them faithfully. That means the implementation of an enterprise repository has to get in the business of rewriting all the metadata, which potentially involves modifying the packages themselves (depending on how the metadata is distributed).
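For readers unfamiliar with Maven, the mirrorOf directive lives in the developer's settings.xml and reroutes repository traffic to an internal server; a minimal example (the internal hostname is hypothetical):

```xml
<!-- ~/.m2/settings.xml: route all repository requests through an internal mirror -->
<settings>
  <mirrors>
    <mirror>
      <id>internal-mirror</id>
      <mirrorOf>*</mirrorOf>
      <url>https://repo.corp.example.com/maven-public/</url>
    </mirror>
  </mirrors>
</settings>
```

The point is that this is client-side, one-time configuration; project metadata never has to be rewritten.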

I'm not completely up to speed on all the details of how the dependencies are declared, but having a good namespace scheme is also critical. I wrote a bit on my observations with npm to try and help the Jigsaw JSR not repeat the same mistakes. It's really applicable to any packaging system: http://www.sonatype.org/nexus/2017/01/23/advice-for-jigsaw-regarding-auto-modules/

Agree with @brianf on a number of points (file under not very surprising):

For go dep to work "well" in an Enterprise and in general I would hope that it has taken into account:

  • A solid coordinate system

    • This likely means for Go that you can specify name, version and architecture, group would be good too (to avoid name collisions in your distant future, that way multiple people or orgs can create a 'json' package), to get a specific package

    • Group already seems to kinda exist if you are following a github model, group being user/org, Name being repo name

  • The current model of having everything imported from git and stored in vendor (if I understand it correctly) seems to work well with a monorepo, but is kinda overkill for orgs not taking that approach, or using binaries where they don't need to change the code. A standard binary that can be imported using something akin to the coordinate system seems to be a good approach, and is supported by the long-standing usage patterns of Maven, etc.

    • I have heard the vendor approach is being deprecated, seems good :)

  • If the model for using git to import projects remains, the ability to have some sort of resolver to allow third party tools like Artifactory or Nexus Repository to be used, akin to: https://bower.io/docs/pluggable-resolvers/

    • NOTE: I don't think this is ideal, but it's included since it's how Bower went about solving portions of this

    • This is a portion of solving proxying (very important for air gapped networks)

  • Something akin to a lock file would be good, whether it's through a pom.xml where you can state explicitly which versions you want, or generated akin to the yarn or gemfile locks

    • I think Go already supports this, since you can lock to a commit or branch, or whatever, but I think repeatability is the most important piece of the puzzle

  • As @brianf points out, you need to be able to publish packages for internal consumption. I realize this already works using the git import mechanism, monorepo type of approach, but again this is something worth considering for orgs that will not be using that type of approach either because of philosophical differences, or other constraints

Hi @brianf, @DarthHater! It's awesome that y'all are swinging by with thoughts 😄 sorry for the slow response - i did the thing where i backburnered responding "until i had proper time."

One note to make up front: a thing that makes dep rather different from other tools is that we rely, heavily, on static analysis to extract information for dependency management. This doesn't change the primitives at hand, of course, but it does mean that the information you're used to thinking about in a dep mgr will be coming from possibly surprising places.

the system needs to allow the insertion of an enterprise repository for internally hosted as well as proxied content. It should be easy for a developer in this env to be able to direct all outbound requests to their internal repository similar to the Maven mirrorOf directive.

For sure, this is definitely on the radar. Some fairly extensive discussion here: https://github.com/golang/dep/issues/174#issuecomment-275588019

As discussed in that issue, a direct analogue to mirrorOf is difficult, because that's mirroring an entire registry. We currently have no registries; each "source" is a VCS repository, so there is no central object to swap. Even if we do add them (which I would like to), we can't ever really deprecate the current model.

This is mentioned to contrast other ecosystems like bower and npm where the metadata contains absolute urls and the client follows them faithfully.

Yeah, we're somewhere in between. There is a facility for specifying a full URL on a per-dependency basis right now, but the typical input is just a Go import path, which we then pass through a rules engine (which, in nonstandard cases, can include HTTP requests to retrieve custom metadata) to derive a real URL. This is definitely a weaker point in the system, but it's a natural outgrowth of the way import paths work. My sense is that we have little room to change the underlying reality, so our design choices here have to be aimed towards mitigating chaos.

but having a good namespace scheme is also critical.

Again, import paths change the game here a bit, because "names" in Go are import paths, and much of their meaning is already imparted by DNS.

However, if/when we do get to registries, I'm inclined to agree with having namespaces - that's why my little proto-example includes them.

A solid coordinate system. ...you can specify name

I've already mentioned some of the issues with our names/import paths. They're the weakest link in this, but because they're too broad, not because they're too narrow.

you can specify...version

Yup, we've got a whole system for this. Because we're all VCS-backed, we have concepts for branches and tags, with tags subdivided into semver and non-semver. Also, revisions (i.e. git commit hashes). And adaptive translation between these three.
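As a sketch, the three flavors look like this in Gopkg.toml constraints (project names are hypothetical, and each constraint uses exactly one of the version fields):

```toml
[[constraint]]
  name = "github.com/example/semverlib"
  version = "^1.2.0"   # semver tag range

[[constraint]]
  name = "github.com/example/branchlib"
  branch = "master"    # track a branch

[[constraint]]
  name = "github.com/example/pinnedlib"
  revision = "0123456789abcdef0123456789abcdef01234567"  # exact git commit
```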

you can specify...architecture

This one's interesting, and we haven't fully dealt with it just yet - #291. Things aren't on fire, though, because the current approach is _very deliberately_ global - all os/arch are unconditionally considered in solving. The issue there is about approaches to allowing the tool to safely focus on just a subset of os/arch.

The current model of having everything imported from git and stored in vendor (if I understand it correctly) seems to work well with a monorepo, but is kinda overkill for orgs not taking that approach, or using binaries where they don't need to change the code.

Yeah, from an enterprise org perspective, you end up duplicating a lot of source if you're committing vendor and have a bunch of different projects.

A standard binary that can be imported using something akin to the coordinate system seems to be a good approach, and is supported by the long standing usage patterns of Maven, etc..

Maybe I'm misunderstanding something here, but dep's scope of responsibility so far is entirely restricted to downloading source code - no binaries/nothing precompiled. We might expand to that later, but...well, in our case, I don't think it would end up looking very different.

I have heard the vendor approach is being deprecated, seems good :)

"Deprecated" might be a bit strong, but I do hope it can become far less common. That said, it's not going to be anytime soon. My (loose, skeletal) plan for it is here.

the ability to have some sort of resolver to allow third party tools like Artifactory or Nexus Repository to be used

Yeah, this is all tied up with the first item I responded to. I desperately want to avoid having custom resolvers. On the plus side, import paths being what they are, we can probably encode the information we need in custom import path patterns, then teach the tool the general patterns. This would allow us to support those tools without bespoke resolver plugins.

The more of this we do, though, the more we start having problems with names/import paths losing canonicality (for a given project, there could be a github representation, a registry representation, an artifactory representation, and so on). This probably doesn't matter much within a project, but if my project deps on A, and both my project and A dep on B, but I dep on B's github representation, and A does it via B's registry representation, then we start to have nasty coordination problems. (This problem already exists via the gopkg.in service, though its impacts seem relatively minor for now.)

Something akin to a lock file would be good, whether it's through a pom.xml where you can state explicitly which versions you want, or generated akin to the yarn or gemfile locks

Yep, we follow a two-file system, mostly per the principles I described in my article. The manifest describes constraints; the lock is the result of a solve, and provides reproducibility.
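Roughly, the division of labor between the two files looks like this (entries are hypothetical):

```toml
# Gopkg.toml — human-edited constraints; says what versions are acceptable
[[constraint]]
  name = "github.com/example/lib"
  version = "^1.0.0"

# Gopkg.lock — machine-written result of a solve; pins exactly what was chosen,
# so every build fetches the same revision
[[projects]]
  name = "github.com/example/lib"
  packages = ["."]
  revision = "0123456789abcdef0123456789abcdef01234567"
  version = "v1.0.3"
```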

you need to be able to publish packages for internal consumption.

I think this one's mostly covered. Access/credential handling is a bit tricky (e.g. #264), because (ugh) we're still just shelling out to git/hg/bzr/svn, but because all those systems have their own ways of making themselves private, we're good.

And I fully expect that if/when we get to registries, the implementation will be open, so folks can run their own private ones as needed.

I can also submit a use case for this. We currently use GitHub Enterprise hosted internally for our internal Go projects. Using Glide today, we currently have to specify in the glide.yaml file "vcs: git" for every package, otherwise we get an error: "Cannot detect VCS". For private repos, it would be nice if there was a way to augment the list of known repo URLs (i.e. github.com = git) to avoid this problem. We are also plagued by "go get" and other tooling issues because we have an internal certificate authority which is trusted, but yet the tooling doesn't recognize the trusted certs at the system level.

@deejross thanks for taking the time to provide feedback and use cases 😄

We currently use GitHub Enterprise hosted internally for our internal Go projects. Using Glide today, we currently have to specify in the glide.yaml file "vcs: git" for every package, otherwise we get an error: "Cannot detect VCS". For private repos, it would be nice if there was a way to augment the list of known repo URLs (i.e. github.com = git) to avoid this problem.

Yep, this one's definitely come up - it's the custom resolver stuff I referenced in my preceding comment, and that's discussed in #174. I think we're gravitating towards a decent solution here, albeit slowly 😄

That said, I _think_ that right now, you should be able to solve your problem by including the .git suffix at the appropriate spot on your import paths for everything stored in your GHE.

We are also plagued by "go get" and other tooling issues because we have an internal certificate authority which is trusted, but yet the tooling doesn't recognize the trusted certs at the system level.

Oooh. This is a super important one that I'd totally not thought of before. Please please, post a separate issue with more details! 🙏

The only issue I have with specifying the .git suffix is that it stores the suffix everywhere as part of the URL. If it could be removed from the URL and vcs: git placed in the glide.yaml (or the equivalent for dep), that would be ideal.

If it could be removed from the URL and vcs: git placed in the glide.yaml (or the equivalent for dep), that would be ideal.

Unfortunately (and confusingly), the problem is only secondarily knowing that it's a git repository. The primary problem is knowing, given an arbitrary path, which element represents the root of the project. Simply indicating vcs: git doesn't say anything about where the root is.

While I understand that including the .git is unpalatable, many of the possible cures here are worse than the disease; the ones that aren't are a lot of work. If including the .git infix does work, then we have to prioritize this behind addressing situations that simply cannot be made to work at all.

I've read through this issue and #174, but they both cover a lot (mirroring, rewriting, custom resolvers, etc.), so it's hard to parse out the exact portion I'm interested in.

Can anyone summarize the current thoughts/issues around the "simple" case of wanting to manage the imports/dependencies of a project that has imports from a private or GHE repository (such as github.company.com/org/project/gopkg)?

In this case, the import path (github.company.com/*) is still a unique identifier, so it seems like it should be sufficient if there was a mechanism for specifying how connections to certain prefix paths should be handled. For example, the manifest could contain some information/entry specifying that github.company.com is a GHE repository and allow the specification of an environment variable (or other mechanism) for authentication credentials.

This would certainly add edge cases (what if a primary project and dependent project specify different resolvers for the same prefix, etc.) and would possibly require implementing a set of adapters/connectors in the core project (GHE, private GitHub repo, other source hosting mechanisms, etc.), but not sure that it's unreasonable.

Will echo that I have observed that tooling like go get and govendor fetch may not work in certain GHE setups because an unauthenticated "get" is not permitted. I've generally worked around this by cloning such repositories into my local $GOPATH first and then using the mechanisms provided by tooling to import/vendor from the local $GOPATH, but it seems like that approach is out with dep.

If dep can't deal with GHE repos/imports, then it won't be possible to use it with any projects that exist in/have imports from GHE repos, which would be a real shame, so really hope that there's a path to resolution here!

Imagine this scenario...

A developer is working in his local playpen, using 'go get' willy-nilly to pull packages from all over the world.

However, when he is ready to submit his code to the corporate build pipeline, all of his imports will be fetched from internal mirrors of what is APPROVED for use by the corporate legal department, explicitly including specific branch/version numbers. The corporate build machines are even firewalled so they CANNOT contact the 'net. Everything used in a production build must be available from the internal mirrors.

This should be something that can be accomplished without modifying the source in any way, either the developer's code or the code in the packages used. This is because his code has been code-reviewed, and the import statements have been validated against the list of approved packages.

Will dep work as expected in this scenario?

I've taken an initial look through the set of issues with an intention of looking at solutions for #174. It seems that providing some method of allowing authentication could be an ideal solution. This would allow access to the go-import meta tag provided by GitHub Enterprise.

If allowing private repositories is resolved that way, would it be possible that dep wouldn't actually need to solve that problem? If dep exposed some method of allowing augmentations to the http requests, maybe some package could be put together to determine if a url may need authentication, and attempt to collect the details from an external configuration on the system.

Understandably if the configuration is unavailable or invalid, it would prevent access to the imported GHE repo. That however may not be an issue, as in most cases with an enterprise environment some level of initial configuration would need to be done on the development machine anyways. In that case you really can't blame dep for an error caused by a misconfiguration you don't control.

Additionally adding some error checking for 4XX response codes may help in providing an alternate error message pointing towards authentication being the issue.

Happy to announce that @SteelPhase and I, with the great guidance of @sdboyer, are working towards #174, #175, and eventually #176. We had some good discussions at the hackathon at GopherCon 2017's community day and even got a decent amount of code written to support a package registry that can be hosted internally and may eventually be used to create a global registry.

This opens a lot of doors for supporting not just private repos, but also adding security notices to releases and even allowing dep to notify you if the version of a package you're using has a known vulnerability. There's a host of other benefits from this approach and I look forward to having something soon that we can all take a look at and comment on.

I can offer a use case. I think it's covered by the "alias" idea; I think it's the same as what @Jamie-sas has.

We have a gitlab.corp.example.com server which is a mouthful when really we want to just use example.com/team/package in our imports. We can configure git with this:

[url "https://gitlab.corp.example.com/"]
    insteadOf = https://example.com/

which works fine for git clone https://example.com/team/project but go get example.com/team/project dies horribly because the company public webserver doesn't have any go import meta tags.

The expectation that I can get our ops team to go add some code to our public webserver to make my go development experience nicer isn't particularly realistic ;-)
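For context, the meta tag go get looks for (served at the import path's URL with ?go-get=1) has the form `<meta name="go-import" content="import-prefix vcs repo-root">`. A minimal sketch of parsing that content value, reusing the hostnames from the example above:

```go
package main

import (
	"fmt"
	"strings"
)

// goImport holds the three space-separated fields of a go-import meta tag:
// the import path prefix, the VCS name, and the repository root URL.
type goImport struct {
	Prefix, VCS, RepoRoot string
}

// parseGoImport splits the content attribute of a go-import meta tag
// into its three fields, erroring on any other shape.
func parseGoImport(content string) (goImport, error) {
	f := strings.Fields(content)
	if len(f) != 3 {
		return goImport{}, fmt.Errorf("go-import content needs 3 fields, got %d", len(f))
	}
	return goImport{Prefix: f[0], VCS: f[1], RepoRoot: f[2]}, nil
}

func main() {
	gi, err := parseGoImport("example.com/team/project git https://gitlab.corp.example.com/team/project")
	if err != nil {
		panic(err)
	}
	fmt.Printf("prefix=%s vcs=%s root=%s\n", gi.Prefix, gi.VCS, gi.RepoRoot)
}
```

This is exactly the information the public webserver would have to emit for go get to work, which is why the missing meta tags are fatal.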

Obviously the work around right now is simple; but for dep, it would be nice to say, map anything in this particular namespace to this scm/url.

example.com = git+https://gitlab.corp.example.com/
example2.org = hg+https://hg.corp.example.com/

etc. With sane default fallback behavior. Maybe this goes in Gopkg.toml?

Then the project can select what repository it wants to use for a given namespace, if needed.

And, we're not relying on git to do the right thing; if we move to go-git we don't really want people relying on .gitconfig. Which brings up authentication. That's a separate issue.

Just a word of caution to this:

even got a decent amount of code written to support a package registry that can be hosted internally and may eventually be used to create a global registry...

I'm not sure that this is the correct approach to be taking.

Having a 'locally hosted repository' as your solution to:

  • Private packages
  • Local filesystem packages (ie. dev mode)
  • Local caches of public packages

Is the path that Nuget in the C# ecosystem has taken.

...and it's terribly annoying to work with. It's really a horrible experience, and the solution is a hacked-on 'NuGet.Config' file that sits at the top level of your project and allows you to specify 'local' folders that work as 'fake' repositories. It's a mess. No one likes it. Local dev with nuget packages is a huge pain. Let's not do that.

The longer these issues drag out, the more and more obvious it seems to me that the solution cargo uses resolves all of these issues without any need for additional meaningless complexity:

See http://doc.crates.io/specifying-dependencies.html, and specifically http://doc.crates.io/specifying-dependencies.html#specifying-dependencies-from-git-repositories

ie. You have an idiomatic resolution structure for dependencies, which resolves to public remote repositories, and for anything else you just specify any valid git remote.

../foo? Sure. [email protected]:repos/foo? No problem. Want to host your own cache? No problem, just manage your own gitlab clone of the public repositories and localcache/foo/blah. Want a public package registry? sure, just deploy a copy of gitlab.
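For reference, the cargo manifest entries being described look roughly like this (crate names and internal hosts are hypothetical):

```toml
# Cargo.toml
[dependencies]
# resolved from the public registry (crates.io)
serde = "1.0"
# pulled from an arbitrary git remote, optionally pinned to a branch/tag/rev
foo = { git = "https://gitlab.corp.example.com/mirrors/foo", branch = "approved" }
# a local path, handy for development
bar = { path = "../bar" }
```

The point is that public-registry, private-remote, and local-path dependencies all share one declaration mechanism.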

so... basically.

  • go get is rubbish at private repos. Let's not be 'idiomatic' and follow what go get does, when it's actually bad.

  • nuget is a great example of "host your own private repo if you want to use anything that's not the public one," and it's not great. Let's not repeat that mistake.

There's a lot of complex ideas being swung about on various tickets as solutions to this problem, and I think rather than just running off and implementing something, it might be worth thinking about using a much more simple and achievable approach, like the one cargo uses.

Why not just support arbitrary git remotes?

... and then, if we need to worry about having a central repository, we can make it be anything we like, so long as it offers a git remote endpoint.

This is a killer feature for dep; but only if it works, simply, obviously and well.

The internal registry we're looking at for this would be a stand-alone service that would cover the basics needed to serve as a registry. The idea being you can stand up an internal server on your network and treat it as a registry just like the public version. The idea is still in its early stages, but I feel like it may avoid your concerns about the fake NuGet registry.

Having to run a custom registry service to install a single private package would be a terrible workflow; that is literally the situation nuget is currently in.

The 'fake' nuget server is a work around for how bad that is.

I would like to say that some organizations cannot, for legal reasons, just download or publish packages from/to the internet. Each version used needs to be verified by one or more internal groups before it's allowed to be used. There's also the case of libraries written internally, that again for legal reasons, cannot be released to the public.

These cases are far more common than you might think. This is the whole reason why something like Sonatype Nexus exists; organizations need to control not only their own libraries but public libraries as well.

I'm not saying it's not painful, but it is needed, like it or not. We are thinking of not just GitHub or BitBucket here, but private repos. We wouldn't, however, want a public package to depend on a private package. However we should allow a private package to depend on a public package and that's what the registry will allow.

I would like to just clarify that my objection here is in no way to supporting private packages.

It is purely that having to run a service, external to dep, to achieve that is a bad workflow, and there is prior art to prove that.

An alias to map an import path to an arbitrary remote git URL solves this problem without the need to reengineer entirely new infrastructure, and that should be fundamentally a part of dep.

...but let's see how this registry thing rolls out. I'm happy to play with it and point out all the numerous problems once it exists.

I think I'm starting to better understand your concern. The current problem with private packages is that Go, Glide, Dep, etc. don't know what type of version control to use when a package isn't hosted by one of the popular source code hosting sites (e.g. GitHub) or doesn't specify the name of the VCS as part of the domain name.

One super simple option we discussed to alleviate this problem was to add an option to Dep that would allow you to specify the VCS type. After some conversation with various folks, it only solves one of many aspects of private packages. A registry service was decided upon to address several different issues. If there are lessons to be learned from something like Nuget, I'd be happy to look at some examples, but a registry is the strategic solution for Dep.

We also looked at expanding the output of a Vanity service as well as some other services that already exist in the Go ecosystem, but decided that instead of spending the next year over-engineering a solution, we should just jump in and build something simple that we could use as a starting point. The idea of a registry for Go packages does have some controversy surrounding it, but provides a comprehensive solution to existing problems with some pretty exciting integration possibilities for the future.

Our idea of a registry is a pluggable solution: a metadata backend and a binary backend, possibly starting with MongoDB for metadata and S3 for binaries. Obviously, these aren't concrete yet, but it's a starting point.

We'll just have to see how this rolls out in practice.

As long as we're prepared to treat this registry as an experiment, evaluate it, and if it turns out that hey, having to install mongodb to install a single private package is really awful, we're happy to throw it away and pick a different solution, I'm perfectly happy to keep rolling along this road.

+1 on requiring running Mongo (or really anything) being awful for this. Licensing, operations, access control, ...

How can this not be a text file or annotations in the dep files?

As mentioned previously, MongoDB was a possible first pluggable solution for metadata. We were planning on using local filesystem storage for testing and small environments, but if you plan to scale at all, especially in a public or enterprise setting, you almost have to have a database backend of some kind so metadata can be shared between multiple load-balanced instances of the registry. It could be Redis, MongoDB, MySQL, PostgreSQL, AWS DynamoDB, and so on. The idea is to make it easy to start and test with the local filesystem (perhaps BoltDB), but easy to scale up if needed.

It is absolutely an experiment, and again we just need something to start with, prove out the concept, and build on it from there.

Yes, please, on the database! I would prefer the database be used _INSTEAD OF_ .lock or whatever files that get checked in with the source. In an Enterprise environment, a database is much more useful than files scattered everywhere.

Making the database choice 'pluggable' is also essential. It is highly unlikely your default will be anything like what we use internally, unless it is at least SQL-based for the metadata. Binaries are stored in something else, as well. At least if you make it work with generic Git (instead of assuming GitHub) for source, that part could work.

I have a theory that almost all companies that want an internal registry fall into one of two categories

  1. Small enough to be ok with (and maybe prefer) a simpler, but less scalable solution
  2. Big enough to set up (and maybe prefer) a general purpose repository manager like Nexus or Artifactory

As a quick sanity check...
Please thumbs down this comment if your company doesn't fall into one of these two categories

If this comment doesn't get downvoted to oblivion, then maybe we should be thinking about this as more of a simple reference implementation than a truly scalable solution.

Switching gears to package storage, the design discussion revolved around submitting tar.gz or zip files to the registry. No direct integration with GitHub or any other system for day one. This provides an added benefit for those that don't use Git for whatever reason; you're not tied to a specific VCS. Storage for these files we're thinking S3-compatible for the first non-local-filesystem storage option. There are several software and hardware-based S3-compatible solutions so that seemed like the easiest/quickest win.

In my experience you're exactly right. Each system comes along these days with a basic tool, but companies tend not to want to run a separate server for each new format/language and gravitate towards a heterogeneous enterprise solution instead.


Big enough to set up (and maybe prefer) a general purpose repository manager like Nexus or Artifactory

I'm mostly trying to avoid putting my thumb on the scales by staying out of this thread, but I feel the need to quickly note here that I think it is an error to equate the work @deejross and @SteelPhase are currently doing with the role served by e.g. Artifactory or Nexus. While there are certainly some possible areas of overlap, I'd say offhand there's at least as much that such a generic system is never likely to be able to provide.

Most importantly, running such a service need not be mutually exclusive with, and in certain configurations could even be nicely complementary to, a generic platform.

? Like what?

Of course, since I have no idea what this repository is supposed to offer as an API, I can't say for sure, but it's hard to see what's being built here as significantly different from a customized, simplified version of those products.

You mean to say that the repository service being built offers an API that is more than just 'resolve this package' and 'download this package'? ...and perhaps, authenticate me? Like what? What Go-specific features would it offer that something like Nexus wouldn't?

I mean, I'm not really a fan of nexus at all, but I thought that this whole repo thing was about implementing basically a custom version of that to play with?

? Like what?

yeah, i should've known that chiming in would get this question 😄

i'll try to marshal my thoughts and compile a list of the relevant things, here.

but I thought that this whole repo thing was about implementing basically a custom version of that to play with?

FWIW, this has never been the frame from which i've been operating on this topic

S3 may work for some, but we don't use it in our corporate environment. We use an internal High Availability (multiple servers, failover, etc., etc.) Artifactory setup (which seems to be very popular with corporations in general) for binaries, and Git (and still a lot of CVS) for source.

Artifactory even comes with built-in facility to mirror GitHub areas, but ONLY the tarballs of the specific commits.

That's fine for keeping a version around, but it does nothing for actually using it during a Go build... we would have to provide extra tooling to retrieve the tarball from Artifactory, expand it into the 'vendor' subdir, and then run 'go build' or whatever.

I was really hoping for something a bit more transparent. In my playpen, I do `go get <package>` and it goes out to the 'net and gets it. In the corporate build, whatever tool is used should have the same effect, but it should only get the package from the internal mirror, without me having to change URLs or anything else.

If dep can make that happen (however it does it... via a database, or whatever), we will be happy campers.

I think that the road to supporting a registry in dep can be quite simple: support a configurable, global HTTP(S) URL to resolve from, plus an optional, separately configurable authentication key.
This can be done without changing the current model: keep relying 100% on DNS-based Go import paths as the canonical package identifiers, but not as the resolvable package URLs - those are HTTP-delegated to the registry, which automatically resolves and caches sources from the original URLs upon request.

As already alluded to in this thread, the registry will just have to contain frozen gz-tarballs for versioned dependencies, both for remote external dependencies and for locally deployed packages. Packages are usually stored in the registry in a similar way to how they are name-spaced in the 'vendor' directory, but with an additional tag/commit level. We successfully use the same approach in Artifactory for supporting CocoaPods, Composer, Bower and NPM dependencies - all based on source dependencies rather than binaries (npm has some exceptions, but the general concept still applies).
Happy to further help here!

Hi all, I have started coding the registry! As of now, it is very, very basic and only includes a couple of interfaces, but I hope it gets the idea of what we're going for across. Basically, pluggable binary and metadata storage with a REST API as the primary interface to the service. I haven't yet implemented any of the storage interfaces, but I will at least implement a local filesystem interface of some kind as a proof-of-concept. Additional backends will be easy to add after that.

If you see any potential issues with the skeleton I have out there now, please let me know: https://github.com/deejross/dep-registry

Thank you!

@deejross @sdboyer Will this (at some point in planned, or at least not ruled out, future) support the following case:

[email protected]:/git/httprouter.git   - a mirror of github.com/.../httprouter
[email protected]:/git/internal-lib.git - internal utils, "company.com/utils"
[email protected]:/git/app.git          - depends on httprouter and internal-lib

app.git will contain some configuration that says that _github.com/julienschmidt/httprouter_ imports in the source code can be found at [email protected]/git/httprouter.git with some commit/tag, and _company.com/utils_ can be found at [email protected]:/git/internal-lib.git with some commit/tag?

Also.. and if I'm doing some development of _company.com/utils_ library at the same time as developing app.git, I can temporarily change the configuration in my checkout of app.git to point it to the local path /path/to/dev/internal-lib?

This question, or at least similar ones, has been raised earlier in the comments, without any clear answer.

app.git will contain some configuration that says that github.com/julienschmidt/httprouter imports in the source code can be found at [email protected]/git/httprouter.git with some commit/tag, and company.com/utils can be found at [email protected]:/git/internal-lib.git with some commit/tag?

yep, these are definitely the sort of cases we're figuring out how to cover.
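For concreteness: dep's manifest already has a `source` field on constraints, which points resolution at an alternate location while keeping the canonical import path as the identifier. Whether it accepts every mirror-address form (e.g. scp-style ssh addresses) I'm not certain, so treat this as a sketch with hypothetical hostnames:

```toml
[[constraint]]
  name = "github.com/julienschmidt/httprouter"
  # Hypothetical internal mirror; the import path above stays the
  # identifier in source code, but fetching happens from here.
  source = "git@internalgit.company.com:/git/httprouter.git"
  version = "1.1.0"
```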

to point it to the local path /path/to/dev/internal-lib?

local paths end up being a very different question, entirely separate from registries. gotta deal with those separately. this gist has one possible longer-term view, and https://github.com/GetStream/vg may help you in the meantime.

  1. Developer is writing code in playpen, and includes something like ‘import “github.com/Jordan-wright/email”’

  2. Code works well, so developer pushes code for review.

  3. Review goes well, so code is submitted, and is picked up by nightly builds.

  4. Nightly build runs on machine(s) with firewall rules which PREVENT all external access, so ‘github.com’ is not directly accessible.

In this case, ‘deps’ would need to be able to resolve the import to an INTERNAL site, specified on the build system, preferably with multiple possibilities (command-line option, env var, etc.).

Whether it did this by replacing ‘github.com’, prefixing it, or whatever is of no consequence. There just needs to be a way to make it resolve to an internal system transparently to the code. All transitive dependencies should work transparently as well, with explanatory error messages when something cannot be found/resolved.

In other words, the code should work “as-is” with no modifications.

This scenario assumes a completely separate process for getting packages approved and populated on the internal system.

Stretch goals: Developers given a method to test what will happen when code is submitted for nightly builds. ‘deps’ supports a method of importing from external site, but uploading to internal staging area, independent of builds (like a common internal system separate from the developer’s playpen OR the nightly build machine(s)).


I believe I have a solution for private repos with ssh access, see PR https://github.com/golang/dep/pull/1717
