@bradfitz mentioned go doc could special-case node_modules for exclusion.
If you have many client/server projects in $GOPATH, node_modules can quickly bloat and cause go doc to resolve slowly. This is especially noticeable in editors which use go doc to display inline documentation (see vscode issue).
I recursively removed my node_modules (15 GB, crazy I know). Here are the results before and after:
$ time go doc something
real 2m25.068s
user 0m17.772s
sys 1m45.039s
$ time go doc something
real 0m19.955s
user 0m2.499s
sys 0m9.343s
/cc @robpike
@bradfitz mentioned go doc could special-case node_modules for exclusion.
That was done as a result of #16417, which you opened in 2016 and is still open.
I presume this would extend the following bit of go help packages:
Directory and file names that begin with "." or "_" are ignored by the go tool, as are directories named "testdata".
Or is it fine for go doc to skip certain directories, while other commands like go build and go test wouldn't?
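To make the question concrete, here is a minimal Go sketch of a directory walk that applies the rule quoted from go help packages, plus an optional list of extra names such as node_modules. It is not the go command's actual implementation: `skipDir`, `candidateDirs`, and `demoTree` are hypothetical names, and the in-memory tree is made up for the demo.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// skipDir reports whether a directory name is ignored under the quoted rule:
// names beginning with "." or "_", and directories named "testdata".
// extra holds hypothetical additional names, such as "node_modules".
func skipDir(name string, extra ...string) bool {
	if strings.HasPrefix(name, ".") || strings.HasPrefix(name, "_") || name == "testdata" {
		return true
	}
	for _, e := range extra {
		if name == e {
			return true
		}
	}
	return false
}

// candidateDirs walks fsys and returns every directory that survives the rule.
// The root itself is never skipped, even though its name is ".".
func candidateDirs(fsys fs.FS, extra ...string) ([]string, error) {
	var dirs []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			return nil
		}
		if p != "." && skipDir(d.Name(), extra...) {
			return fs.SkipDir // do not descend any further
		}
		dirs = append(dirs, p)
		return nil
	})
	return dirs, err
}

// demoTree builds a small made-up source tree in memory.
func demoTree() fs.FS {
	return fstest.MapFS{
		"main.go":                            {Data: []byte("package main\n")},
		"api/api.go":                         {Data: []byte("package api\n")},
		"api/testdata/sample.json":           {Data: []byte("{}")},
		"_scratch/tmp.go":                    {Data: []byte("package tmp\n")},
		"web/node_modules/left-pad/index.js": {Data: []byte("// js\n")},
	}
}

func main() {
	today, _ := candidateDirs(demoTree())
	extended, _ := candidateDirs(demoTree(), "node_modules")
	fmt.Println("current rule: ", today)
	fmt.Println("extended rule:", extended)
}
```

Under the current rule the walk still descends into web/node_modules; passing the extra name prunes it at the first directory entry, which is where the time savings reported above would come from.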
Now that I think about it, gorename took ~5-10 minutes before as well; I'll have to try it now. It does seem a little hacky to special-case directories specific to some other language/framework, but I can't say I've ever wanted to operate against anything in node_modules either, so it probably wouldn't be harmful.
To generalize this behavior, should we also ignore top-level directories listed in .gitignore? That should take care of node_modules as well.
This can't be done in just cmd/doc. If we're going to do it, it should be globally, in cmd/go, go/build, etc. (And we can't start parsing .gitignore.)
It's unfortunate to have the definition of Go packages depending on or working around problems specific to other languages. On the other hand 15 GB is pretty big.
If we do this, we have to write:
Directory and file names that begin with "." or "_" are ignored by the go tool,
as are directories named "node_modules" or "testdata".
That's a bit strange, to call out Node explicitly like that. Are there any other giant directories to avoid too?
We could also make it user-configurable in some way, but then the meaning of things like 'go list foo/...' or 'go install foo/...' becomes configuration-specific, which we've tried hard to avoid.
I've been watching this, as I've been heavily affected by it in the past via goimports and more recently go doc.
I agree that it is unfortunate to have to call out a specific language (and all its derivatives); however, given how ubiquitous node_modules is in front-end web development today, it doesn't seem unreasonable.
I've also had this problem with Carthage. Looking at my .goimportsignore, I also see a few packages that are Go wrappers around large C projects that have lots of unrelated code and that do in-place builds, leading to large numbers of build artifacts.
Some more discussion here - https://github.com/golang/go/issues/16427. I agree that we should make it user-configurable. Hardcoding node_modules just seems odd to me.
At the expense of adding new knobs, I think having a .goignore file will help a lot (given that we already have .goimportsignore). And that file can serve as a common base which can be inspected by all other tools (go doc, gorename, goimports).
I agree that adding .goignore would be better than special-casing node_modules.
Please don't create more dot files. If the tools need configuration, use a single file.
The suggestion is basically to rename .goimportsignore to .goignore so there would be a single file.
It occurs to me that if we adopt #30411 then another possible approach would be to use an environment variable.
What about go env -w GOIGNORE= (as Ian suggested)? This would apply to go doc and so on but not to the extraction of modules from your Git repos. If the files go into your git repos, they will go into the module zips.
We've been circling various options here for a while. This seems like the least bad.
go env -w GOIGNORE=
Ignore lists like this tend to grow slowly over time. This is a nice fit for a text file, which is easy to edit and append to. Env vars are much less edit-friendly. Every time I type export PATH="$PATH": I experience a Rob-like twinge of exasperation. (Or anyway, I imagine it is Rob-like.)
Ian suggests that we might also consider a special file _in_ the directory to be ignored. For example, we could say that when the go command walks into a directory and finds a file named .goignore or .goskipdir or .gorobots.txt, it turns around and walks back out immediately. So if you needed a node_modules inside your Go module working directory, you'd just put this magic file in there too.
Thoughts? Name bikesheds?
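A minimal Go sketch of the "turn around and walk back out" behavior, for illustration only. The marker name .goignore is just one of the candidates floated above, and `hasSentinel`, `walkSkippingSentinels`, and `demoTree` are hypothetical; nothing like this exists in the go command.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// sentinel is a hypothetical marker file name taken from the proposal.
const sentinel = ".goignore"

// hasSentinel reports whether dir directly contains the marker file.
func hasSentinel(fsys fs.FS, dir string) bool {
	_, err := fs.Stat(fsys, path.Join(dir, sentinel))
	return err == nil
}

// walkSkippingSentinels returns the directories a tool would still visit:
// any directory holding the marker is abandoned without reading its children.
func walkSkippingSentinels(fsys fs.FS) ([]string, error) {
	var dirs []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			return nil
		}
		if hasSentinel(fsys, p) {
			return fs.SkipDir // walk back out immediately
		}
		dirs = append(dirs, p)
		return nil
	})
	return dirs, err
}

// demoTree is a made-up module with a marked node_modules directory.
func demoTree() fs.FS {
	return fstest.MapFS{
		"go.mod":                             {Data: []byte("module example.test/app\n")},
		"server/main.go":                     {Data: []byte("package main\n")},
		"web/assets/embed.go":                {Data: []byte("package assets\n")},
		"web/node_modules/.goignore":         {},
		"web/node_modules/left-pad/index.js": {Data: []byte("// js\n")},
	}
}

func main() {
	dirs, err := walkSkippingSentinels(demoTree())
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println(dirs)
}
```

The cost is one extra stat per directory visited, in exchange for never listing the contents of a marked directory.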
That can be a bit annoying to work with for directories that sometimes get removed and then replaced with newer versions or for directories owned by a tool that then needs to be told to ignore the sentinel file.
It semi-makes sense to include the blacklist in go.mod since it would be listing directories that, though they happen to be contained in the module root, are not actually part of the module.
Ian suggests that we might also consider a special file in the directory to be ignored.
A major problem that I see is that it will need .gitignore'd directories to be in the filesystem. For example, node_modules is not checked in to a repo. But in this case, we would need an empty node_modules to be present, just for that special file to reside in. That does not feel very elegant.
Moreover, this will create multiple files peppered throughout the repo compared to a single file. It also becomes harder to see at a glance which directories are ignored.
What is the advantage of this solution over having a single dot file, since both use a special magic file anyway (which I thought was the main point of contention)?
Typically the directories that should be ignored don't have to live in a Go repo at all. After all, by definition they don't contain any Go code. So the advantage of a file in a directory is that only people who create that directory will need to put the file in it. If we use a single dot file, it's hard to avoid putting that file into the Go repo, but now people who create a directory that doesn't contain any Go code have to maintain a local patch to the Go repo. (That argument does not apply to .gitignore since git supports putting ignore patterns outside of the repo.)
(I'm not saying we have to do it that way, I'm just answering the question about the advantage of this approach over a single dot file.)
Android has a .nomedia empty file and I think it works nicely. I think it would be a cleaner solution than adding more config files or more env vars. If anyone wishes to mix Go with other tools/languages in a single repository which produce tons of files, I assume they can make sure to also add such a file as necessary.
Thanks for the clarification Ian.
The only issue I see is that new users cloning a node repo have to go into the node_modules folder and create the special file every time, as opposed to the folder ignore path being already there in the single-file solution. But your point on maintaining local Go patches is valid as well. I am fine with both solutions.
So the advantage of a file in a directory is that only people who create that directory will need to put the file in it.
I'm concerned that this won't play well with external dependencies, either managed via git submodules or via some package manager. To get the go tool to ignore them, we'd have to add a Go-specific file to a completely non-Go repository; I can't imagine other projects being happy about that. The alternative is to add the file yourself after downloading the code, but dirtying the repo will also be an annoyance; it'll cause git submodule pain, any package manager that checks directory contents for security purposes may fail, etc.
There really shouldn't be these directories inside module trees to begin with, so hopefully this is a rare case.
In this rare case, is it unreasonable to expect people to put .goignore files in those directories? If a script is deleting and recreating that directory, it can create the file, right? Is there an established non-Go-specific convention? (.nomedia doesn't sound right, but anything else?)
@rsc sounds fine to me! But if the tooling doesn't fall back on GOPATH maybe this is a non-issue going forward
The discussion has meandered a bit, but the original reporter at least (@tj) is happy with the ".goignore in the directory to be ignored" solution, and this will become less of a problem as GOPATH is phased out anyway.
Does anyone object to moving forward with this .goignore behavior?
I somewhat share Josh's concern that we are pushing the responsibility of modifying Go-specific behavior to a non-Go user. But overall, it is a +1 from my side.
Happy to work on it if this gets accepted.
Still not happy about this, and now that GOPATH is going away it seems even less suitable.
Does anyone object to moving forward with this .goignore behavior?
No objection.
this will become less of a problem as GOPATH is phased out anyway
The problem will, however, still remain for the main module (which is our use case at least).
To get the go tool to ignore them, we'd have to add a Go-specific file to a completely non-Go repository
@josharian what workflow are you thinking of where this would be necessary? External dependencies will not likely have things like node_modules committed (per @rsc in https://github.com/golang/go/issues/30058#issuecomment-486372656). And if an external dependency does have a large directory that contains non-Go code that should be ignored, then I think it's reasonable to ask for a .goignore to be added (by definition, the external dependency must have _some_ Go code).
We could close it for now and revisit this later if it's a problem with Go modules. Now that tooling won't be scanning all of my client/server apps full of node_modules this shouldn't be a huge problem.
In an ideal world there would be a standard like a .nocode or .codeignore file that would be language+tool agnostic. I don't think that exists, so it seems to me like .goignore is reasonable.
this will become less of a problem as GOPATH is phased out anyway
The problem will, however, still remain for the main module (which is our use case at least).
I think the ordering of my response above might have given the impression that I wasn't objecting to the proposal _despite_ the problem still remaining for the main module.
Just to clarify, the proposed .goignore will _also_ work within the main module, hence I have no objection to the proposal.
If it's only about node_modules there is work being done to replace it with PnP https://yarnpkg.com/lang/en/docs/pnp/
Interesting, thanks for the insight @kaey. That seems to support @tj's idea that perhaps we can revisit this later if it continues to be a problem in the future.
Given @tj and @mvdan's comments, it sounds like we can just do nothing for now, learn more about how modules work, and come back if we need to. So this sounds like a likely decline.
Leaving open for a week for final comments.
@rsc are you referring to @tj's comment in https://github.com/golang/go/issues/30058#issuecomment-537871142?
This problem remains in a Go modules world, where node_modules (or similar) exists within the main module (we have exactly that situation in our mono repo). Hence in https://github.com/golang/go/issues/30058#issuecomment-537870038 I agreed with the proposal of using a .goignore or similar.
Therefore I would suggest it's still worth fixing.
How about moving node_modules or any vendor directory out of the git repo? Maybe one directory up, and just export its path variable?
It sounds like there is not yet consensus on whether this is needed. If we assume that some people could use this, does anybody object to doing it? Leaving open for further discussion.
-- for @golang/proposal-review
This is a feature that will add files to git repositories, and will be very hard to remove or undo in the future. If we're not sure about it, I think we should default to not doing anything for now.
Therefore I would suggest it's still worth fixing.
I think what was meant is that this problem was far worse in the GOPATH world than in the modules world. Any one GOPATH is likely to be far larger than any one module. Moreover, module archives downloaded from proxy.golang.org don't contain directories that aren't Go packages, so I'd imagine node_modules/ wouldn't even exist for downloaded dependencies (forgetting about the main module for a second).
Just chatted to @mvdan about this. I'm going to row back on my conclusion in https://github.com/golang/go/issues/30058#issuecomment-541222983. Yes, the problem does still remain within the main module (for us it adds ~500ms to a go list ./... invocation), but I think I agree with the conclusion that, despite this, its impact is significantly less than in a GOPATH world. Furthermore, it's not something that is going to affect a sufficiently large number of users as to warrant such an invasive change.
Moreover, module archives downloaded from proxy.golang.org don't contain directories that aren't Go packages
I think this previous statement of mine was wrong, as pointed out by @myitcv. Still, it's rare to have to scan the files in one's dependencies. Usually this only happens within the main/current module, like Paul mentioned.
Directories can be excluded from module archives by placing empty go.mod in them AFAIK
I'm not convinced there is consensus here about what to do, so I think we should probably hold back and do nothing.
Agreed, moving back to final-comment-period as above.
I'm not sure if the file name bike-shedding time is already over or not, but since we have go.mod and go.sum, why not also have a go.ignore? Or, even better, go.away. This would also address @robpike's concerns about the proliferation of dot-files.
There was one comment (by @myitcv) after https://github.com/golang/go/issues/30058#issuecomment-540110056 marking this as a likely decline, but he withdrew it. Since there are no other comments, declining.
I would like to add one more use case, that is purely Go related.
Consider a private Go library that I want to publish docs for (also privately). I think it's a common use case. One way to solve it is to use godoc, grab the HTML content with wget, commit it to the repository, and set up GitHub Pages targeting that folder in my private GitHub.
Quite a lot of content might end up in that directory, and thanks to wget some of the files might have weird names containing characters like : that ultimately break the module zip logic and prevent the module from being consumed as a dependency.
Overall I think it's pointless to try to capture all the possible use cases. It would be safer to just assume that users will want to keep arbitrary content in the repository, and that they should be allowed to exclude it from internal Go file scans.
I would like to propose to reopen this issue.
I just encountered a need for something like .goignore myself.
I have a repo with some non-go files, as well as subdirs that are working dirs and can fill with huge numbers of .o's and other non-go files[1]. I never had problems in the past because I simply pointed GOPATH to a dir within the repo, and the majority of non-go files happened to be outside that dir.
I recently open sourced the code [2], and (unsurprisingly) people aren't too happy that they must either set GOPATH or rewrite all the import paths. I started re-arranging the repo content to get rid of the setup requirements, but in doing so the dirs with myriad non-go files are brought into scope for go generate ./..., go test ./..., and the like. _If_ the user builds with the build system (mage), that's easy enough to work around - but not otherwise. And getting people to use the build system doesn't solve issues with goimports or other tools an IDE is likely to run.
Having some sort of file that can tell go tooling to not scan certain dirs would be very nice. I don't particularly care whether it's one file in the repo root containing dir paths or wildcards, or an empty file in each dir to be skipped, or what; neither do I care whether it's a hidden file or visible - it just needs to _work_.
Modules may alleviate the problem, but definitely not until my upstream dependencies work with them. If I understand correctly, modules aren't compatible with build-time code generation; that means a significant amount of work as well as lost features - so not a near-term solution.
Separating the go and non-go files into different repos isn't a solution I'm willing to consider, as one is essentially useless without the other; go code is used to build the non-go bits, and go binaries are embedded within the resulting files (linux kernels with embedded initramfs).
[1] These non-go files are source for and outputs from buildroot and the linux kernel.
[2] github.com/purecloudlabs/gprovision; note that the changes with respect to GOPATH are not in that repo.
Appreciate this is closed, and it might just be my convention, but I'm using a deployments folder which contains infrastructure code (AWS CDK to be specific) for my Go project; this is a Node.js library. Vendor doesn't seem like the right folder for this, and unfortunately NPM tooling still doesn't allow us to change the path of node_modules. What are people's thoughts? I want to avoid having a root "go" dir. I could put the deployment code in Vendor, but again that feels like it breaks that standard... 😄
For those of you coming late to the thread, your simplest solution is a go.mod file, as suggested in https://github.com/golang/go/issues/30058#issuecomment-543815369. In module mode, Go tools will completely ignore packages (directories) that belong to a different module.
If you're in GOPATH mode you can't use this solution, but most Go projects have already moved to modules by now.
@mvdan I have been using this workaround and it seems to be working fine within Go ecosystem. However JetBrains seems not to agree https://youtrack.jetbrains.com/issue/GO-9320
I use the go.mod workaround, but with a non-empty file (see the JetBrains issue above). Here is my go.mod:
module fake_go_module // is used to not try to find go files in node_modules
go 1.14
Any chance of reopening this? We have both bazel-* and node_modules in the working directory, it's really painful to run any of the go tools from there.
@ashi009 please clarify why an empty go.mod file is not sufficient, as suggested in https://github.com/golang/go/issues/30058#issuecomment-543815369. Placing a go.mod file should be very similar to placing a potential .goignore file, and does not require extra features and documentation.
@mvdan bazel-* is automatically generated by bazel, so I won't be able to ship such config as part of the devenv, and one must do that manually. node_modules is somewhat easier, as I could use some magic package.json thingy to add it, yet still hacky-ish.
I can blame the nature of monorepo for this, but given that GOPATH is going away (we used to serve a synthetic GOPATH with FUSE), there aren't really many choices left for us.
Why is adding a .goignore file easy, but adding a go.mod file hard? Their location and effect would be the exact same in this context.
Why is adding a .goignore file easy, but adding a go.mod file hard? Their location and effect would be the exact same in this context.
I'm not in favor of the .goignore-in-directory approach; I'm more in favor of https://github.com/golang/go/issues/16417. Guess I'm commenting on the wrong issue after all.
Was there ever a resolution or workaround for this issue? I.e. there's a folder like a cache or temporary folder that isn't meant to be built, but is within the path of a Go application.
@alexellis see https://github.com/golang/go/issues/30058#issuecomment-644128518. You can also prepend directory names with . or _ to make the go tool skip them, see https://github.com/golang/go/issues/30058#issuecomment-459888562.
Having the same problem as @ashi009 with bazel-* directories. Putting an empty go.mod everywhere is not something that can be replicated across different members of the team. A global ignore file in the root of the repo seems like something most users would know how to deal with. Bazel itself has a .bazelignore file too :)
Did anyone consider using go.mod for this purpose? What about adding a separate directive in go.mod that would allow excluding certain directories for all the go tools?
Or maybe extend the already existing exclude directive to support file paths?