_From @sdboyer on December 13, 2016 0:31_
GitHub thinks github.com/Sirupsen/logrus is the same as github.com/sirupsen/logrus. We need to make sure we handle this...correctly. Ish. Sort of. Ugh.
_Copied from original issue: sdboyer/gps#116_
I'll try to summarize the discussion that happened in Slack:
Sam detailed the following solution:
- the system that deduces repo/project roots from import paths needs to be expanded to return some kind of informational structure of case sensitivity for different elements of the paths
the entry point to this deduction system is here: https://github.com/sdboyer/gps/blob/master/source_manager.go#L429, which quickly passes off to https://github.com/sdboyer/gps/blob/master/deduce.go#L562
all (or at least almost all) of the work of root deduction is centralized in that deduce.go file
yes. for the full solution, the return type of that initial method would have to change from aProjectRoot(which is justtype ProjectRoot string) to something that also contains the case sensitivity info
- the system that manages repositories on disk needs to learn how to read those case sensitivity info structures, and we'll need a new trie implementation that works with them
the entry point to this is https://github.com/sdboyer/gps/blob/master/source.go#L61; it's responsible for creatingsourceGateway, which are the objects i mentioned that we're "very careful to ensure there's only one of, per repository" called by almost all of the otherSourceManagermethods in order to do their work
these methods are both fairly complicated b/c they have to block as little as possible under high concurrency, but also avoid duplicating work in a way that could result in duplicatesourceGateways being created
and all three of these https://github.com/sdboyer/gps/blob/master/source.go#L55-L57 are where the lookups actually happen - those keys would need to be made selectively case insensitive
and i don't even want to reach into the solver, because frankly it's everywhere
- the solver itself will need to stop using simple
map[string]<...>types, and use a custom map that, again, works with these case sensitivity info structures
but the risk and complexity of that is considerable, so we'll likely have to settle for some harm reductive approaches in the meantime
The following was also proposed:
assume that everything up to the root of the project is case _insensitive_, and that everything after it is case _sensitive_
so yeah, i'm now intrigued by the idea that, maybe, we just internally normalize to lower case
(for only the project root portion of an import path)
[...] i also think it would probably work _more_ often than the current system does
i think we don't actually touch the solver at all - if your codebase imports two paths that differ only in case, then we don't try to normalize that - we just (optionally) warn
all we _have_ to do is make sure that nothing blows up catastrophically when there are case-only differences in a repo and the user is on a case-insensitive filesystem, as with your bug
The issue I found with these solutions is that the solver will select 2 (most likely different) versions for each of github.com/Sirupsen/logrus and github.com/sirupsen/logrus. It also requires that we switch the source cache lock to be at the repository level, instead of a global lock.
The issue I see here is not orchestrating the source gateways but ensuring that the solver understands that github.com/Sirupsen/logrus and github.com/sirupsen/logrus are the same package.
I'm not sure if the compiler is case-sensitive wrt import paths, but I'd assume it relies on the file system for that.
We can consider the lower-case path the canonical import path for packages on Github (and other sources with the same issue) but the compiler will still expect to find 2 directories github.com/sirupsen/logrus and github.com/Sirupsen/logrus on the file system if it's case-sensitive and both are imported in the codebase.
This introduces another issue, ensuring that the package is vendored twice on case-sensitive file systems (for both cases).
An alternative is to just fail when a package is detected with 2 different paths (different letter-casing). This is not a solution, but it's the simplest thing we can do for the time being. 馃槥
aaaghhhh... yes, good point. we'd still need to record both/all casings in the lock, then determine at vendor write time whether the fs is case sensitive, and if it is, write them all instead of just the canonical lowercase one.
that potentially breaks other parts of the model, too - particularly related to vendor verification.
so yeah, the simplest thing right now probably is detecting the situation and erroring out earlier 馃槩
@sdboyer gps.ValidateParams (in #697) seems like a good place for a check like this, right?
@ibrasho hmm...they could probably rely on the same code, but this case is narrower, as we wouldn't be constructing a whole SolveParameters.
I just discovered some things:
go build errors out when both cases are present on a case-sensitive filesystem (ext4): XXXXX: case-insensitive import collision: "github.com/sixgill/service/vendor/github.com/Sirupsen/logrus" and "github.com/sixgill/service/vendor/github.com/sirupsen/logrus"
AFAICT this results in an impass: go won't build with multiple casings in your project code or vendor packages on a case sensitive filesystem, but you need both casings to build.
Correction: go build only errors out if there are imports with multiple casings.
We are using glide's aliasing feature to automatically rewrite imports in vendor using the old casing to the new one.
It's not clear how to solve this in dep without rewriting vendor package imports.
Anyone else have insights/experimental results?
What I did (not proud of it):
Added a gofmt to rewrite vendor packages:
vendor: $(BIN)/dep Gopkg.lock Gopkg.toml
$(BIN)/dep ensure
gofmt -w -r '"github.com/Sirupsen/logrus" -> "github.com/sirupsen/logrus"' $(shell find vendor -name \*.go -type f -print)
Edit: What we actually ended up with:
vendor: $(BIN)/dep Gopkg.lock Gopkg.toml
$(BIN)/dep ensure -v
# TODO(eric): kill it with fire once https://github.com/golang/dep/issues/433 is resolved
# Workaround for OSX sed behaving differently
case "${PLATFORM}" in \
Darwin) find vendor -type f -name "*.go" -print0 | xargs -0 sed -i '' 's/Sirupsen\/logrus/sirupsen\/logrus/g' ;;\
*) find vendor -type f -name "*.go" -print0 | xargs -0 sed -i 's/Sirupsen\/logrus/sirupsen\/logrus/g' ;;\
esac ;\
touch $@
thanks for the added info, @edrex!
It's not clear how to solve this in dep without rewriting vendor package imports.
馃槺 馃槺 馃槺 馃挜 馃挜 馃拃
go builderrors out when both cases are present on a case-sensitive filesystem (ext4):
honestly, i'm not even sure if this makes it worse or better. probably better, but it's still kind of horrifying. i need to ponder some more.
OK, I've mused on this now, and it's clear to me that there's just no way we can fix this without directly rewriting import paths. That's something we won't do. However, the current situation creates totally worthless errors - panics, even - and needs to be improved. That, we CAN do.
First, for context - confirming what @edrex discovered, the go toolchain disallows having two imports in a single program that differ only by case. While it's not actually in the spec, the toolchain disallowing it is good enough for me to treat it as a rule. To treat it as a rule, we need to enforce it in the solver itself. That means this is going to be a fairly tricky, or at least arduous, implementation.
Here's a quick sketch of what I think will be needed:
toFold() implementationcompleteDep or workingConstraint? - to track the difference between an original ProjectRoot and the folded nameThere's also some things to do in the SourceManager: as soon as a string or ProjectRoot enters one of the methods, we need to immediately fold it and put it into some kind of 2-tuple that stores both pieces, then store/key on that as appropriate. tbh I can't even fully think about all the places this will change things - but there are a lot.
Most helpful comment
What I did (not proud of it):
Added a
gofmtto rewrite vendor packages:Edit: What we actually ended up with: