Pkg.jl: Dependency confusion between internal registries and General

Created on 10 Feb 2021 · 11 Comments · Source: JuliaLang/Pkg.jl

A recent, novel supply chain attack on some package managers is also possible in certain Pkg/Registry configurations.

The gist of it is that some package managers, when given a package name, by default look in "internal" repos first, then also check the "public" repos and install whichever returns a higher version number. For this attack to be successful in Pkg, the attacker would also have to know the UUID of the internal package and register a package in General with both the same name and the same UUID, but a higher version number (e.g. 9001.0.0). Once registered, Pkg installs whichever version is higher, thereby allowing "shadowing" of the internal package with a malicious package.
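To make the mechanism concrete, here is a toy model of the "highest version wins" selection described above. It is only a sketch and not Pkg's actual resolver; the `Entry` type, the `naive_resolve` function, the package name `SecretPkg`, and the UUID are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    registry: str   # name of the registry that lists this version
    name: str       # package name
    uuid: str       # package UUID
    version: tuple  # (major, minor, patch)


def naive_resolve(entries, name, uuid):
    """Pick the highest version among all registries listing (name, uuid)."""
    candidates = [e for e in entries if e.name == name and e.uuid == uuid]
    return max(candidates, key=lambda e: e.version)


# Hypothetical UUID, shared by the internal package and the attacker's copy.
UUID = "11111111-2222-3333-4444-555555555555"

entries = [
    Entry("Internal", "SecretPkg", UUID, (1, 4, 2)),
    Entry("General", "SecretPkg", UUID, (9001, 0, 0)),  # attacker's entry
]

chosen = naive_resolve(entries, "SecretPkg", UUID)
print(chosen.registry, chosen.version)
```

In this model the General entry is selected, because 9001.0.0 outranks the internal 1.4.2 even though both entries share one name and UUID.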

An MWE can be found at https://github.com/DilumAluthge/MWE_multiple_registries_same_package_uuid.

The intention for all possible fixes is to preserve the ability for multiple registries to provide the same package, without allowing attackers to intentionally register a package with the same name & UUID as a package in a different registry and mislead people into downloading their malicious package.


A non-breaking fix is for each private registry user who also uses General to use the 3-day waiting period to monitor new package registrations to General for clashes. This should be automatable with some tooling that comments on the PR to General and thereby stops the automerge. As a precaution, private registry users may want to create new UUIDs for their internal packages and investigate how the UUID leaked in the first place.
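The core check such monitoring tooling would perform can be sketched as follows. This is a minimal illustration, not an existing tool: `find_clashes` and its data shapes are assumptions, and fetching the pending registrations from General and commenting on the PR are left out entirely.

```python
def find_clashes(internal, proposed):
    """Return pending registrations whose UUID matches an internal package.

    internal: dict mapping UUID string -> package name (the private registry)
    proposed: iterable of (name, uuid) pairs pending registration in General
    """
    # Normalize case so that textual differences in the UUID do not hide a clash.
    internal_by_uuid = {u.lower(): n for u, n in internal.items()}
    clashes = []
    for name, uuid in proposed:
        key = uuid.lower()
        if key in internal_by_uuid:
            clashes.append({
                "uuid": key,
                "proposed_name": name,
                "internal_name": internal_by_uuid[key],
                "same_name": name == internal_by_uuid[key],
            })
    return clashes


# Hypothetical data for illustration only.
internal = {"11111111-2222-3333-4444-555555555555": "SecretPkg"}
proposed = [
    ("SecretPkg", "11111111-2222-3333-4444-555555555555"),
    ("Unrelated", "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"),
]

clashes = find_clashes(internal, proposed)
```

Any non-empty result would be the trigger for commenting on the corresponding PR within the waiting period.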

Another non-breaking fix on a registry-per-registry basis would be to mirror & vet General manually, though this is somewhat high maintenance and thus unlikely to be useful in practice. This would also require some investigation into how the UUID leaked, should a mismatch be detected.

A possible long-term fix would involve a new shadowable entry in Package.toml, which would be opt-in and signal that the package is allowed to come from other registries as well. In this model, all installed registries that have the same combination of (name, UUID) would also need to have shadowable=true set for that package. If any registry doesn't have this set, we error.
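The rule above could be sketched like this. It is a model of the proposal only, under the assumption that each registry is represented as a dict keyed by (name, UUID); `check_shadowable` and `ShadowingError` are hypothetical names and nothing here reflects how Pkg actually stores registry data.

```python
class ShadowingError(Exception):
    """Raised when a package is provided by several registries without opt-in."""


def check_shadowable(registries, name, uuid):
    """Return the registries providing (name, uuid), or raise on a conflict.

    registries: dict mapping registry name -> dict mapping (name, uuid) ->
                dict of that package's Package.toml entries
    """
    providers = [reg for reg, pkgs in registries.items() if (name, uuid) in pkgs]
    if len(providers) <= 1:
        return providers  # no shadowing possible with a single provider

    # Every providing registry must have opted in with shadowable = true.
    blocking = [
        reg for reg in providers
        if not registries[reg][(name, uuid)].get("shadowable", False)
    ]
    if blocking:
        raise ShadowingError(
            f"{name} [{uuid}] is in {providers}, "
            f"but shadowable is not set in {blocking}"
        )
    return providers
```

With an internal registry that leaves `shadowable` unset, a same-name, same-UUID entry appearing in General makes the check raise instead of silently picking a version.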

This would be a breaking change, so our options are:

  1. Do it in Julia 2.0.
  2. Do it in Julia 1.x, but make a lot of Slack posts and Discourse posts informing people of the change, and make it easy for people (JC, Invenia, Beacon, etc) to make the changes needed. This would only affect local/internal registries that have shared a package with other registries (e.g. by open sourcing them to General). We should work closely with those who are known to have open sourced packages to make their transition as easy as possible.
registries security

All 11 comments

Many thanks to @ericphanson and @DilumAluthge for brainstorming both short- and long-term fixes to this!

Another complication is the case when two intentionally public registries (e.g. General and HolyLabRegistry) are able to shadow each other. This could be prevented by specifying exactly which registries are allowed to shadow the package (or rather, which registries are allowed to be trusted for a specific package). These lists would have to agree across all registries, otherwise we error.

I'm not 100% sure about this, because it requires all registries to be updated & maintained at roughly the same time. Kind of feels like a mismatch is bound to happen eventually here :/

I think they could just both put shadowable=true, and then opt-out of this safety check. I don't think that case is really an issue because both packages are controlled by the same maintainers.

I think the security issue is only when PkgX is in RegistryA but not RegistryB, and users of PkgX use both registries, and an adversary has the ability to register a package with the same name and UUID in RegistryB. I think for most practical purposes the only possible RegistryB is General, since it has publicly-available automerge and is installed by default on all Julia installations.

Therefore, if PkgX is already in General, there isn't really a security issue and one can just use the opt-out.

edit: deleted paragraph saying exactly the same thing as you @seelengrab about using a list of registries :). I don't think that's really needed at this point and adds to the burden like you said.

The issue of shadowing packages from other public registries with a new package in General could also be mitigated by having auto-merge block registration of UUIDs that exist in public registries it has been configured to know about.

A possible long-term fix would involve a new shadowable entry in Package.toml, which would be opt-in and signal that the package is allowed to come from other registries as well. In this model, all installed registries that have the same combination of (name, UUID) would also need to have shadowable=true set for that package. If any registry doesn't have this set, we error.

A non-breaking variation of this would be to only disallow merging if Registry.toml contains a field with that meaning, and have an option to allow merging on a per-package basis by a mergable entry in Package.toml. Possibly this could also specify exactly which registries to allow merging with. Thus General would not make use of this feature and would be completely unaffected, but you could set it in your private registry to make sure that your packages cannot be shadowed by General. And in case you do make some of your packages public, there's an override mechanism to allow those to be merged with General.
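A rough model of this variation is shown below. The comment does not name the Registry.toml field, so `disallow_merging` is a placeholder chosen here; `merge_allowed` and the dict layout are likewise assumptions made for illustration.

```python
def merge_allowed(registries, name, uuid):
    """Decide whether (name, uuid) may be merged across registries.

    registries: dict mapping registry name -> {
        "disallow_merging": bool,  # placeholder for the Registry.toml field
        "packages": {(name, uuid): {"mergable": bool, ...}},
    }
    """
    providers = {
        reg: data for reg, data in registries.items()
        if (name, uuid) in data["packages"]
    }
    if len(providers) <= 1:
        return True  # nothing to merge

    for data in providers.values():
        restricted = data.get("disallow_merging", False)
        overridden = data["packages"][(name, uuid)].get("mergable", False)
        # A restricting registry vetoes the merge unless the package opts back in.
        if restricted and not overridden:
            return False
    return True
```

A registry that never sets the field, as General would not, imposes no restriction, which is what keeps this variation non-breaking.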

I had thought through letting private registries publish a salted, hashed list of UUIDs that could be checked for collisions, but there's a bit of a problem with that: how do you distinguish the original author taking their own package open source from someone else trying to hijack their private package UUID? One answer could be that we examine the situation manually and make a judgement. Otherwise it seems like there needs to be a way of proving that you were the party that submitted the original salted and hashed entry, which gets into tricky crypto territory. Not impossible, but not simple.
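For reference, the collision check itself could look roughly like the sketch below. It assumes SHA-256 and a salt published alongside the list so that a checker can recompute the hashes; neither choice comes from the discussion, and `hash_uuid`, `publish`, and `collides` are hypothetical helpers.

```python
import hashlib


def hash_uuid(uuid, salt):
    """Hash one UUID together with the registry's salt (bytes)."""
    return hashlib.sha256(salt + uuid.lower().encode("ascii")).hexdigest()


def publish(uuids, salt):
    """Build the set of hashes a private registry would publish."""
    return {hash_uuid(u, salt) for u in uuids}


def collides(candidate_uuid, salt, published):
    """Check a candidate UUID from a new registration against the list."""
    return hash_uuid(candidate_uuid, salt) in published


# Hypothetical salt and UUID for illustration only.
salt = b"example-registry-salt"
published = publish(["11111111-2222-3333-4444-555555555555"], salt)
```

The check only reports that a collision exists. It says nothing about who is submitting the colliding registration, which is exactly the gap described above.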

Moreover, once package authors need to have proof that they "own" a UUID (in the above scenario, just to be able to take it public at some point), then why bother with the rest? If authors have a private key, they can just sign each release's hash and those signatures can be checked with the public key that's in the registry. If the public keys in different registries don't match, then the client can refuse to install.

You would want a way for authorities that you trust to sign versions with other keys so that your local admins can publish hotfix versions of public packages, but that can also be arranged.

In the public version of this that is implemented in AutoMerge, the rule is to allow registration of a protected UUID if name and repo match, on the assumption that you can't effectively hijack a package if you still have to point it to the original author's repo.
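As described, the rule amounts to something like the following. This is a paraphrase in code, not AutoMerge's actual implementation; `may_register`, the shape of the `protected` table, and the example URLs are assumptions, and the exact string comparison ignores URL normalization.

```python
def may_register(protected, name, uuid, repo):
    """Apply the name-and-repo rule for protected UUIDs.

    protected: dict mapping lowercase UUID string -> (name, repo URL)
    """
    entry = protected.get(uuid.lower())
    if entry is None:
        return True  # UUID is not protected; other checks still apply
    # A protected UUID may only be registered with the original name and repo.
    return entry == (name, repo)


# Hypothetical protected entry for illustration only.
protected = {
    "11111111-2222-3333-4444-555555555555":
        ("SomePkg", "https://example.com/original/SomePkg.jl"),
}
```

A registration that reuses the UUID but points at a different repo is rejected. A legitimate repo move is rejected too.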

Wasn't the design for the hashed version similar in that respect?

Right, but what if you need to change the URL? It's all nice in theory to think that URLs are forever but we know that in reality they are not. If a new hashed record is submitted, how do we decide whether it's ok to let it replace the old one?

Fundamentally we have the same problem with repo changes of packages that have already been registered in General, except of course that there is more data to make a judgement call from. The easy way out, with its own problems, is to require someone who has protected a UUID in this way and wants to open source the package to do that with a new UUID.

That's a bitter pill to swallow given that the current system has made it so smooth and easy for people to open source their private packages and keep using them without issues. And that's something we really want to encourage.

We don't have to do something mechanical and rigid here: if an organization has previously published a hashed list of UUIDs, is taking something open source and wants to change the URL, we can always evaluate it using human judgement; the only thing that needs to be automatic is the rejection of attack attempts. But that approach does mean that we need to be able to identify what's going on with the hash lists in order to be able to make judgements about whether a URL change should be allowed or not.


Unrelated, but here's my high-level thinking about this problem. Fundamentally this is about who each person trusts to publish new versions of various packages. One generally trusts the original author, so when they make a new release, we're happy to upgrade to it. We also sometimes want to trust some other entity like our own organization's sysadmins to publish new versions of packages they don't maintain for hotfixes and the like. But that should be a conscious choice on the part of the user or a preconfigured policy on corporate machines.
