Nixpkgs: Get rid of md5 support for fixed-output derivations

Created on 12 Oct 2014  Â·  49Comments  Â·  Source: NixOS/nixpkgs

We're in 2014 and even universities have a course where students forge md5 hashes of files.

Biggest usage of md5 hashes in nixpkgs is python, followed by libreoffice (scripted install).

$ git grep "md5 ="|grep -v libreoffice | grep -v python-|grep -v redhat|grep -v suse|wc -l                                                                                                        
141

$ git grep "md5 =" pkgs/top-level/python-packages.nix | wc -l
273

Observations:

_nix-prefetch- should print out multiple hashes together with fetch functions supporting and verifying all of specified hashes_

Q/A:

_it is considered best practice to use it when that's what upstream provides_

That's a very bad security practice. It trades user security for few seconds of maintainer time.

TODO

enhancement blocker security sprintable python

Most helpful comment

I think deprecation is supposed to annoy people with out-of-tree Nix packages to finally update with some time window, instead of abrupt disappearance of functionality?

(I do not have any personal interest in NixPkgs fetch* MD5 support, though)

All 49 comments

This is not as trivial as i thought because nix-prefetch-url does not check if the downloaded archive is a valid archive. e.g. my update for cups is invalid because nix-prefetch-url just downloaded a HTML file instead of an tar.bz2. i will prepare a fix. At least for cups. Or should we revert commit ?

Fixing cups is OK.

:+1:

Can we broaden this to also deprecate sha1 for similar reasons? It's not as broken as md5, but it's also not considered particularly good when used for security purposes.

Perhaps fetchurl and friends could even display a warning when passed md5 or sha1s that tells you that the practice is deprecated and give you a sha256 to replace it with in the expression.

Can we broaden this to also deprecate sha1 for similar reasons? It's not as broken as md5, but it's also not considered particularly good when used for security purposes.

Unfortunately, some projects still publish signed lists of SHA1s and
some Nix package maintainers seem to use these lists.

Another problem is that we don't seem to support SHA512: Firefox has
official SHA1 list and SHA512 list.

Perhaps we could then include both, making the sha256 attribute mandatory? The sha1/md5 can be used to check against upstream lists, whereas the sha256 serves to give you certainty that what you're deploying is what you think it is.

Perhaps we could then include both, making the sha256 attribute mandatory? The sha1/md5 can be used to check against upstream lists, whereas the sha256 serves to give you certainty that what you're deploying is what you think it is.

For firefox-bin the whole feasibility of including all the languages
hinges on _not_ downloading all the packages but using Mozilla-provided
checksum lists.

I think forbidding sha1 is a huge overkill ATM. In most packages there's even no way of securely retrieving the hash/tarball (no signature or https at least).

_If I wanted to attack nixpkgs, I'd become a contributor, so I'd get commit access relatively easily, and then I could sneak in whatever forged hashes I needed (unlikely to be noticed if done well)._

ATM the largest amount of md5 usage is from the new texlive, as its list of tarball hashes is taken from upstream and they only provide md5.

However, the transformed fixed-output derivations use sha1 already, so md5 isn't really used by regular users.

ATM the largest amount of md5 usage is from the new texlive, as its list of tarball hashes is taken from upstream and they only provide md5.

However, the transformed fixed-output derivations use sha1 already, so md5 isn't really used by regular users.

We also have LibreOffice dependency list which is MD5-from-upstream.

It sounds like TeXLive is getting with the times: https://www.preining.info/blog/2016/01/tex-live-security-improvements/

@7c6f434c LibreOffice's main download page took me here: http://download.documentfoundation.org/libreoffice/stable/5.1.0/mac/x86_64/LibreOffice_5.1.0_MacOS_x86-64.dmg.mirrorlist which seems to provide a fine sha256.

Yeah, LibreOffice provides those now; note that they also serve the hashes over https URL.

@7c6f434c LibreOffice's main download page took me here: http://download.documentfoundation.org/libreoffice/stable/5.1.0/mac/x86_64/LibreOffice_5.1.0_MacOS_x86-64.dmg.mirrorlist which seems to provide a fine sha256.

It has around a metric ton of inner dependencies that are built in some
special ways inside LO build. They are listed in downloads.lst — with
md5 sums. And they are, ahem, quite numerous. Some of them have been
factored out, but I remember returning some of them back into the LO
build becuase of heavy breakage.

Has anyone raised the issue upstream to see if they can just switch that
process over to a more sensible hash?
On Mon, Feb 15, 2016 at 15:12 Michael Raskin [email protected]
wrote:

@7c6f434c LibreOffice's main download page took me here:
http://download.documentfoundation.org/libreoffice/stable/5.1.0/mac/x86_64/LibreOffice_5.1.0_MacOS_x86-64.dmg.mirrorlist
which seems to provide a fine sha256.

It has around a metric ton of inner dependencies that are built in some
special ways inside LO build. They are listed in downloads.lst — with
md5 sums. And they are, ahem, quite numerous. Some of them have been
factored out, but I remember returning some of them back into the LO
build becuase of heavy breakage.

—
Reply to this email directly or view it on GitHub
https://github.com/NixOS/nixpkgs/issues/4491#issuecomment-184368970.

FWIW: I submitted a (long-merged) PR reducing the # of md5s down from 273 to 13, and sha1s from ?? to 0. I implemented it by taking the current URLs, fetching them, comparing md5s, then updating to sha256 only if they updated.

The situation of non-python packages actually got worse (compared the the initial values):

$ git grep "md5 ="|grep -v libreoffice | grep -v python-|grep -v redhat|grep -v suse|wc -l
516

Edit: It seems that is mainly because of pkgs/games/steam/runtime-generated.nix.

texlive 2016 replaces md5 by sha512 (!) WIP: https://github.com/NixOS/nixpkgs/pull/16391

The attack is so simple we even use it in nixpkgs(!): https://github.com/NixOS/nixpkgs/blame/5b61d977012f10f8a54d9f42124a833cbcd4fbde/pkgs/applications/networking/browsers/chromium/update.nix#L103-L106

md5 will have to go in 17.03

$ git grep "md5 ="|grep -v libreoffice | grep -v python-|grep -v steam|wc -l
114

Filed a bug to libreoffice to move away from md5: https://bugs.documentfoundation.org/show_bug.cgi?id=102163

cc @garbas for

pkgs/development/tools/pypi2nix/deps.nix:    md5 = pipWhlHash;
pkgs/development/tools/pypi2nix/deps.nix:    md5 = setuptoolsWhlHash;
pkgs/development/tools/pypi2nix/deps.nix:    md5 = pipHash;
pkgs/development/tools/pypi2nix/deps.nix:    md5 = setuptoolsHash;
pkgs/development/tools/pypi2nix/deps.nix:    md5 = zcbuildoutHash;
pkgs/development/tools/pypi2nix/deps.nix:    md5 = zcrecipeeggHash;
pkgs/development/tools/pypi2nix/deps.nix:    md5 = buildoutrequirementsHash;
pkgs/development/tools/pypi2nix/deps.nix:    md5 = wheelHash;
pkgs/development/tools/pypi2nix/deps.nix:    md5 = clickHash;
pkgs/development/tools/pypi2nix/deps.nix:  #   md5 = sixHash;
pkgs/development/tools/pypi2nix/deps.nix:  #   md5 = attrsHash;
pkgs/development/tools/pypi2nix/deps.nix:  #   md5 = effectHash;
pkgs/development/tools/pypi2nix/deps.nix:    md5 = requestsHash;

@domenkozar will fix this with next release of pypi2nix. should be out there shortly after this weekend.

Thanks to @chris-martin for #18802, LibreOffice inner dependencies are no longer a problem.

$ git grep "md5 ="| wc -l
91

@domenkozar Just in case: LibreOffice will keep having md5 = in the expression, just because LO build system assumes some tarballs include MD5 hashes in their names and some don't, so it is simpler just to create both symlinks for all of the tarballs. MD5 hashes are only used for verification during the package update process, though.

@7c6f434c roger.

$ git grep "md5 ="|grep -v libreoffice|wc -l
75

I apply for more exemptions.

  1. I have fixed the only non-obsolete voluntary mention of md5 in documentation (beam uses MD5 in the output independently of our uses). Should we just ignore the mentions of md5 in doc/old/cross.txt and grep inside pkgs/?
  2. lispPackages has md5 package. The same happens with Chicken Scheme egg list. Also, Haskell has apache-md5.
  3. Is it a good idea to start printing deprecation warnings from fetchurl when md5 is used? Maybe fetchgit should just drop md5 support and gain sha512 support instead…

Just a notice for everyone: #19334 is a PR to issue trace deprecation warnings whenever md5 is used in NixPkgs in fetch*. I will merge it myself next week unless there are objections or suggestions of large changes.

@domenkozar md5 -> sha256 fix for pypi2nix is already in master

Deprecation logic is also in master, now we just need changelog and fix for ~65 packages and this is done.

And to have the reminder everywhere — when MD5 is gone, the exception in make-tarball.nix (MD5 warnings are currently allowed, any other trace output breaks Hydra evaluation) should be removed.

Remaining:

pkgs/applications/office/libreoffice/still.nix:    third_party = [ (let md5 = "185d60944ea767075d27247c3162b3bc"; in fetchurl rec {
pkgs/build-support/vm/windows/cygwin-iso/mkclosure.py:        url, size, md5 = install_line.split(' ', 2)
pkgs/build-support/vm/windows/cygwin-iso/mkclosure.py:            '    md5 = "{0}";'.format(md5),
pkgs/development/haskell-modules/configuration-common.nix:  apache-md5 = dontCheck super.apache-md5;              # http://hydra.cryp.to/build/498709/nixlog/1/raw
pkgs/development/haskell-modules/configuration-hackage2nix.yaml:  - cryptohash-md5 ==0.11.100.1
pkgs/development/lisp-modules/lisp-packages.nix:  md5 = buildLispPackage rec {
pkgs/development/tools/egg2nix/chicken-eggs.nix:  md5 = eggDerivation {

This is just a regular reminder that libreoffice-still uses md5 for a content addressed URL (there is a sha256 of the content in the expression, too, and MD5 hash is a part of the URL), and Haskell, Common Lisp and Chicken find it convenient to call packages implementing MD5 md5.

As for cygwin-iso, @aszlig is it supposed to be still usable?

I have done something about it, as the new setup.ini uses SHA512 anyway, but I cannot build it and I am not sure if I am calling it right. I am sure my changes are an improvement, though.

This means that all remaining MD5 mentions in master are false positives.

We still need to implement deprecation (reverting https://github.com/NixOS/nixpkgs/commit/2ca883338389b7ab995924a0cab0211993bdf1da)

Well, technically you want to revert the removal of warnings.

I think 2ca8833 has been effectively reverted in e3a873479eee4e19a852011d45b0fb653f6c9e89

Is the plan now to apply 717ff85b14074c1a01b6ce7275e6dc727094a05a once more?

@domenkozar do you have better ideas?
@edolstra would you object to such a plan?

What if we just deleted support for specifying md5 =?

I think deprecation is supposed to annoy people with out-of-tree Nix packages to finally update with some time window, instead of abrupt disappearance of functionality?

(I do not have any personal interest in NixPkgs fetch* MD5 support, though)

We're going to need a new one of these for sha1...

So signing git commits is useless now? EDIT: I just started doing so a couple months ago...

Not fully useless (it's collision not preimage), and the attack is still pretty expensive, but I expect cost to go down significantly soon given how much relies on SHA1 nowadays.

Well, actually when we sign git commits, we certify rather just the changes and not the whole history. (Noone's verifying signatures on ancestors AFAIK.)

On 3fdd726b16cf39da25dd97d30631dc129e33fb0f

Total count of "md5 =" in nixpkgs is huge.
~
$ git grep "md5 ="|wc -l
225
~

Libreoffice is top 1 project using md5.

~
$ git grep "md5 ="|grep libreoffice|wc -l
221
~

Projects without libreoffice

~~~
$ git grep "md5 =" |grep -v libreoffice|wc -l
4

$ git grep "md5 =" |grep -v libreoffice
pkgs/development/haskell-modules/configuration-common.nix: apache-md5 = dontCheck super.apache-md5; # http://hydra.cryp.to/build/498709/nixlog/1/raw
pkgs/development/haskell-modules/configuration-hackage2nix.yaml: - cryptohash-md5 ==0.11.100.1
pkgs/development/lisp-modules/lisp-packages.nix: md5 = buildLispPackage rec {
pkgs/development/tools/egg2nix/chicken-eggs.nix: md5 = eggDerivation {
~~~

None of them is a md5 hash. Good !

Should an issue for sha1 be opened ?

~~~
$ git grep "sha1 ="|wc -l
8839

$ git grep "sha256 ="|wc -l
39903
~~~

Node-packages are top 1 on sha1 ( https://github.com/svanderburg/node2nix/issues/39 )
~
$ git grep "sha1 ="|grep node-packages|wc -l
7562
~

Libreoffice uses sha256 to check the download:

(map (x : ((fetchurl {inherit (x) url sha256 name;}) // {inherit (x) md5name md5;})) (import ./libreoffice-srcs.nix));

Example entry:

  {
    name = "libabw-0.1.1.tar.bz2";
    url = "http://dev-www.libreoffice.org/src/libabw-0.1.1.tar.bz2";
    sha256 = "7a3d3415cf82ab9894f601d1b3057c4615060304d5279efdec6275e01b96a199";
    md5 = "7a3815b506d064313ba309617b6f5a0b";
    md5name = "7a3815b506d064313ba309617b6f5a0b-libabw-0.1.1.tar.bz2";
  }

I don't see any remaining packages using md5 so I'd propose removing the support in the fetchers on master, fixing this in projects outside of nixpkgs is easy and shouldn't be used anyway.

This bit me with Curseforge, where I'm generating nix code by scraping their download pages. They only provide MD5, and I don't want to download the files themselves; it'd take far too much bandwidth, and might get me banned.

Could you reconsider disabling md5, or maybe gate it behind a __YES_I_REALLY_WANT_THIS type flag? The only alternatives I can see are to fork fetchurl, or to disable hash checking entirely, and I'd prefer to do neither. I've tried asking them to provide a better hash, but it doesn't seem like that will happen anytime soon, especially as the service has no official API.

@Baughn Nix's builtin fetchurl still supports md5, so you may be able to do:

import <nix/fetchurl.nix> { url = ...; md5 = "7f8b9cc5953a21f2e0fa58547db7bb6c"; }
Was this page helpful?
0 / 5 - 0 ratings

Related issues

teto picture teto  Â·  3Comments

yawnt picture yawnt  Â·  3Comments

copumpkin picture copumpkin  Â·  3Comments

matthiasbeyer picture matthiasbeyer  Â·  3Comments

ob7 picture ob7  Â·  3Comments