Conan: [bug] conan tries to extract broken downloads

Created on 1 Mar 2021  ·  5 Comments  ·  Source: conan-io/conan

When you press CTRL + C while conan downloads the prebuilt archives of a package, the subsequent execution of conan install tries to extract those half-downloaded archives from the download_cache. This ONLY happens when you use a download_cache.

2 possible ideas:

  • add a hash to check if the downloaded archives are not corrupt
  • restart the download when the extraction fails
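
A minimal sketch of both ideas combined, in plain Python rather than Conan internals. The names `fetch_and_extract`, `sha256_of`, `CorruptArchive` and the `fetch` callback are hypothetical and only illustrate the control flow: verify a checksum when one is known, and drop the cached file and download again when verification or extraction fails.

```python
import hashlib
import shutil
import tarfile
from pathlib import Path
from typing import Callable, Optional


class CorruptArchive(Exception):
    """Raised when a cached archive does not match its expected checksum."""


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_and_extract(cached: Path, dest: Path, fetch: Callable[[Path], None],
                      expected_sha256: Optional[str] = None,
                      retries: int = 1) -> None:
    """Extract `cached` into `dest`, re-downloading if the archive is broken.

    `fetch` is a hypothetical callback that writes the archive to the given path.
    """
    # Use the safer "data" extraction filter where this Python version has it.
    extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    for attempt in range(retries + 1):
        if not cached.exists():
            fetch(cached)
        try:
            # Idea 1: compare against a known hash before touching the archive.
            if expected_sha256 is not None and sha256_of(cached) != expected_sha256:
                raise CorruptArchive("checksum mismatch: %s" % cached)
            with tarfile.open(cached, "r:gz") as tar:
                tar.extractall(dest, **extract_kwargs)
            return
        except (CorruptArchive, EOFError, OSError, tarfile.TarError):
            # Idea 2: discard the broken download and the partial extraction,
            # then loop to fetch again (or give up after the last attempt).
            cached.unlink(missing_ok=True)
            shutil.rmtree(dest, ignore_errors=True)
            if attempt == retries:
                raise
```

The retry alone already covers the case where no checksum is available, at the cost of detecting the problem only at extraction time.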

Environment Details (include every applicable attribute)

  • Operating System+version: Ubuntu 18.04 WSL2
  • Compiler+version: gcc 7.4
  • Conan version: 1.33.1
  • Python version: 3.6.9

relevant conan settings: download_cache = /tmp/conan_download_cache

Steps to reproduce (Include if Applicable)

It's easy to replicate with a "big" package, such as opencv, that is available as a prebuilt binary on the remote of your choice. If none is available, build your own package and upload it to your own conan repo.

  1. conan install opencv/4.5.0@ and hit CTRL + C when conan says: Downloading 834c114e736468c70cf2b5d0781c6c8db5787764b30d718ac40d45e51ec34c0d: 23%|##2 | 2.64M/11.7M
  2. rerun conan install opencv/4.5.0@ and see:
Downloading binary packages in 8 parallel threads
opencv/4.5.0: Retrieving package 05ae15556d4c5439324a478fdaea5d0109fda60e from remote 'luminar' 
Decompressing conan_package.tgz:  61%|######    | 3.15M/5.18M [00:00<00:00, 33.1MB/s]opencv/4.5.0: ERROR: Exception while getting package: 05ae15556d4c5439324a478fdaea5d0109fda60e
opencv/4.5.0: ERROR: Exception: <class 'conans.errors.ConanException'> Error while downloading/extracting files to /home/fberchtold/.conan/data/opencv/4.5.0/_/_/package/05ae15556d4c5439324a478fdaea5d0109fda60e
Compressed file ended before the end-of-stream marker was reached
Folder removed
ERROR: Error while downloading/extracting files to /home/fberchtold/.conan/data/opencv/4.5.0/_/_/package/05ae15556d4c5439324a478fdaea5d0109fda60e
Compressed file ended before the end-of-stream marker was reached
Folder removed
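
The "Compressed file ended before the end-of-stream marker was reached" text matches the EOFError that Python's gzip module raises on a truncated stream. A small standalone sketch, independent of Conan, that simulates an interrupted download by cutting a gzip blob in half (the helper name `try_decompress` is made up for this example):

```python
import gzip
import io

payload = bytes(range(256)) * 4096           # 1 MiB of sample data
complete = gzip.compress(payload)
truncated = complete[: len(complete) // 2]   # simulate an interrupted download


def try_decompress(blob: bytes) -> str:
    """Return "ok" if the blob decompresses fully, else describe the EOFError."""
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(blob)) as fh:
            fh.read()
        return "ok"
    except EOFError as exc:
        return "EOFError: %s" % exc


print(try_decompress(complete))
print(try_decompress(truncated))
```

The second call fails only once the reader runs out of input, which is consistent with the progress bar above reaching 61% before the error shows up.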

All 5 comments

I would definitely consider this a bug.

Hi @blackliner

Yes, this is a bit surprising. The implementation already contains a checksum:

        with self._lock(h):
            cached_path = os.path.join(self._cache_folder, h)
            if not os.path.exists(cached_path):
                self._file_downloader.download(url=url, file_path=cached_path, md5=md5,
                                               sha1=sha1, sha256=sha256, **kwargs)
            else:
                # specific check for corrupted cached files, will raise, but do nothing more
                # user can report it or "rm -rf cache_folder/path/to/file"
                try:
                    check_checksum(cached_path, md5, sha1, sha256)
                except ConanException as e:
                    raise ConanException("%s\nCached downloaded file corrupted: %s"
                                         % (str(e), cached_path))

The check_checksum() call should at least give a nicer error message, and not wait until extraction time. I am not sure why this doesn't raise; I need to check.

@memsharded any updates on this issue? Is it scheduled to be fixed?

I added some print statements, and it looks like the method is called without any md5, sha1 or sha256:

Running checksum with:
{'md5': None, 'sha1': None, 'sha256': None}
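
To illustrate why such a call cannot catch a truncated file, here is a hypothetical stand-in for a checksum helper. It is not Conan's check_checksum(); `verify_checksums` and `ChecksumError` are invented names, and the assumption is only that a helper of this shape skips every digest that is None:

```python
import hashlib
from pathlib import Path
from typing import Optional


class ChecksumError(Exception):
    """Raised when a file does not match an expected digest."""


def verify_checksums(path: Path, md5: Optional[str] = None,
                     sha1: Optional[str] = None,
                     sha256: Optional[str] = None) -> int:
    """Verify the digests that were provided; return how many were checked."""
    expected = {"md5": md5, "sha1": sha1, "sha256": sha256}
    data = Path(path).read_bytes()
    checked = 0
    for algorithm, want in expected.items():
        if want is None:
            continue  # nothing to compare against, so this digest is skipped
        got = hashlib.new(algorithm, data).hexdigest()
        if got != want:
            raise ChecksumError("%s mismatch for %s" % (algorithm, path))
        checked += 1
    return checked
```

With all three arguments set to None the function returns 0 without raising, so a half-downloaded file passes unnoticed and the failure only surfaces later, during extraction.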

There is a note in that same file (conans/client/downloaders/cached_file_downloader.py) at the _get_hash method:

For Api V2, the cached downloads always have recipe and package REVISIONS in the URL,
making them immutable, and perfect for cached downloads of artifacts. For V2 checksum
will always be None.
For ApiV1, the checksum is obtained from the server via "get_snapshot()" methods, but
the URL in the apiV1 contains the signature=xxx for signed urls, but that can change,
so better strip it from the URL before the hash
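
A rough sketch of what that note describes, under my own assumptions: the cache key is a SHA-256 over the URL with the signature query parameter removed, plus the checksum when there is one. The function names and example URLs are hypothetical and this is not the actual _get_hash code:

```python
import hashlib
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_signature(url: str) -> str:
    """Drop the volatile `signature` query parameter, keep everything else."""
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "signature"]
    return urlunsplit(parts._replace(query=urlencode(query)))


def cache_key(url: str, checksum: Optional[str] = None) -> str:
    """Hash the stable part of the URL, plus the checksum if one is known."""
    digest = hashlib.sha256(strip_signature(url).encode("utf-8"))
    if checksum:
        digest.update(checksum.encode("utf-8"))
    return digest.hexdigest()


# Two signed URLs for the same artifact map to the same cache entry.
a = cache_key("https://example.com/pkg/conan_package.tgz?signature=abc")
b = cache_key("https://example.com/pkg/conan_package.tgz?signature=xyz")
print(a == b)
```

When the checksum is None, as with revisions, the key depends on the URL alone, so nothing in the key or the cache entry records whether the download actually completed.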

Hi @blackliner

Yes, I have realized too that when using revisions, the checksum is simply not there for Conan cached artifacts, as it relies on the revision. I am working on a fix using the "dirty" functionality we are using in other places.
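
For readers unfamiliar with the idea, a generic "dirty marker" pattern could look like the sketch below. This is only an illustration of the concept, not the code from the eventual fix; the `.dirty` suffix, `download_to_cache` and the `fetch` callback are assumptions made for the example:

```python
from pathlib import Path
from typing import Callable

DIRTY_SUFFIX = ".dirty"  # hypothetical marker file name suffix


def download_to_cache(cached: Path, fetch: Callable[[Path], None]) -> Path:
    """Download into the cache, using a marker file to detect interruptions."""
    marker = cached.with_name(cached.name + DIRTY_SUFFIX)
    if marker.exists():
        # A previous download never finished: discard whatever it left behind.
        cached.unlink(missing_ok=True)
    if not cached.exists():
        marker.touch()   # flag the entry as incomplete before writing any data
        fetch(cached)    # if this is interrupted, the marker stays on disk
        marker.unlink()  # only reached after a complete download
    return cached
```

The marker needs no checksum from the server, which is why it fits the revisions case where md5, sha1 and sha256 are all None.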

Fixed in https://github.com/conan-io/conan/pull/8664, will be released in 1.35
