Conan: [bug] Performance of unzip

Created on 12 Mar 2020 · 7 Comments · Source: conan-io/conan

Environment Details (include every applicable attribute)

  • Operating System+version: Irrelevant
  • Compiler+version: Irrelevant
  • Conan version: 1.22.0
  • Python version: 2.7 and 3.7

Steps to reproduce (Include if Applicable)

Try building this recipe: https://github.com/kmaragon/conan-aws-sdk-cpp/blob/master/conanfile.py but set no_copy_source to true

Notice that unzipping the aws-archive is suuuuper slow.

triaging

All 7 comments

I guess that you are running on Windows. On Linux, performance is reasonable, so the OS is not irrelevant. Could you please confirm?

Please also update to the latest patch version, in this case 1.22.3. Try to avoid older patches of any major.minor release.

Just to add some details: the .tar.gz file is about 22 MB and unpacks to 351 MB on my disk, spread over 42,826 files. The compression ratio for this one is really high, and unzipping is costly, especially on Windows (no surprise here).

All Conan does for a tar.gz file is:

    with tarfile.TarFile.open(filename, 'r:*') as tarredgzippedFile:
        if not pattern:
            tarredgzippedFile.extractall(destination)

Yes, it is slow, but the file is challenging; that is the best the Python standard library can do to unzip that file.
I have also run

tar -xvf <downloaded file>

And it is very slow too; it looks as if the terminal has hung.

So it seems this is not a bug, just that the .tar.gz of that aws-sdk package is challenging and slow to unzip.
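To check figures like the ones above (file count, unpacked size, extraction time) on another archive, something along these lines may help. It is only a rough sketch for Python 3: the helper names `archive_stats` and `timed_extract` are made up for illustration and are not part of Conan, and it uses the same `tarfile` calls as the snippet quoted earlier.

```python
import tarfile
import time


def archive_stats(filename):
    """Return (file_count, uncompressed_bytes) without extracting anything."""
    file_count = 0
    total_bytes = 0
    # 'r:*' lets tarfile detect the compression (gzip, bz2, xz) by itself.
    with tarfile.open(filename, "r:*") as archive:
        for member in archive:
            if member.isfile():
                file_count += 1
                total_bytes += member.size
    return file_count, total_bytes


def timed_extract(filename, destination):
    """Extract the whole archive and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    with tarfile.open(filename, "r:*") as archive:
        # Only do this with archives you trust: extractall writes whatever
        # paths the archive contains.
        archive.extractall(destination)
    return time.perf_counter() - start


# Hypothetical usage; the archive path is a placeholder:
# count, size = archive_stats("downloaded.tar.gz")
# seconds = timed_extract("downloaded.tar.gz", "unpacked")
```

Running both on the same machine separates "the archive is large and has many small files" from "the extraction code is slow", which is the distinction the comment above is drawing.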

I guess sometimes Git is better than unzip, on top of being less complex to use

> I guess sometimes Git is better than unzip, on top of being less complex to use

Yes, that is totally possible. Users can use whatever is best for their specific use case. In this case, if that particular recipe could be improved, then there are 2 possible lines of action:

  • Report in https://github.com/kmaragon/conan-aws-sdk-cpp, and suggest there to change the way to retrieve the sources
  • If that recipe is being contributed to ConanCenter, ask for this in https://github.com/conan-io/conan-center-index. The request might not be accepted, because in ConanCenter recipes consistency with other recipes might matter more than performance, but if performance is that bad, the community there may support this approach for this one.

Thanks for the feedback!

What I decided for now was to write my own recipe that uses Git. I suppose I should not submit it to the conan index.

Thanks for taking the time to answer my question!

> What I decided for now was to write my own recipe that uses Git. I suppose I should not submit it to the conan index.

Oh yes, why not? If aws-sdk is not in ConanCenter yet, I would recommend submitting it. ConanCenter and the new conan-center-index repo are still in the Early Access Program (EAP), and we are still learning and working with the community to establish best practices. You can submit it using Git as the source, and it will be reviewed. Some of the automated checks may kick in and say that conandata.yml should be used, but that can be answered by showing and comparing the times of the alternatives. It might be accepted or not, but nothing is written in stone.
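For the timing comparison mentioned above, a small harness like the following could produce the numbers. It is a sketch under assumptions: it needs Python 3 and `git` on the PATH, the function names are invented for this example, and it is not how Conan or ConanCenter measure anything.

```python
import subprocess
import tarfile
import time


def extract_archive(filename, destination):
    """Unpack a (possibly compressed) tarball with the standard library."""
    with tarfile.open(filename, "r:*") as archive:
        archive.extractall(destination)


def shallow_clone(url, ref, destination):
    """Fetch a single tag or branch without history; requires git on PATH."""
    subprocess.run(
        ["git", "clone", "--depth", "1", "--branch", ref, url, destination],
        check=True,
    )


def compare(strategies):
    """Run each zero-argument callable once and return {name: seconds}."""
    timings = {}
    for name, fetch in strategies.items():
        start = time.perf_counter()
        fetch()
        timings[name] = time.perf_counter() - start
    return timings


# Hypothetical usage; the archive name, URL and tag are placeholders:
# results = compare({
#     "tarball": lambda: extract_archive("<archive>.tar.gz", "from_tarball"),
#     "git": lambda: shallow_clone("<repo-url>", "<tag>", "from_git"),
# })
```

Note that the tarball timing here covers extraction only, while the clone timing includes the network transfer, so a fair comparison would add the download time of the archive to the first figure.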

Oh, then I'll consider doing it when I get some time!
