We use the BucketDeployment construct to deploy the static assets for a website that is then distributed via CloudFront.
While using it, we have hit a couple of different transient errors that block deployments. Each failure has been one of the following (with ABC and XYZ standing in for file names we are trying to deploy):
- File name in directory ABC and header XYZ differ
- Bad CRC-32 for file ABC

Once these occurred, retrying the same deployment never worked; the only thing that has seemed to fix it is making a code change to the resources in the deployment and then doing an entirely new deployment.
Based on the logs of the Lambda that runs this construct (see below), I was able to trace the error to this line of code.
Based on a quick search of those errors, this seems to be a relatively common occurrence in Python's zipfile module:
- File name in directory ABC and header XYZ differ
- Bad CRC-32 for file ABC

I'm not exactly sure what the remedy for this would be given it's been transient and hard to reproduce, but we've run into this issue enough times that I figured it's worth surfacing in case others run into it as well.
Some of the StackOverflow posts ended up avoiding the extractall method and implementing their own extraction instead, but I'm not sure those all apply in this case.
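The gist of those workarounds, as I understand them, is to skip extractall and pull the entries out one at a time, so the failing entry can at least be pinpointed. A rough sketch (the function name is mine, not from the posts):

import os
import shutil
import zipfile

def extract_entries(zip_path, dest_dir):
    # Extract each entry individually instead of calling ZipFile.extractall,
    # so the entry that trips the name/CRC checks can be identified.
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            target = os.path.join(dest_dir, info.filename)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            try:
                with zf.open(info) as src, open(target, 'wb') as dst:
                    shutil.copyfileobj(src, dst)
            except zipfile.BadZipFile as err:
                # Report the offending entry instead of aborting mid-archive.
                print(f'{info.filename}: {err}')

Note that ZipFile.open runs the same header-name and CRC validation as extractall, so this narrows the failure down to a specific entry rather than bypassing it.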
Here is an example of the logs from one class of failed deployment I've seen (File name in directory ABC and header XYZ differ).
[ERROR] 2020-03-12T05:09:00.129Z 8339cb69-b6bc-4a9d-8a4e-7b611b7e242e File name in directory 'static/js/2.7eee25fe.chunk.js.LICENSE' and header b'static/js/2.7eee25fe.chunk.js.LIC\xc4\xbdoS' differ.
Traceback (most recent call last):
  File "/var/task/index.py", line 101, in handler
    s3_deploy(s3_source_zips, s3_dest, user_metadata, system_metadata)
  File "/var/task/index.py", line 131, in s3_deploy
    zip.extractall(contents_dir)
  File "/var/lang/lib/python3.6/zipfile.py", line 1524, in extractall
    self._extract_member(zipinfo, path, pwd)
  File "/var/lang/lib/python3.6/zipfile.py", line 1577, in _extract_member
    with self.open(member, pwd=pwd) as source, \
  File "/var/lang/lib/python3.6/zipfile.py", line 1419, in open
    % (zinfo.orig_filename, fname))
Here is an example of the logs from the other class of failed deployment I've seen (Bad CRC-32 for file ABC).
[ERROR] 2020-03-21T21:24:12.561Z 97098b50-04dc-4bf4-ab43-b89873817924 Bad CRC-32 for file 'static/js/runtime-main.664eb0cb.js'
Traceback (most recent call last):
  File "/var/task/index.py", line 101, in handler
    s3_deploy(s3_source_zips, s3_dest, user_metadata, system_metadata)
  File "/var/task/index.py", line 131, in s3_deploy
    zip.extractall(contents_dir)
  File "/var/lang/lib/python3.6/zipfile.py", line 1524, in extractall
    self._extract_member(zipinfo, path, pwd)
  File "/var/lang/lib/python3.6/zipfile.py", line 1579, in _extract_member
    shutil.copyfileobj(source, target)
  File "/var/lang/lib/python3.6/shutil.py", line 79, in copyfileobj
    buf = fsrc.read(length)
  File "/var/lang/lib/python3.6/zipfile.py", line 872, in read
    data = self._read1(n)
  File "/var/lang/lib/python3.6/zipfile.py", line 962, in _read1
    self._update_crc(data)
  File "/var/lang/lib/python3.6/zipfile.py", line 890, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'static/js/runtime-main.664eb0cb.js'
I can dig up more examples if need be.
Version: 1.23.0

This is a :bug: Bug Report.
Hi @rhermes62, thanks for reporting this.
I wonder, what is the size of the directory you're trying to deploy?
Looks like the uploaded asset is somehow corrupted. Since you say it's sporadic, I'm leaning towards attributing this to network glitches, most likely during upload, since the bucket deployment Lambda's traffic stays entirely inside the Amazon network.
I am going to dig a bit deeper here though.
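In the meantime, one way to test that hypothesis is to pull the staged asset back out of the staging bucket and CRC-check it locally. A rough sketch (the bucket name and key here are placeholders, not the real staging values):

import zipfile
import boto3

# Placeholders: substitute the actual CDK staging bucket and asset key
# from the failing deployment.
s3 = boto3.client('s3')
s3.download_file('my-cdk-staging-bucket', 'asset.<sha>.zip', '/tmp/asset.zip')

with zipfile.ZipFile('/tmp/asset.zip') as zf:
    # testzip() returns the first entry that fails its checks, or None.
    print(zf.testzip() or 'all entries OK')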
We have observed the same issue. Is there any update on it regarding a resolution and/or workaround?
I wonder, what is the size of the directory you're trying to deploy?
It is 1.9 MB (it is the contents of a static website). Although it's possible the static assets get corrupted, I doubt that's the case. The assets come from a CI/CD system, and I get the errors even after retrying multiple times. I also manually checked the assets, and they are as we expect.
After having dealt with this more, I have a hunch that BucketDeployment doesn't handle the case where the same assets are re-deployed. In my original post, I had commented this:
...the only thing that has seemed to fix it is making a code change to the resources in the deployment and then doing an entirely new deployment.
I have found that just making manual code changes in the assets, which triggers a rename, has fixed the deployments. We should probably test whether we can get something reproducible based on this trend I've found.
This all might be a red herring, but I briefly chatted with @ayush987goyal and his setup is deploying files with the same names, which adds to the "same file name bug" theory.
@rhermes62 I can confirm, updating asset file names by making any code change ensures a successful deployment. Any other change triggering the deployment with the same assets usually fails.
Whelp, with three of us now having issues with essentially "redeployments" of the same file names, we need to figure out why this line breaks when dealing with files of the same name and, better yet, how to fix it.
One workaround I found was to create the zip ourselves and provide that as an Asset, instead of providing a path to a directory and letting CDK create the zip. Not exactly sure why doing this is different from the default behavior.
@ayush987goyal with that workaround, do you have to manually rename the zip file each time, or does the construct work with the same name each time? (I am also going to try that so we don't get blocked deployments.)
Seeing the same issue, more than 50% of the time.
To be more specific, I have a construct that uploads a directory to S3. I ran cdk synth on the directory a handful of times, and the asset.<sha>.zip file produced in cdk.out/ is corrupt about 50% of the time. Same zipfile name, but actually different binaries. I can see the mismatching "local" filename error message coming directly from unzip on my Mac.
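For what it's worth, a loop like the following can quantify that failure rate by re-synthesizing from scratch and CRC-checking every asset zip produced (this assumes the default cdk.out output directory):

import glob
import shutil
import subprocess
import zipfile

for attempt in range(10):
    # Start from a clean slate so the asset zip is actually rebuilt.
    shutil.rmtree('cdk.out', ignore_errors=True)
    subprocess.run(['cdk', 'synth'], check=True, stdout=subprocess.DEVNULL)
    for path in glob.glob('cdk.out/asset.*.zip'):
        with zipfile.ZipFile(path) as zf:
            bad = zf.testzip()  # first entry failing its checks, or None
        if bad:
            print(f'attempt {attempt}: {path} fails at {bad}')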
@rhermes62 No I am not renaming the zip. I am creating the zip like so and passing it to BucketDeployment:
import * as path from 'path';
import * as childProcess from 'child_process';
import * as s3deploy from '@aws-cdk/aws-s3-deployment';

// Zip the source directory ourselves rather than letting CDK do it.
const sourcePath = path.join(__dirname, 'path-to-source-dir');
const destinationZip = path.join(__dirname, 'dummyDir/temp.zip');
childProcess.execSync(`zip -r ${destinationZip} *`, { cwd: sourcePath });

new s3deploy.BucketDeployment(this, 'some-name', {
  sources: [s3deploy.Source.asset(destinationZip)],
  destinationBucket: destBucket, // an s3.IBucket defined elsewhere in the stack
});
Looks like it can fail on the first deployment as well. I did a clean deployment in a new bucket and a new AWS account altogether and still got the "File name in directory ABC and header XYZ differ" error.