Harbor: Quota of dockerhub is still used in v2.1.1 after the image is cached.

Created on 21 Nov 2020 · 6 Comments · Source: goharbor/harbor

Reported by Remoco Breekveldt in the Slack channel:

  1. Harbor Version v2.1.1-5f52168e with Proxy cache configured on example.com/docker-hub
  2. Enter the Harbor Core pod and check the remaining rate limit with the following commands:
export HEADER='Authorization: Bearer'
export TOKEN=$(curl "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | sed 's/{"token":"//g' | sed 's/","access_token":.*//g') 
curl --head -H "$HEADER $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest 2>&1

Remaining-limit is showing 250

  3. From a local machine, pull busybox in a loop using the following command (note: the rate limit is 250 today):
    for i in {1..255}; do docker pull example.com/docker-hub/library/busybox; done
  4. Enter the Harbor Core pod and check the remaining rate limit again with the same commands:
export HEADER='Authorization: Bearer'
export TOKEN=$(curl "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | sed 's/{"token":"//g' | sed 's/","access_token":.*//g')
curl --head -H "$HEADER $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest 2>&1
  5. We noticed that we had consumed our entire rate limit; the remaining limit is actually 'minus' 5.
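
The before/after comparison in the steps above can be scripted rather than read off by eye. The following is a minimal sketch, assuming the response carries the `ratelimit-limit` and `ratelimit-remaining` headers in the `<count>;w=<window>` form that Docker Hub documents. The helper name `ratelimit_value` and the sample header values are illustrative and not taken from the report.

```shell
# Extract the numeric value of a rate-limit header from HTTP response
# headers read on stdin. Header names are matched case-insensitively and
# the ";w=<window>" suffix is dropped.
# ratelimit_value is a hypothetical helper name, not part of any tool.
ratelimit_value() {
  name="$1"   # e.g. ratelimit-limit or ratelimit-remaining
  tr -d '\r' | awk -F': *' -v n="$name" \
    'tolower($1) == n { sub(/;.*/, "", $2); print $2 }'
}

# Sample headers shaped like a registry response (values are illustrative).
sample_headers='HTTP/1.1 200 OK
ratelimit-limit: 250;w=21600
ratelimit-remaining: 245;w=21600'

before=250
after=$(printf '%s\n' "$sample_headers" | ratelimit_value ratelimit-remaining)
echo "pulls counted against the quota: $((before - after))"
```

In practice the output of the `curl --head` command from the steps would be piped into `ratelimit_value` once before and once after the pull loop.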
area/proxy-cache target/2.2.0

All 6 comments

I have the same problem.

The test case in the OP has a problem: busybox:latest is a manifest list, not a single manifest.

If you test with a container image, such as goharbor/harbor-core:v2.1.1, you will see that the quota is used only once.
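
To tell which case a given tag falls into, one option is to look at the media type the registry returns in the `Content-Type` header for the manifest request. The sketch below classifies the standard Docker and OCI media types. The function name `manifest_kind` is hypothetical and introduced only for illustration.

```shell
# Classify a manifest by its media type, i.e. the Content-Type of a
# /v2/<name>/manifests/<reference> response.
# manifest_kind is a hypothetical helper name.
manifest_kind() {
  case "$1" in
    application/vnd.docker.distribution.manifest.list.v2+json|application/vnd.oci.image.index.v1+json)
      echo "manifest list" ;;
    application/vnd.docker.distribution.manifest.v2+json|application/vnd.oci.image.manifest.v1+json)
      echo "manifest" ;;
    *)
      echo "unknown" ;;
  esac
}

manifest_kind "application/vnd.docker.distribution.manifest.list.v2+json"
manifest_kind "application/vnd.docker.distribution.manifest.v2+json"
```

A multi-architecture tag such as busybox:latest would be expected to report one of the list/index types, while a single-platform image reports a plain manifest type.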

By design, the index.json in Harbor may differ from the one in the remote registry, because the proxy cache may proxy only part of the manifest list. So for manifest lists, the proxy cache may end up pulling index.json again and again.

To fix this issue, we may need to pull all the artifacts referenced by a manifest list into Harbor, even if the user only wants one image in the index. This puts us in a dilemma: caching every item of a manifest list requires Harbor to pull more manifests than necessary, which may drain the quota faster.

Unfortunately, everything on Docker Hub is a manifest list. Caching everything is not a robust solution, as storage would blow up far too fast. What we need to do is compare the digests of the matching OS image only; whether ARM redis has been updated is irrelevant to me if I'm on centos.

Thanks to the users who brought this to our attention. Investigating a fix now.

We'll try to cache the complete manifest list in another place (probably Redis) and compare the digest with the complete manifest list in the UseLocal method. Meanwhile, we need to make sure the current flow that writes the manifest list (index.json) remains unchanged, because index.json is used by other artifacts such as CNAB.
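
The digest comparison described in this comment can be illustrated with a short sketch. It relies on the fact that a registry digest is the SHA-256 of the manifest's raw bytes, written as `sha256:<hex>`. This is not Harbor's UseLocal implementation: the helper names `content_digest` and `cache_is_fresh`, the temporary file standing in for the cached copy, and the use of `sha256sum` (GNU coreutils) are all assumptions made for illustration.

```shell
# Compute a registry-style digest ("sha256:<hex>") of the bytes on stdin.
# Assumes sha256sum from GNU coreutils is available.
content_digest() {
  printf 'sha256:%s\n' "$(sha256sum | cut -d' ' -f1)"
}

# Hypothetical decision helper: succeeds (exit 0) when the cached copy of
# the complete manifest list still matches the digest the remote reports.
cache_is_fresh() {
  cached_file="$1"
  remote_digest="$2"
  [ "$(content_digest < "$cached_file")" = "$remote_digest" ]
}

# Usage with a stand-in manifest list body.
tmp=$(mktemp)
printf '%s' '{"schemaVersion":2,"manifests":[]}' > "$tmp"
remote=$(content_digest < "$tmp")   # pretend the remote reported this digest
if cache_is_fresh "$tmp" "$remote"; then
  echo "serve from cache"
else
  echo "re-fetch manifest list"
fi
rm -f "$tmp"
```

Against a real registry, the remote digest would typically come from the `Docker-Content-Digest` response header rather than being computed locally.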

I want to highlight that making this work for Docker images, without having to store unnecessary images, is a higher priority than supporting CNAB.

This will be fixed in the v2.2 release, without having to cache non-matching OS images.
