Azure-docs: AKS-ACR unauthorized: authentication required

Created on 18 Jul 2019  ·  38 Comments  ·  Source: MicrosoftDocs/azure-docs

I'm having issues with a pod in AKS pulling an image from ACR.
I tried the steps documented here https://docs.microsoft.com/en-us/azure/container-registry/container-registry-auth-aks#grant-aks-access-to-acr but no luck.

The error I received is:

_Failed to pull image "test.azurecr.io/q/p:01": rpc error: code = Unknown desc = Error response from daemon: Get https://test.azurecr.io/v2/q/p/manifests/01: unauthorized: authentication required_

I'm on AKS 1.14.0 and trying this on a Windows node pool.

What am I missing here?


Pri3 container-servicsvc cxp in-progress product-question triaged

Most helpful comment

OK. Found my error: The service principal had expired 🤦‍♂ (I wasn't aware/paying attention of the one year expiration time). Now that I know, I can also see it from the logs on the AKS resource in the Azure Portal.

To fix it, I followed the instructions here https://docs.microsoft.com/en-us/azure/aks/update-credentials

Sorry about the noise, and hope this may help someone else landing on this page.

All 38 comments

Thanks for the feedback! We are currently investigating and will update you shortly.

@sameer-kumar are you having any issues logging into your ACR or is it just with pulling images?

Hi @MicahMcKittrick-MSFT,

I'm having the same problem. This had been working without issues until today, and now it seems the service principal that's configured on our AKS has lost its authentication for some reason and can no longer pull images from our ACR.

I can log in and push images to the ACR, so our AKS's service principal seems to be the problem.

@MicahMcKittrick-MSFT same with me. I verified that I can log in to ACR using admin credentials as well as with custom SPN credentials from a Windows node in the same subnet as the AKS cluster. However, the AKS cluster SPN is unable to authenticate and hence can't pull images.

I have the same issue. I am using a service principal for authentication, and the error can be reproduced with the following script:

ACR_ID=$(az acr show -n $myacr -g $myresourcegroup --query "id" -o tsv)
registryPassword=$(az ad sp create-for-rbac -n $myacr-push --scopes $ACR_ID --role acrpush --query password -o tsv)
registryUsername=$(az ad sp show --id http://$myacr-push --query appId -o tsv)

docker login --username $registryUsername --password $registryPassword "$myacr.azurecr.io"

Gives the following response

Error response from daemon: Get https://$myacr.azurecr.io/v2/: unauthorized: authentication required

UPDATE: After running this about 5 times, it randomly seems to work. Another thing I noticed is that after creating an SP, it takes about 30-60s before it becomes active. So you can't use it immediately, for instance in a deployment script.
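A rough sketch of how a deployment script could wait out that propagation delay instead of failing on the first attempt. The `retry` helper and its argument order are made up for illustration; the attempt count and delay are guesses based on the 30-60s observed above, not documented values.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example (not executed here): allow up to about 2 minutes for the new SP
# to become usable before giving up.
# retry 12 10 docker login --username "$registryUsername" \
#   --password "$registryPassword" "$myacr.azurecr.io"
```

This only papers over the delay; if the login still fails after the last attempt, the cause is probably something else (wrong role, wrong scope, expired secret).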

Hey guys, managed to get this working. 1) Confirm your AKS SPN has the acrpull role on your ACR. 2) Confirm the repo/image really exists on your ACR. I had this error and the repo/image didn't exist.
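For anyone wanting to script those two checks, here is a minimal sketch. The function names are hypothetical, and the role check only looks at assignments made directly at the registry scope, so an inherited assignment (for example at resource group level) would not show up here.

```shell
# Hypothetical helpers; names and argument order are illustrative only.

# Succeeds if the principal holds the AcrPull role directly at the ACR scope.
has_acrpull() {
  # $1 = service principal client ID, $2 = ACR resource ID
  az role assignment list --assignee "$1" --scope "$2" \
    --query "[].roleDefinitionName" --output tsv | grep -qix 'AcrPull'
}

# Succeeds if <repository>:<tag> exists in the registry.
tag_exists() {
  # $1 = registry name, $2 = repository, $3 = tag
  az acr repository show-tags --name "$1" --repository "$2" --output tsv \
    | grep -qxF -- "$3"
}

# Example (not executed here), using the image from the original error:
# has_acrpull "$CLIENT_ID" "$ACR_ID" || echo "SPN is missing AcrPull"
# tag_exists test q/p 01 || echo "test.azurecr.io/q/p:01 does not exist"
```

Checking the tag first is probably the cheaper step, since a missing tag produces the same "unauthorized" message as a missing role.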

Thanks for the update all! Apologies for the delay in response. I was out for the weekend.

Appears the issue is no longer present. Is that correct?

Issue still exists. Same error with AKS spn. However, no error with custom spn.


For now I am not having any issues, however the bug in which you get an unauthorized response whenever your image doesn't exist is kind of annoying. It makes debugging things quite difficult.

I am not seeing any issues at this time.

For anyone that is, feel free to reach out to me via [email protected] and provide me with your Azure SubscriptionID and link to this issue. I can then enable you for a free support request to have this looked into further.

I'm able to access ACR from AKS if I do kubectl apply after following the guide, but if I do a kubectl set image to update the image, it returns unauthorized on the ACR pull, as mentioned above.

@MicahMcKittrick-MSFT any idea on it?

@sameer-kumar I am not seeing anything in this doc related to kubectl set image. Are you following another doc by chance?

The doc itself doesn't include set image. I just need to run the set image command to update the image, and I'm not sure why it doesn't work, since my Kubernetes cluster should already have access to pull from ACR.

What is the command you are using? It should be something like

kubectl set image deployment azure-vote-front azure-vote-front=<acrLoginServer>/azure-vote-front:v2

@MicahMcKittrick-MSFT Yes, this is exactly the command I used. I just want to update the docker image being deployed from say v1 to v2, but it says unauthorized. The first deploy with kubectl apply works well for me.

@sameer-kumar try this doc to see if you can do all the steps

https://docs.microsoft.com/en-us/azure/aks/tutorial-kubernetes-app-update

If not, feel free to email me at [email protected] as well with your SubscriptionID and link to this thread and I can get you in contact with Technical Support.

@MicahMcKittrick-MSFT that's exactly the doc I was referring to.

@sameer-kumar go ahead and shoot me that email with the details and we can get it sorted out :)

Hi,
do you have a solution for this issue? It is happening to me, too.
I'm using the AKS preview feature with a Windows node pool.
Using AKS 1.14.8 with a private Azure container registry,
the Kubernetes pod is not able to pull the image: "unauthorized: authentication required".

Please, if there is another thread to follow, could you point me to it?
Cheers.

Specify your Kubernetes secret name in the YAML configuration, e.g.:

apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
  - name: private-reg-container
    image:
  imagePullSecrets:
  - name: regcred
See reference URL:

https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/

In the example above, the imagePullSecrets name (regcred) is where you specify your container registry secret.
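The pod spec above assumes a secret named regcred already exists in the pod's namespace. A minimal sketch of creating it, assuming you authenticate with a service principal that has pull access to the registry (the wrapper function and its argument order are hypothetical):

```shell
# Hypothetical wrapper around kubectl; creates the "regcred" pull secret
# referenced by imagePullSecrets in the pod spec.
create_regcred() {
  # $1 = registry login server (e.g. test.azurecr.io)
  # $2 = service principal app ID, $3 = service principal password
  kubectl create secret docker-registry regcred \
    --docker-server="$1" \
    --docker-username="$2" \
    --docker-password="$3"
}

# Example (not executed here):
# create_regcred test.azurecr.io "$registryUsername" "$registryPassword"
```

Note that this is a separate authentication path from the cluster's own service principal, so it can work even when the AKS SPN cannot pull.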

Using docker pull <image/tag> works just fine after running az login and then sudo az acr login --name <az container reg name>, but if I use docker-compose (which is kind of needed for a multi-container app) I get this error.

So far this is only non-functional on our new testing/staging environment (an Azure hosted Ubuntu VM). It works fine from my Windows PC and other developers' workstations (Windows, macOS), and other environments (an AWS hosted Ubuntu VM that is the current testing/staging environment) can docker-compose pull images from the same container registry using the same docker-compose.yml file.

$> az login

{lists my subscriptions, the default subscription has the container registry in it}

$> sudo az acr login --name myconreg01

Login Succeeded

$> sudo docker-compose --verbose -f docker/my.docker-compose.yml pull

compose.config.config.find: Using configuration files: ./docker/my.docker-compose.yml
WARNING: compose.config.environment.__getitem__: The INSIGHTOPS_REGION variable is not set. Defaulting to a blank string.
WARNING: compose.config.environment.__getitem__: The INSIGHTOPS_TOKEN variable is not set. Defaulting to a blank string.
docker.auth.auth.load_config: Found 'auths' section
docker.auth.auth.parse_auth: Found entry (registry=u'https://index.docker.io/v1/', username=u'myUserName')
docker.auth.auth.parse_auth: Found entry (registry=u'myconreg01.azurecr.io', username=u'00000000-0000-0000-0000-000000000000')
docker.auth.auth.load_config: Found 'HttpHeaders' section
compose.cli.command.get_client: docker-compose version 1.8.0, build unknown
docker-py version: 1.9.0
CPython version: 2.7.12
OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
compose.cli.command.get_client: Docker base_url: http+docker://localunixsocket
compose.cli.command.get_client: Docker version: KernelVersion=4.15.0-1063-azure, Components=[{u'Version': u'18.09.7', u'Name': u'Engine', u'Details': {u'KernelVersion': u'4.15.0-1063-azure'... (truncated)
compose.service.pull: Pulling beat (myconreg01.azurecr.io/primary-worker-image:latest)...
compose.cli.verbose_proxy.proxy_callable: docker pull <- (u'myconreg01.azurecr.io/primary-worker-image', tag=u'latest', stream=True)
docker.api.image.pull: Looking for auth config
docker.auth.auth.resolve_authconfig: Looking for auth entry for u'myconreg01.azurecr.io'
docker.auth.auth.resolve_authconfig: Found u'myconreg01.azurecr.io'
docker.api.image.pull: Found auth config
ERROR: compose.cli.errors.log_api_error: Get https://myconreg01.azurecr.io/v2/primary-worker-image/manifests/latest: unauthorized: authentication required

@StingyJack You are running your docker-compose command as root (why?), while you run your az acr login command as another user. This command stores the docker registry credentials somewhere in the home directory of your user (~/.docker), therefore they can not be found by the root user.
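One quick way to test that theory is to look at which user's Docker config actually contains an entry for the registry. A rough sketch, assuming credentials are stored inline in config.json (with a credential helper configured the layout may differ); the function name is made up:

```shell
# Hypothetical helper: succeeds if the given Docker config file is readable
# and mentions the registry host as a quoted key.
has_registry_auth() {
  # $1 = path to a Docker config.json, $2 = registry host
  [ -r "$1" ] && grep -qF "\"$2\"" "$1"
}

# Example (not executed here): compare the current user's config with root's.
# has_registry_auth "$HOME/.docker/config.json" myconreg01.azurecr.io
# sudo grep -F '"myconreg01.azurecr.io"' /root/.docker/config.json
```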

@jszanto - I'm not running anything as root, they are all running as the "my-admin" account. Some are elevated, but still running as the "my-admin" account. Needing sudo for this seems to be how docker works "out of the box". I've had quite enough weird, frustrating, and time consuming trouble with docker (both the program and the org) over the last year or two that I'm not terribly interested in investing time into researching and attempting better configurations for it, I need it to _work_ consistently first.

I see the point of confusion I created though: the az acr login command should have had sudo in front of it, because it doesn't have access to read the config.json without it. I've corrected that typo, sorry about that.

For completeness this is the error I get when I forget to sudo the az acr login or the docker-compose commands.

$ az acr login --name myconreg01

An error occurred: DOCKER_COMMAND_ERROR
WARNING: Error loading config file: /home/my-admin/.docker/config.json: stat /home/my-admin/.docker/config.json: permission denied
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.39/containers/json: dial unix /var/run/docker.sock: connect: permission denied

This same command set works in other similar environments, so I don't think I need to change it. It would be nice to know what else may be needed in the environment to be able to connect to ACR.

EDIT: This was due to docker-compose being at an old version, like 1.8 or something like that. Updating to 1.24 fixed the problem.
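A small sketch of a version guard that would have caught this up front. The 1.24.0 threshold is just the version that worked here; the actual minimum version that handles ACR logins correctly is not known from this thread. It relies on GNU `sort -V`, and the function name is hypothetical.

```shell
# Hypothetical helper: succeeds if the installed version is at least the
# required one, compared component-wise via GNU sort -V.
version_at_least() {
  # $1 = installed version, $2 = minimum required version
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example (not executed here):
# installed=$(docker-compose version --short)
# version_at_least "$installed" 1.24.0 || echo "docker-compose $installed is too old"
```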

@MicahMcKittrick-MSFT The issue can be easily reproduced on AKS 1.15.x. Could you please re-open it?

Here to confirm this is still happening in 1.15.7...

Please disregard my earlier request to re-open this. It turned out the tag in question didn't exist. After I fixed the name of the tag, it started to work as intended. @johnwrobinson, this might be happening to you as well.

unauthorized: authentication required for an absent tag is quite confusing though.

Bad (unhelpful, uninformative, or just wrong) error messages should be nontrivial deductions from the offender's compensation.

This still does not work for us. I mean, the behaviour is inconsistent. It works sometimes, and after a few deployments it fails, and if we start the deployment again, it works. No idea what's wrong here.

The original SP has the Contributor role, and the same SPN also has the AcrPull role. If I list the role assignments with --all, I can see the role.

From a previous comment earlier in the thread. My issue came from the tag of the image not being correct. Worth double-checking the repo and image tag exists.

Actually, I can log in with az login using the SP credentials, and az acr login succeeds. However, AKS can't seem to pull the images.

@MartinKosicky - check the SP has the AcrPull role.

@colin-lyman actually I remembered that last week I accidentally updated AKS with a different SP (using Terraform). I repaired it right away, and "aks show" also told me that it was correct. However, it didn't work. Deleting and recreating the whole AKS cluster helped here. It seems that updating the service principal on AKS can really break some things.

I had the same problem now. I verified that the image tag was correct by pulling it on my local machine without problems.
I had scripted the process for granting aks pull access to acr, something copy-pasted from some Microsoft documentation at some point (unfortunately I did not save the link):

# Get the id of the service principal configured for AKS
CLIENT_ID=$(az aks show --resource-group $AKS_RESOURCE_GROUP --name $AKS_CLUSTER_NAME --query "servicePrincipalProfile.clientId" --output tsv)

# Get the ACR registry resource id
ACR_ID=$(az acr show --name $ACR_NAME --resource-group $ACR_RESOURCE_GROUP --query "id" --output tsv)

# Create role assignment
az role assignment create --assignee $CLIENT_ID --role acrpull --scope $ACR_ID

When trying to find the docs back again now, I found this new command (to me at least) here:
https://docs.microsoft.com/en-us/azure/aks/cluster-container-registry-integration

So, I tried that new command instead:

az aks update -n $AKS_CLUSTER_NAME -g $AKS_RESOURCE_GROUP --attach-acr $ACR_NAME

I had to upgrade my azure cli from v 2.0.61 to 2.1.0 to get the update operation available, and got this error after a while:

AAD role propagation done[############################################]  100.0000%request failed: Error occurred in request., RetryError: HTTPSConnectionPool(host='management.azure.com', port=443): Max retries exceeded with url: /subscriptions/<subscription-GUID>/resourceGroups/<myAKS-rgName>/providers/Microsoft.ContainerService/managedClusters/<myClusterName>?api-version=2019-11-01 (Caused by ResponseError('too many 500 error responses',))

...but after deleting my pod and having the deployment spin up a new one after this (no config changes), it was finally able to pull the image.

Not sure when and how this changed. I have pulled images successfully without changing this setup for months, but now all of a sudden it didn't work any more. Hope this helps someone else.

UPDATE:
It seems I was a bit too quick with my conclusion. It only worked transiently the first time after the above command. A transient absence of the error, then. Quite frustrating.

OK. Found my error: The service principal had expired 🤦‍♂ (I wasn't aware/paying attention of the one year expiration time). Now that I know, I can also see it from the logs on the AKS resource in the Azure Portal.

To fix it, I followed the instructions here https://docs.microsoft.com/en-us/azure/aks/update-credentials

Sorry about the noise, and hope this may help someone else landing on this page.
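In case it saves someone the same detour, here is a rough sketch for checking the expiry from the command line rather than the portal logs. The function names are hypothetical, `date -d` assumes GNU date, and the `endDateTime` field name is an assumption that depends on the CLI version (older versions exposed it as `endDate`).

```shell
# Hypothetical helper: succeeds if the given ISO 8601 timestamp is in the
# past (requires GNU date for -d).
is_expired() {
  [ "$(date -u -d "$1" +%s)" -le "$(date -u +%s)" ]
}

# Hypothetical helper: print the expiry timestamps of a service principal's
# credentials. ASSUMPTION: the field is "endDateTime" on current CLI
# versions; older versions named it "endDate".
sp_credential_expiry() {
  # $1 = service principal client ID
  az ad sp credential list --id "$1" --query "[].endDateTime" --output tsv
}

# Example (not executed here), reusing the clientId query from earlier in
# the thread:
# CLIENT_ID=$(az aks show --resource-group "$AKS_RESOURCE_GROUP" \
#   --name "$AKS_CLUSTER_NAME" \
#   --query "servicePrincipalProfile.clientId" --output tsv)
# for ts in $(sp_credential_expiry "$CLIENT_ID"); do
#   is_expired "$ts" && echo "credential expired at $ts"
# done
```

If a credential turns out to be expired, the update-credentials doc linked above describes the reset procedure.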

Thanks Pal! You really saved my day! That is exactly the same error as in my case.
