Nomad: Nomad auth for private ECR repo not working

Created on 9 Nov 2017 · 22 comments · Source: hashicorp/nomad

Nomad version

0.5.6 and 0.7.0

Operating system and Environment details

CentOS 7.3 and CentOS 7.4 (on-premise datacenter)

Issue

Nomad is not picking up credentials from docker-credential-ecr-login. I've followed this documentation: https://www.nomadproject.io/docs/drivers/docker.html#authentication and it is NOT working.

Reproduction steps

  • Create a docker-cfg file with the following contents:
{
    "credHelpers": {
        "<XYZ>.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
    },
    "credsStore": "ecr-login"
}
  • Configure the Nomad client to use the helper:
client {
  enabled   = true
  options   = {
    "docker.auth.config"     = "/root/.docker/config.json"
    "docker.auth.helper"     = "ecr-login"
  }
}
  • Use the job file below.

Nomad Server logs (if appropriate)

N/A

Nomad Client logs (if appropriate)

Driver Failure   failed to initialize task "test-api" for alloc "c828eb7b-e396-d947-0d60-f6cf2064120d": Failed to find docker auth for repo "<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api": docker-credential-ecr-login with input "https://<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api" failed with stderr: 2017-11-09T06:22:10Z [ERROR] Error retrieving credentials: NoCredentialProviders: no valid providers in chain. Deprecated.
    For verbose messaging see aws.Config.CredentialsChainVerboseErrors
credentials not found in native keychain 

Job file (if appropriate)

job "test-api" {
  region = "us-west-2"
  datacenters = ["us-west-2"]
  type = "service"

  constraint {
      attribute = "${node.class}"
      value = "test-worker"
  }

  update {
   stagger      = "15s"
   max_parallel = 1
  }

  group "web" {
    # Specify the number of these tasks we want.
    count = 1

    # Create an individual task (unit of work). This particular
    # task utilizes a Docker container to front a web application.
    task "test-api" {
      # Specify the driver to be "docker". Nomad supports
      # multiple drivers.
      driver = "docker"
      # Configuration is specific to each driver.
      config {
        image = "<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api"

        port_map {
            http = 8080
        }
        labels {
            service = "${NOMAD_JOB_NAME}"
        }

        logging {
          type = "syslog"
          config {
            tag = "test-api"
          }
        }
      }

      # The service block tells Nomad how to register this service
      # with Consul for service discovery and monitoring.
      service {
        name = "${JOB}"
        # This tells Consul to monitor the service on the port
        # labeled "http".
        port = "http"

        check {
          type     = "http"
          path     = "/v1/status"
          interval = "20s"
          timeout  = "2s"
        }
      }
      # Specify the maximum resources required to run the job,
      # including CPU, memory, and bandwidth.
      resources {
        cpu    = 300 # MHz
        memory = 2048 # MB

        network {
          mbits = 1
          port "http" {}
        }
      }
    }
  }
}

Note: I've tried this with Nomad version 0.5.6 as well and am receiving the following error:

Driver Failure  failed to initialize task "test-api" for alloc "ed86bb06-29ba-a31b-a006-e8633b075d90": Failed to pull `<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api`: unauthorized: authentication required. 

Am I doing something wrong?

Labels: theme/client, theme/driver/docker, type/question


All 22 comments

Is docker-credential-ecr-login on Nomad's $PATH? Does using "docker.auth.helper" = "ecr" as in the docs work?

Nomad and docker-credential-ecr-login are in /usr/local/bin/.
Tried "docker.auth.helper" = "ecr" and "docker.auth.helper" = "ecr-login", but neither works; both throw the following error:

11/09/17 21:26:27 EST  Driver Failure   failed to initialize task "test-api" for alloc "a22ad474-0a09-24a8-1415-5287d123ddac": Failed to find docker auth for repo "<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api": docker-credential-ecr-login with input "https://<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api" failed with stderr: 2017-11-10T02:26:27Z [ERROR] Error retrieving credentials: NoCredentialProviders: no valid providers in chain. Deprecated.
        For verbose messaging see aws.Config.CredentialsChainVerboseErrors
credentials not found in native keychain

FYI, docker pull <XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api works as expected.
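When debugging this kind of failure, it can help to exercise the credential-helper protocol by hand. Nomad (like docker) runs `docker-credential-<name> get` with the registry URL on stdin and expects a JSON credential on stdout. The stub below is purely illustrative, to show the contract that docker-credential-ecr-login implements; the stub name, path, and credential values are made up.

```shell
# Build a stub helper implementing the "get" verb of the docker
# credential-helper protocol: registry URL on stdin, JSON credential
# on stdout. docker-credential-ecr-login follows the same contract
# against ECR. All names and values here are illustrative.
cat > /tmp/docker-credential-stub <<'EOF'
#!/bin/sh
read -r server
printf '{"ServerURL":"%s","Username":"AWS","Secret":"stub-token"}\n' "$server"
EOF
chmod +x /tmp/docker-credential-stub

# Invoke it the same way Nomad does when resolving auth for a pull:
echo "123456789012.dkr.ecr.us-west-2.amazonaws.com" | /tmp/docker-credential-stub get
```

Running the real helper the same way (piping the registry into `docker-credential-ecr-login get`) as the same user Nomad runs as is a quick way to reproduce the NoCredentialProviders error outside of Nomad.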

@ptarpan Can you try:

{
    "credHelpers": {
        "<XYZ>.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
    }
}

And

client {
  enabled   = true
  options   = {
    "docker.auth.config"     = "/root/.docker/config.json"
  }
}

Hey @dadgar,

Same issue:

Time                   Type            Description
11/10/17 15:48:22 EST  Restarting      Task restarting in 18.630682115s
11/10/17 15:48:22 EST  Driver Failure  failed to initialize task "test-api" for alloc "51f5de15-7892-e04a-058f-c01b27ff057e": Failed to pull `<XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api`: unauthorized: authentication required
11/10/17 15:48:21 EST  Driver          Downloading image <XYZ>.dkr.ecr.us-west-2.amazonaws.com/test-api:latest
11/10/17 15:48:21 EST  Task Setup      Building Task Directory
11/10/17 15:48:21 EST  Received        Task received by client

If it's any help: I use ECR repos by running a cron script, which works as expected with the following configuration:

Cron:

#!/bin/bash
eval $(/usr/local/bin/aws ecr get-login --region us-east-1)

Nomad Client Config (portion):

"options":{"docker.auth.config":"/root/.docker/config.json"

IAM Role Permissions:

"ecr:GetAuthorizationToken",
"ecr:DescribeRepositories",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:GetRepositoryPolicy",
"ecr:ListImages",
"ecr:DescribeImages",
"ecr:BatchGetImage"

Thanks @jrasell. We use cron with instance roles in AWS, but this is an issue for agents running in an on-premise datacenter. One quick question before I continue testing this on-prem: does the eval live in a separate script that cron calls, or does the crontab invoke the AWS CLI directly?

I can confirm this is happening as well. Adding the AWS keys as environment variables in the systemd service fixed the issue for me. It seems Nomad isn't reading AWS credentials from IAM roles or the ~/.aws/credentials file.

My config files:

/etc/systemd/system/nomad.service

[Unit]
Description=Nomad
Documentation=https://nomadproject.io/docs/

[Service]
Environment=AWS_ACCESS_KEY_ID=<secret>
Environment=AWS_SECRET_ACCESS_KEY=<secret>
Environment=AWS_DEFAULT_REGION=us-east-1
ExecStart=/usr/bin/nomad agent -config /etc/nomad.d
ExecReload=/bin/kill -HUP $MAINPID
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

/etc/nomad.d/config.json

{
  "client": {
    "enabled": true,
    "options": {
      "docker.auth.config": "/root/.docker/config.json"
    }
  ...
}

/root/.docker/config.json

{
  "credHelpers": {
    "my.ecr.URL.amazonaws.com": "ecr-login"
  }
}

I got this working now on Amazon Linux and nomad _0.7.1_:

  1. Put docker-credential-ecr-login in /usr/bin - ensure everyone can execute it!
  2. Put the docker configuration in /etc/docker/config.json - ensure everyone has read access to the file!
  3. Reference the config.json in the nomad configuration (default.hcl).
  4. Ensure the instance has read access to ECR via an IAM policy.

Don't be confused by the nomad user account on the instance and the Linux init system running nomad:

  • Putting the docker config in the /root/ folder causes access problems.
  • The file ~/.aws/credentials is not used by the init system.

Docker Config - config.json:

{
  "credHelpers": {
    "my.ecr.URL.amazonaws.com": "ecr-login"
  }
}

Nomad Config - default.hcl:

...
client {
  enabled = true

  options   = {
    "docker.auth.config"     = "/etc/docker/config.json"
    "docker.auth.helper"     = "ecr-login"
  }
}
...

Fixed this for us using @MatthiasScholz's remarks, thanks!

@MatthiasScholz Sorry to resurrect this old thread. May I ask whether you were able to pull mixed images? What we are seeing is that, using your config, we can pull ECR images, but pulling images from Docker Hub that don't require auth fails because Nomad tries to authenticate them through the ECR helper:

05/29/18 13:03:12 UTC  Driver Failure  failed to initialize task "mysql" for alloc "73668dda-14c3-a8ec-d421-6ac9de16c42d": Failed to find docker auth for repo "registry.hub.docker.com/library/mysql": docker-credential-ecr-login with input "https://registry.hub.docker.com/library/mysql" failed with stderr: 2018-05-29T13:03:12Z [ERROR] Error parsing the serverURL: https://registry.hub.docker.com/library/mysql, error: docker-credential-ecr-login can only be used with Amazon Elastic Container Registry.

Is there a way to mix ECR private images and Public images?

It is a valid question. We only use ECR since we want a bit more control over the images we use.

If I read the documentation of the AWS ECR helper right:

With Docker 1.13.0 or greater, you can configure Docker to use different credential helpers for different registries. To use this credential helper for a specific ECR registry, create a credHelpers section with the URI of your ECR registry.

Then mixed image pulling should be supported.

Did you try to play a little bit around with the configuration? Like:

{
  "credHelpers": {
    "registry.example.com": "registryhelper",
    "awesomereg.example.org": "hip-star",
    "unicorn.example.io": "vcbait"
  }
}

mentioned in the Docker documentation?

I am not 100% sure it will work out, since there is still the default.hcl configuration mentioned above. I have not tested it yet.
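If per-registry helpers work the way the Docker docs describe, a config that lists only the ECR registry in credHelpers, with no global credsStore and no "docker.auth.helper" client option, should route only ECR pulls through ecr-login and leave Docker Hub pulls anonymous. A sketch (the account ID is a placeholder, and this is untested, as noted above):

```
{
  "credHelpers": {
    "123456789012.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
  }
}
```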

@MatthiasScholz

I solved this temporarily:

config {
  image = "registry.hub.docker.com/library/mysql:5.7.19"
  auth {
    username = "xxxxxxxx"
    password = "yyyyyyy"
  }
  force_pull = true

  volumes = [
    "/opt/mysql:/var/lib/mysql"
  ]
  port_map {
    mysql_port = 3306
  }
}

It is not ideal but works for now.

This has been fixed in #4266 and released in 0.8.4-rc1. Nomad now correctly uses the AWS ECR helper as configured in your docker configuration.
Apologies for missing this issue in the list of ones this PR fixed.

@nickethier Is there also a summary on how to configure it properly now? I've updated to 0.8.4 and still can't figure it out.
Got my /root/.docker/config.json as:

{
  "credHelpers": {
    "my.ecr.URL.amazonaws.com": "ecr-login"
  }
}

And AWS credentials are also set up. When I do a manual pull it all works fine, so why doesn't Nomad just pick up these settings? Do I really also need to set the Nomad client properties? And what should they be? (Why are they needed at all, I wonder, but ok.)

Any directions or a working sample would be welcome, as this is all confusing as hell...

Do not put the setup in the /root/ directory - only the root user has access to it then.
Nomad is not running as root.

Please check my comment above for the correct folders to use for the configuration.

Ok got it in the end.
So I moved the docker config into /etc/docker/ and then added
"docker.auth.config" = "/etc/docker/config.json" to my nomad config.

"docker.auth.helper" = "ecr-login" is not needed as that would probably make all docker pulls use ecr-login.

The missing link in the end was that the init system doesn't use the aws credentials file so you need to add the credential environment variables.

Finally! Thanks for the support 👍

@ptarpan Can you try:

{
    "credHelpers": {
        "<XYZ>.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
    }
}

And

client {
  enabled   = true
  options   = {
    "docker.auth.config"     = "/root/.docker/config.json"
  }
}

Do I need to put the client section in my Nomad client configuration or in the Nomad server configuration stanza? Please clarify.

Ok got it in the end.
So I moved the docker config into /etc/docker/ and then added
"docker.auth.config" = "/etc/docker/config.json" to my nomad config.

"docker.auth.helper" = "ecr-login" is not needed as that would probably make all docker pulls use ecr-login.

The missing link in the end was that the init system doesn't use the aws credentials file so you need to add the credential environment variables.

Finally! Thanks for the support 👍

Sorry for quoting it. Did you move docker's config.json on the client machine and update the client.hcl file on the client machine? Please clarify.

@nattvasan
Yes, it's all configured in the Nomad client part of the config. (The server doesn't execute jobs 😉)

I ended up putting my AWS credentials in an environment file for my systemd setup for Nomad.
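A sketch of that systemd setup, assuming a drop-in override for the Nomad unit (the file paths and env file name are examples):

```
# /etc/systemd/system/nomad.service.d/override.conf (created via
# "systemctl edit nomad"); the env file holds AWS_ACCESS_KEY_ID,
# AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION.
[Service]
EnvironmentFile=/etc/nomad.d/aws.env
```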

@momania Yes, I figured that out lately. I'm new to Nomad! I figured out the issue and it's resolved!

I ended up putting my AWS credentials in an environment file for my systemd setup for Nomad.

Yes, I did the same !

https://github.com/hashicorp/nomad/issues/3526#issuecomment-374890827 is definitely the way to go.

Thanks.

I can't make this work for the case where we are assuming a role with docker-credential-ecr-login, as documented here: https://github.com/awslabs/amazon-ecr-credential-helper/issues/34.

"If you are working with an assumed role please set the environment variable: AWS_SDK_LOAD_CONFIG=true also."

I don't know where to put the AWS_SDK_LOAD_CONFIG and AWS_PROFILE env variables. I tried everything:

  1. Setting the env variables in the docker systemd unit.
  2. Setting the env variables in the nomad systemd unit.
  3. Setting the env variables in /etc/profile.d/. This only works on the command line, not with nomad.
  4. Setting the env variables in the nomad job specification and using them as args in the docker config/args stanza.
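For reference, the helper inherits its environment from whichever process execs it, which for Nomad-driven pulls is the Nomad client agent, so option 2 is the one that should matter. A sketch of such a unit override (the profile name is a placeholder, and the reporter tried this without success, so there may be more to the assumed-role case):

```
# Hypothetical systemd override for the Nomad unit; AWS_PROFILE names a
# profile in the nomad user's ~/.aws/config that carries a role_arn.
[Service]
Environment=AWS_SDK_LOAD_CONFIG=true
Environment=AWS_PROFILE=assumed-role-profile
```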