Nomad: Isolated Fork/Exec driver problem - Permission denied

Created on 22 Mar 2016 · 8 comments · Source: hashicorp/nomad

Nomad version

Nomad 0.3.0 and 0.3.1

Operating system and Environment details

Amazon Linux running the following kernel version:

Linux 4.1.19-24.31.amzn1.x86_64 x86_64 GNU/Linux

Issue

There seems to be an issue when executing a job that uses the isolated exec driver. I get "permission denied" when running simple jobs that, for example, ping a website. I started a thread in the Google Group, and @dadgar suggested moving the conversation to an issue. At the time of that thread I was using Nomad 0.3.0; we have since moved to 0.3.1 and the same thing is happening.

$ nomad alloc-status 619ea328
ID              = 619ea328
Eval ID         = dfdf0bf7
Name            = staging-health.daemon[0]
Node ID         = 8b976f92
Job ID          = staging-health
Client Status   = failed
Evaluated Nodes = 10
Filtered Nodes  = 6
Exhausted Nodes = 0
Allocation Time = 179.833µs
Failures        = 0

==> Task "health" is "dead"
Recent Events:
Time                   Type               Description
21/03/16 18:26:31 VET  Restarts Exceeded  Task exceeded restart policy
21/03/16 18:26:31 VET  Driver Failure     error starting process via the plugin: error starting command: fork/exec /bin/ping: permission denied
21/03/16 18:26:29 VET  Received           Task received by client

==> Status
Allocation "619ea328" status "failed" (6/10 nodes filtered)
  * Class "dev" filtered 2 nodes
  * Class "prod" filtered 4 nodes
  * Constraint "${node.class} = staging" filtered 6 nodes
  * Score "8b976f92-2bfa-83bf-458c-7a9159006400.binpack" = 17.857757
  * Score "d4b067c4-b831-db76-8256-f9e8959fa8ae.binpack" = 17.476360
  * Score "52b3d380-ffa2-e6b9-1761-12d646cb4511.binpack" = 1.460294
  * Score "88999bee-acb8-8399-da9c-b98002ff00d5.binpack" = 1.460294

==> Task Resources
Task: "health"
CPU  Memory MB  Disk MB  IOPS  Addresses
20   10         300      0     http: 172.17.18.167:20708

Job file

job "staging-health" {
  type     = "service"
  priority = 50

  constraint {
    attribute = "${node.class}"
    value     = "staging"
  }

  update {
    stagger      = "30s"
    max_parallel = 1
  }

  group "ping" {
    count = 1

    restart {
      attempts = 15
      delay    = "15s"
      interval = "5m"
      mode     = "delay"
    }

    task "health" {
      driver = "exec"

      config {
        command = "/bin/ping"
        args    = ["-c", "20", "google.com"]
      }

      resources {
        cpu    = 20
        memory = 10

        network {
          mbits = 2
          port "http" {
          }
        }
      }
    }
  }
}

I hope this is useful! Thanks.

Labels: theme/client, type/bug

All 8 comments

What AMI are you using? I could not reproduce this on a recent Amazon Linux AMI.

Sorry, I forgot to mention this: I'm running amzn-ami-hvm-2015.09.2.x86_64-gp2, but yum update has been run, so the kernel and packages were updated.

Is #1009 related to this?

@consultantRR I do not believe they are related. Is there something you are seeing that leads you to think so? It may help with debugging.

With the following drivers:
exec, java

running on RHEL 6.5, we get the same error (permission denied):

10/20/16 11:12:09 CEST Driver Failure failed to start task 'config' for alloc '83253d09-b591-7be7-b486-3714d04fc859': fork/exec /usr/bin/java: permission denied

Setting a user made no difference.

raw_exec ran without problems when no user was set, but when a user was configured in the task, it failed with the same error as above.

@czerwina can you check the alloc directory permissions for the user you're using? For example, temporarily run chmod o+rwX -R on the entire alloc dir. Make sure to revert to sane settings after testing.
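
The recursive chmod suggested above can be tried on a scratch directory first to see what it changes. The following is only a sketch under stated assumptions: `loosen_for_others`, `tighten_for_others`, and `other_bits` are hypothetical helper names, the scratch directory stands in for a real allocation directory (whose location depends on the client's data directory), and `o-rwx` is just one possible stricter setting, not necessarily what the directory had before the test.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Nomad.

# Grant "other" read/write, plus execute only on directories
# (and on files that already have an execute bit): that is what capital X does.
loosen_for_others() {
    chmod -R o+rwX "$1"
}

# Remove all "other" permissions again. This is an assumed "sane" setting;
# record the real original modes if they need to be restored exactly.
tighten_for_others() {
    chmod -R o-rwx "$1"
}

# Print the three "other" permission characters of a path (columns 8-10 of ls -l).
other_bits() {
    ls -ld "$1" | cut -c8-10
}

# Scratch directory standing in for an allocation directory.
scratch=$(mktemp -d)
mkdir "$scratch/local"
chmod 750 "$scratch/local"
touch "$scratch/local/app.jar"
chmod 660 "$scratch/local/app.jar"

loosen_for_others "$scratch"
dir_during=$(other_bits "$scratch/local")
file_during=$(other_bits "$scratch/local/app.jar")

tighten_for_others "$scratch"
dir_after=$(other_bits "$scratch/local")
file_after=$(other_bits "$scratch/local/app.jar")

echo "loosened:  dir=$dir_during file=$file_during"
echo "tightened: dir=$dir_after file=$file_after"

rm -rf "$scratch"
```

If the task starts once the directory is loosened, the failure is most likely a plain file-permission problem rather than something specific to the driver.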

I think I found a cause of the issue. I get something like this:
03/14/17 14:04:51 CET Driver Failure failed to start task "app" for alloc "54a0c7cb-ca12-43b1-0bb7-057ab60940c6": failed to start command path="usr/lib/jvm/java-1.7.0-openjdk-1.7.0.131.x86_64/jre/bin/java" --- args=["usr/lib/jvm/java-1.7.0-openjdk-1.7.0.131.x86_64/jre/bin/java" "-Xmx512m" "-Xms256m" "-Dserver.port=42693" "-jar" "/tmp/app.jar"]: fork/exec usr/lib/jvm/java-1.7.0-openjdk-1.7.0.131.x86_64/jre/bin/java: permission denied

When I looked into the tmp directory inside the allocation, I saw:
-rw-rw---- 1 root root 45343103 Mar 14 13:02 app.jar

and then, in executor.out:
2017/03/14 13:06:45.765696 [DEBUG] executor: launching command /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.131.x86_64/jre/bin/java -Xmx512m -Xms256m -Dserver.port=32386 -jar /tmp/app.jar
2017/03/14 13:06:45.765716 [DEBUG] executor: running command as nobody

The crucial part is "executor: running command as nobody", combined with the lack of read permission for nobody :)
-rw-rw---- 1 root root 45343103 Mar 14 13:02 app.jar

Unfortunately, I don't know how to add read permission in the artifact stanza :(
Any help appreciated :)
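
The diagnosis above comes down to one bit: the file is mode 0660 and owned by root, so a process running as nobody falls under "other" and cannot read it. Below is a minimal sketch of that check, run against a scratch file rather than a real allocation directory. `readable_by_others` is a hypothetical helper name, and the check looks only at the file's mode bits; it ignores ACLs and the permissions of parent directories.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of Nomad.
# Succeeds if the "other" read bit is set (column 8 of ls -l output).
readable_by_others() {
    [ "$(ls -ld "$1" | cut -c8)" = "r" ]
}

# Scratch file standing in for the downloaded app.jar.
f=$(mktemp)
chmod 660 "$f"    # same mode as in the listing: -rw-rw----

if readable_by_others "$f"; then before=yes; else before=no; fi

chmod o+r "$f"    # add read permission for "other"

if readable_by_others "$f"; then after=yes; else after=no; fi

echo "mode 660 readable by others: $before"
echo "after chmod o+r:             $after"

rm -f "$f"
```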

Is this on the roadmap to be fixed?
