Pipelines: Example is trying to mount hostPath for docker in docker

Created on 18 Dec 2018 · 18 Comments · Source: kubeflow/pipelines

A user reported this problem in this thread:
https://groups.google.com/forum/#!topic/kubeflow-discuss/5Y_7lhoQLIo

The example is failing because it is trying to mount the docker socket via hostPath.

They are running this example:
https://github.com/kubeflow/pipelines/blob/master/samples/notebooks/Lightweight%20Python%20components%20-%20basics.ipynb

The pod spec is below. The spec shows that it is trying to mount the docker socket. I'm guessing this is for docker in docker to build containers.

I'm not sure where this is coming from. The example in the notebook isn't explicitly building containers, so I'm not sure why it would need docker in docker.

Are Kubeflow pipelines always doing docker in docker?

apiVersion: v1
kind: Pod
metadata:
  annotations:
    openshift.io/scc: privileged
    workflows.argoproj.io/node-name: pipeline-flip-coin-xlkfl.flip
    workflows.argoproj.io/outputs: >-
      {"parameters":[{"name":"flip-output","value":"tails","valueFrom":{"path":"/tmp/output"}}],"artifacts":[{"name":"mlpipeline-ui-metadata","path":"/mlpipeline-ui-metadata.json","s3":{"endpoint":"minio-service.kubeflow:9000","bucket":"mlpipeline","insecure":true,"accessKeySecret":{"name":"mlpipeline-minio-artifact","key":"accesskey"},"secretKeySecret":{"name":"mlpipeline-minio-artifact","key":"secretkey"},"key":"runs/30850dfb-0180-11e9-bd47-063a66a580a8/pipeline-flip-coin-xlkfl-3596557372/mlpipeline-ui-metadata.tgz"}},{"name":"mlpipeline-metrics","path":"/mlpipeline-metrics.json","s3":{"endpoint":"minio-service.kubeflow:9000","bucket":"mlpipeline","insecure":true,"accessKeySecret":{"name":"mlpipeline-minio-artifact","key":"accesskey"},"secretKeySecret":{"name":"mlpipeline-minio-artifact","key":"secretkey"},"key":"runs/30850dfb-0180-11e9-bd47-063a66a580a8/pipeline-flip-coin-xlkfl-3596557372/mlpipeline-metrics.tgz"}}]}
    workflows.argoproj.io/template: >-
      {"name":"flip","inputs":{},"outputs":{"parameters":[{"name":"flip-output","valueFrom":{"path":"/tmp/output"}}],"artifacts":[{"name":"mlpipeline-ui-metadata","path":"/mlpipeline-ui-metadata.json","s3":{"endpoint":"minio-service.kubeflow:9000","bucket":"mlpipeline","insecure":true,"accessKeySecret":{"name":"mlpipeline-minio-artifact","key":"accesskey"},"secretKeySecret":{"name":"mlpipeline-minio-artifact","key":"secretkey"},"key":"runs/30850dfb-0180-11e9-bd47-063a66a580a8/pipeline-flip-coin-xlkfl-3596557372/mlpipeline-ui-metadata.tgz"}},{"name":"mlpipeline-metrics","path":"/mlpipeline-metrics.json","s3":{"endpoint":"minio-service.kubeflow:9000","bucket":"mlpipeline","insecure":true,"accessKeySecret":{"name":"mlpipeline-minio-artifact","key":"accesskey"},"secretKeySecret":{"name":"mlpipeline-minio-artifact","key":"secretkey"},"key":"runs/30850dfb-0180-11e9-bd47-063a66a580a8/pipeline-flip-coin-xlkfl-3596557372/mlpipeline-metrics.tgz"}}]},"metadata":{},"container":{"name":"","image":"python:alpine3.6","command":["sh","-c"],"args":["python
      -c \"import random; result = 'heads' if random.randint(0,1) == 0 else
      'tails'; print(result)\" | tee
      /tmp/output"],"resources":{}},"archiveLocation":{}}
  creationTimestamp: '2018-12-16T22:16:09Z'
  labels:
    workflows.argoproj.io/completed: 'true'
    workflows.argoproj.io/workflow: pipeline-flip-coin-xlkfl
  name: pipeline-flip-coin-xlkfl-3596557372
  namespace: kubeflow
  ownerReferences:
    - apiVersion: argoproj.io/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: Workflow
      name: pipeline-flip-coin-xlkfl
      uid: 30850dfb-0180-11e9-bd47-063a66a580a8
  resourceVersion: '14833825'
  selfLink: /api/v1/namespaces/kubeflow/pods/pipeline-flip-coin-xlkfl-3596557372
  uid: 309010c0-0180-11e9-ac4e-0abcca1e707a
spec:
  containers:
    - args:
        - >-
          python -c "import random; result = 'heads' if random.randint(0,1) == 0
          else 'tails'; print(result)" | tee /tmp/output
      command:
        - sh
        - '-c'
      image: 'python:alpine3.6'
      imagePullPolicy: IfNotPresent
      name: main
      resources: {}
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: pipeline-runner-token-wffsv
          readOnly: true
    - args:
        - wait
      command:
        - argoexec
      env:
        - name: ARGO_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
      image: 'argoproj/argoexec:v2.2.1'
      imagePullPolicy: IfNotPresent
      name: wait
      resources: {}
      securityContext:
        privileged: false
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /argo/podmetadata
          name: podmetadata
        - mountPath: /var/lib/docker
          name: docker-lib
          readOnly: true
        - mountPath: /var/run/docker.sock
          name: docker-sock
          readOnly: true
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: pipeline-runner-token-wffsv
          readOnly: true
  dnsPolicy: ClusterFirst
  imagePullSecrets:
    - name: pipeline-runner-dockercfg-xpbn2
  nodeName: ip-10-0-48-147.us-east-2.compute.internal
  nodeSelector:
    node-role.kubernetes.io/compute: 'true'
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: pipeline-runner
  serviceAccountName: pipeline-runner
  terminationGracePeriodSeconds: 30
  volumes:
    - downwardAPI:
        defaultMode: 420
        items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.annotations
            path: annotations
      name: podmetadata
    - hostPath:
        path: /var/lib/docker
        type: Directory
      name: docker-lib
    - hostPath:
        path: /var/run/docker.sock
        type: Socket
      name: docker-sock
    - name: pipeline-runner-token-wffsv
      secret:
        defaultMode: 420
        secretName: pipeline-runner-token-wffsv
status:
  conditions:
    - lastProbeTime: null
      lastTransitionTime: '2018-12-16T22:16:09Z'
      reason: PodCompleted
      status: 'True'
      type: Initialized
    - lastProbeTime: null
      lastTransitionTime: '2018-12-16T22:16:09Z'
      reason: PodCompleted
      status: 'False'
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: '2018-12-16T22:16:09Z'
      status: 'True'
      type: PodScheduled
  containerStatuses:
    - containerID: >-
        docker://bc66e85bce78f14247b325b421ae321b1e5bc27c14fcab4b8c27d749f7690810
      image: 'docker.io/python:alpine3.6'
      imageID: >-
        docker-pullable://docker.io/python@sha256:766a961bf699491995cc29e20958ef11fd63741ff41dcc70ec34355b39d52971
      lastState: {}
      name: main
      ready: false
      restartCount: 0
      state:
        terminated:
          containerID: >-
            docker://bc66e85bce78f14247b325b421ae321b1e5bc27c14fcab4b8c27d749f7690810
          exitCode: 0
          finishedAt: '2018-12-16T22:16:15Z'
          reason: Completed
          startedAt: '2018-12-16T22:16:15Z'
    - containerID: >-
        docker://4dcbf5229f61a04b842281b01bc102789228c7519583c33c1c62ef2324a2830e
      image: 'docker.io/argoproj/argoexec:v2.2.1'
      imageID: >-
        docker-pullable://docker.io/argoproj/argoexec@sha256:9b12553aa7dccddc88c766d3dd59f4e8758acbd82ceef9e7aedc75f09934480a
      lastState: {}
      name: wait
      ready: false
      restartCount: 0
      state:
        terminated:
          containerID: >-
            docker://4dcbf5229f61a04b842281b01bc102789228c7519583c33c1c62ef2324a2830e
          exitCode: 0
          finishedAt: '2018-12-16T22:16:16Z'
          reason: Completed
          startedAt: '2018-12-16T22:16:16Z'
  hostIP: 10.0.48.147
  phase: Succeeded
  podIP: 10.129.2.12
  qosClass: BestEffort
  startTime: '2018-12-16T22:16:09Z'
area/samples kind/bug priority/p1

All 18 comments

Argo mounts the docker socket so that it can use "docker cp" to copy artifacts out of a container.
https://github.com/argoproj/argo/blob/master/workflow/controller/workflowpod.go#L48
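
To check a pod for this kind of mount, here is a minimal Python sketch. It is not part of Argo or KFP, and the function name `host_path_mounts` is made up for illustration. It assumes the pod manifest has already been loaded into a dict; the sample below is trimmed by hand from the pod spec in the issue description.

```python
from typing import Dict, List


def host_path_mounts(pod: dict) -> Dict[str, List[str]]:
    """Map each hostPath volume's host path to the containers that mount it."""
    spec = pod.get("spec", {})
    # Volume name -> host path, for hostPath volumes only.
    host_volumes = {
        v["name"]: v["hostPath"]["path"]
        for v in spec.get("volumes", [])
        if "hostPath" in v
    }
    result: Dict[str, List[str]] = {path: [] for path in host_volumes.values()}
    for container in spec.get("containers", []):
        for mount in container.get("volumeMounts", []):
            if mount["name"] in host_volumes:
                result[host_volumes[mount["name"]]].append(container["name"])
    return result


# Hand-trimmed copy of the pod spec from the issue description.
pod = {
    "spec": {
        "containers": [
            {"name": "main", "volumeMounts": [
                {"name": "pipeline-runner-token-wffsv",
                 "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"},
            ]},
            {"name": "wait", "volumeMounts": [
                {"name": "podmetadata", "mountPath": "/argo/podmetadata"},
                {"name": "docker-lib", "mountPath": "/var/lib/docker"},
                {"name": "docker-sock", "mountPath": "/var/run/docker.sock"},
            ]},
        ],
        "volumes": [
            {"name": "podmetadata", "downwardAPI": {}},
            {"name": "docker-lib",
             "hostPath": {"path": "/var/lib/docker", "type": "Directory"}},
            {"name": "docker-sock",
             "hostPath": {"path": "/var/run/docker.sock", "type": "Socket"}},
            {"name": "pipeline-runner-token-wffsv",
             "secret": {"secretName": "pipeline-runner-token-wffsv"}},
        ],
    }
}

for host_path, containers in host_path_mounts(pod).items():
    print(host_path, "is mounted by", containers)
```

In this spec only the Argo "wait" sidecar mounts the hostPath volumes; the "main" container that runs the user's code does not.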

I think this is the default behavior for OpenShift. The user needs to relax the security constraint explicitly: https://docs.okd.io/latest/admin_guide/manage_scc.html#use-the-hostpath-volume-plugin

Thanks @hongye-sun. Does pipelines depend on this behavior of copying out the artifact using docker cp? Could pipelines instead just use a volume (e.g. emptyDir) to share data between containers?
Making the docker socket available to the pod seems like an undesirable escalation of privileges.

/cc @ioandr @vkoukis @pdmack @jessesuen

Yes, we rely heavily on this behavior to get component outputs and upload pipeline artifacts. Currently, argo doesn't support other ways to copy file content from the main container. We might consider using the k8s API to copy the file content by implementing the copy methods in argo's k8s API executor. That requires non-trivial work.

Does it only affect OpenShift? From a web search, I don't see reports of similar issues on other providers (AWS and Azure).

/cc @Ark-kun

This is a more relevant bug in argo: https://github.com/argoproj/argo/issues/970
It looks like the Argo team is planning to take care of this.

This also breaks all workflows that need to run on a k8s cluster which doesn't use docker. My current use case is running argo inside k3s, which uses containerd as the pod executor.

We've now upgraded to Argo 2.3. AFAIK there are many improvements to different executors. Let's check whether switching the executor fixes the problem.

I'm running Kubeflow v0.6.2. Pipelines is still trying to mount hostPath:
Invalid value: "hostPath": hostPath volumes are not allowed to be used

Pipelines still trying to mount hostPath:

What Kubernetes environment do you use? Does this Argo sample work for you? https://github.com/argoproj/argo/blob/master/examples/artifact-passing.yaml

If you're using a Docker-less environment, the first step would be to change the Argo workflow controller configuration to a non-Docker executor. See this thread: https://github.com/kubeflow/pipelines/issues/1654
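
As a rough illustration of that configuration change, here is a small Python sketch. It assumes the controller configuration has already been parsed into a dict and that the executor is selected by a top-level `containerRuntimeExecutor` key; the surrounding keys and the helper name `with_executor` are placeholders for illustration, not a copy of any real deployment. Which executor actually works (for example "pns", suggested elsewhere in this thread) depends on the cluster.

```python
import copy
import json


def with_executor(controller_config: dict, executor: str) -> dict:
    """Return a copy of the controller config with the executor replaced."""
    updated = copy.deepcopy(controller_config)
    updated["containerRuntimeExecutor"] = executor
    return updated


# Placeholder config; bucket and endpoint are the values visible in the
# pod annotations in the issue description, the layout is an assumption.
current = {
    "containerRuntimeExecutor": "docker",
    "artifactRepository": {
        "s3": {
            "bucket": "mlpipeline",
            "endpoint": "minio-service.kubeflow:9000",
            "insecure": True,
        }
    },
}

updated = with_executor(current, "pns")
print(json.dumps(updated, indent=2))
```

Copying the dict rather than editing it in place keeps the other settings, such as the artifact repository, untouched, so the only difference to review is the executor line.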

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

Hi @Ark-kun I just had a look at this and the referenced argo issue. Is my assumption correct, that this ticket is not solved yet?

We are currently deploying KFP 1.0 and it seems that hostPath volumes are still required:

This step is in Error state with this message: pods "conditional-execution-pipeline-with-exit-handler-tnpv5-1956183255" is forbidden: unable to validate against any pod security policy: [spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used]

We are using k8s 1.14 with docker.

On the other hand, we were able to deploy argo directly, and only emptyDir was required AFAIK. Argo even seems to offer an option for putting the logs on a specific persistent volume, but this is not fully verified. (Please ignore this paragraph; I mixed it up with airflow.)

Thanks in advance!

/reopen

@Jeffwan: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

The argo version in v1.1 still has the issue. This blocks one use case on EKS: we cannot deploy kubeflow pipelines on EKS Fargate, since Fargate doesn't support HostPath yet.

I am running a local cluster using kind and getting the same error. Here is what I get when I describe my pod using kubectl describe pod file-passing-pipelines-cclzh-2358551148 -n kubeflow:

Name:           file-passing-pipelines-cclzh-2358551148
Namespace:      kubeflow
Priority:       0
Node:           kind-worker/172.19.0.2
Start Time:     Mon, 24 Aug 2020 17:44:08 +0900
Labels:         pipelines.kubeflow.org/cache_enabled=true
                pipelines.kubeflow.org/cache_id=
                pipelines.kubeflow.org/metadata_context_id=1
                pipelines.kubeflow.org/metadata_execution_id=3
                workflows.argoproj.io/completed=false
                workflows.argoproj.io/workflow=file-passing-pipelines-cclzh
Annotations:    pipelines.kubeflow.org/component_ref: {}
                pipelines.kubeflow.org/component_spec:
                  {"implementation": {"container": {"args": [{"if": {"cond": {"isPresent": "start"}, "then": ["--start", {"inputValue": "start"}]}}, {"if": ...
                pipelines.kubeflow.org/execution_cache_key: f6594b8f0728df187ec4f26083654d7b147e9e512c2a0bbeb11138846e028a60
                pipelines.kubeflow.org/metadata_input_artifact_ids: []
                sidecar.istio.io/inject: false
                workflows.argoproj.io/node-name: file-passing-pipelines-cclzh.write-numbers
                workflows.argoproj.io/template:
                  {"name":"write-numbers","arguments":{},"inputs":{},"outputs":{"artifacts":[{"name":"write-numbers-numbers","path":"/tmp/outputs/numbers/da...
Status:         Pending
IP:
IPs:            <none>
Controlled By:  Workflow/file-passing-pipelines-cclzh
Containers:
  wait:
    Container ID:
    Image:         gcr.io/ml-pipeline/argoexec:v2.7.5-license-compliance
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      argoexec
      wait
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_POD_NAME:  file-passing-pipelines-cclzh-2358551148 (v1:metadata.name)
    Mounts:
      /argo/podmetadata from podmetadata (rw)
      /argo/secret/mlpipeline-minio-artifact from mlpipeline-minio-artifact (ro)
      /var/run/docker.sock from docker-sock (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from pipeline-runner-token-vvz7g (ro)
  main:
    Container ID:
    Image:         python:3.7
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      python3
      -u
      -c
      def _make_parent_dirs_and_return_path(file_path: str):
          import os
          os.makedirs(os.path.dirname(file_path), exist_ok=True)
          return file_path

      def write_numbers(numbers_path, start = 0, count = 10):
          with open(numbers_path, 'w') as writer:
              for i in range(start, count):
                  writer.write(str(i) + '\n')

      import argparse
      _parser = argparse.ArgumentParser(prog='Write numbers', description='')
      _parser.add_argument("--start", dest="start", type=int, required=False, default=argparse.SUPPRESS)
      _parser.add_argument("--count", dest="count", type=int, required=False, default=argparse.SUPPRESS)
      _parser.add_argument("--numbers", dest="numbers_path", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)
      _parsed_args = vars(_parser.parse_args())

      _outputs = write_numbers(**_parsed_args)

    Args:
      --count
      100000
      --numbers
      /tmp/outputs/numbers/data
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from pipeline-runner-token-vvz7g (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  podmetadata:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.annotations -> annotations
  docker-sock:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/docker.sock
    HostPathType:  Socket
  mlpipeline-minio-artifact:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  mlpipeline-minio-artifact
    Optional:    false
  pipeline-runner-token-vvz7g:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  pipeline-runner-token-vvz7g
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age                   From                  Message
  ----     ------       ----                  ----                  -------
  Normal   Scheduled    53m                   default-scheduler     Successfully assigned kubeflow/file-passing-pipelines-cclzh-2358551148 to kind-worker
  Warning  FailedMount  47m                   kubelet, kind-worker  Unable to attach or mount volumes: unmounted volumes=[docker-sock], unattached volumes=[mlpipeline-minio-artifact pipeline-runner-token-vvz7g podmetadata docker-sock]: timed out waiting for the condition
  Warning  FailedMount  36m (x2 over 49m)     kubelet, kind-worker  Unable to attach or mount volumes: unmounted volumes=[docker-sock], unattached volumes=[pipeline-runner-token-vvz7g podmetadata docker-sock mlpipeline-minio-artifact]: timed out waiting for the condition
  Warning  FailedMount  32m (x2 over 45m)     kubelet, kind-worker  Unable to attach or mount volumes: unmounted volumes=[docker-sock], unattached volumes=[docker-sock mlpipeline-minio-artifact pipeline-runner-token-vvz7g podmetadata]: timed out waiting for the condition
  Warning  FailedMount  8m7s (x11 over 51m)   kubelet, kind-worker  Unable to attach or mount volumes: unmounted volumes=[docker-sock], unattached volumes=[podmetadata docker-sock mlpipeline-minio-artifact pipeline-runner-token-vvz7g]: timed out waiting for the condition
  Warning  FailedMount  2m24s (x33 over 53m)  kubelet, kind-worker  MountVolume.SetUp failed for volume "docker-sock" : hostPath type check failed: /var/run/docker.sock is not a socket file
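
The last event reports that the hostPath type check failed because /var/run/docker.sock on the node is not a socket file. The following Python sketch shows an equivalent check using only the standard library. It is not the kubelet's implementation, and `is_socket` is a hypothetical helper; it would have to run on the node itself (or somewhere the node's filesystem is visible) to be meaningful.

```python
import os
import stat


def is_socket(path: str) -> bool:
    """Return True only if `path` exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing path, permission error, and similar cases.
        return False


# On a node without a running Docker daemon this is expected to be False.
print("docker.sock is a socket:", is_socket("/var/run/docker.sock"))
```

A volume declared with HostPathType "Socket", like docker-sock above, can only be mounted when a check of this kind succeeds on the node, which is why the pod stays in ContainerCreating.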

I was able to get KFP working on kind, thanks to the comments here: https://github.com/kubeflow/pipelines/issues/4256

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

I think I've run into this issue as well with Kubeflow 1.2 on Kubernetes 1.20 using containerd. Considering the deprecation of the dockershim that was announced, I think it might be a good idea to switch the on-prem kdef to use pns for the containerRuntimeExecutor.
https://github.com/kubeflow/pipelines/issues/1654#issuecomment-747183561
