Nextflow: Improve the support for Kubernetes cluster

Created on 1 Oct 2017 · 21 Comments · Source: nextflow-io/nextflow

Currently NF interacts with a Kubernetes cluster by using the kubectl command. This prevents deploying NF itself as a Pod.

This can be solved by allowing NF to access the Kubernetes API server via an HTTPS connection. Interesting facts:

  • The k8s API server can be accessed by using the kubernetes DNS name automatically defined in each Pod;
  • The HTTPS authentication token can be retrieved from the file /var/run/secrets/kubernetes.io/serviceaccount/token;
  • The HTTPS certificate can be found at the path /var/run/secrets/kubernetes.io/serviceaccount/ca.crt;
  • The default namespace is stored in the file /var/run/secrets/kubernetes.io/serviceaccount/namespace in each container.

The following variables are also defined in the container environment:

KUBERNETES_PORT=tcp://10.0.0.1:443
KUBERNETES_PORT_443_TCP=tcp://10.0.0.1:443
KUBERNETES_PORT_443_TCP_ADDR=10.0.0.1
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_HOST=10.0.0.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
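
The in-pod discovery described above can be sketched as follows. This is a minimal Python sketch and not Nextflow's implementation: the function names `api_server_url`, `load_service_account`, `list_pods_request` and `send` are hypothetical, and the `env` and `sa_dir` parameters exist only so the code can be exercised outside a pod.

```python
import os
import ssl
import urllib.request
from pathlib import Path

# Standard location of the service account files mounted in each container
SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")


def api_server_url(env=os.environ):
    """Build the API server base URL from the variables injected into the pod."""
    host = env["KUBERNETES_SERVICE_HOST"]
    port = env.get("KUBERNETES_SERVICE_PORT_HTTPS", "443")
    return f"https://{host}:{port}"


def load_service_account(sa_dir=SA_DIR):
    """Read the bearer token and default namespace; return them with the CA path."""
    sa_dir = Path(sa_dir)
    token = (sa_dir / "token").read_text().strip()
    namespace = (sa_dir / "namespace").read_text().strip()
    return token, namespace, sa_dir / "ca.crt"


def list_pods_request(env=os.environ, sa_dir=SA_DIR):
    """Prepare (but do not send) an authenticated request listing the namespace's pods."""
    token, namespace, ca_path = load_service_account(sa_dir)
    url = f"{api_server_url(env)}/api/v1/namespaces/{namespace}/pods"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    return request, ca_path


def send(request, ca_path):
    """Send the request, verifying the server against the cluster CA certificate."""
    context = ssl.create_default_context(cafile=str(ca_path))
    with urllib.request.urlopen(request, context=context) as response:
        return response.read()
```

Inside a pod, `send(*list_pods_request())` would return the JSON pod list, provided the pod's service account is allowed to list pods in its namespace.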

More details here.

help wanted


All 21 comments

Just had a detailed discussion with @skptic on this. My suggestion is that the following are highly desirable:

  1. avoid the need of the kubectl client (as described above). Either by using the REST API or maybe by using this Java client which has proved very useful to us
  2. provide support for OpenShift as well as plain K8S (in principle this should be simple with either approach)
  3. allow a Pod template to be specified for the Pod definition, so that the full range of pod criteria can be set (currently you can only specify the image name, the cpu and the memory). This should allow:
  4. better control of how the pods are to be scheduled (e.g. prevent the cluster being overloaded with too many pods)
  5. provide support for the scenario where the nextflow workflow is orchestrated from a pod that is running in the cluster

Happy to help with planning and testing for this.

Hi Tim,

  1. I already have a Java client for this. I could commit it, but I would need some help to test/debug it.
  2. Does OpenShift implement its own API or extend the K8S one? Anyhow, I think this should be managed as a separate enhancement.
  3. This needs to be discussed in a separate issue.
  4. It's already possible to limit the max number of submitted tasks with a config option. There is also a separate request for that: #198
  5. That should become possible with point 1 and #446

Hi Paolo, I've implemented code in Squonk to launch Pods/Jobs from within the OpenShift environment using the fabric8.kubernetes.client. I can share my experiences if that would be of interest.

I've also done a little stress-testing by launching hundreds of concurrent competing Pods where I control the memory they allocate and how _busy_ they are by burning up CPU cycles using a small Python-based Docker image.

  1. OK. This uses the REST API?

Yes, I will commit it on a separate branch to work on that.

  2. Extends the K8S API. In our case we're just scheduling a Pod so it should be an identical process between OS and K8S.

Exactly

  3. OK. I created #531

OK

  4. Yes, but it's the K8S scheduler that knows how best to schedule, not NF

I see. It could be tricky. We need to see in practice how to manage this.

@alanbchristie that could be useful. Do you have some free cycles to contribute to and test the NF-K8S integration?

I've just committed a basic k8s client. See the KubeClient. The next task is to use it in place of the kubectl command in the KubernetesExecutor.

@tdudgeon Let's chat tomorrow morning about NF-K8S testing/integration and see what needs to be done, when and what environment's needed.

Hi guys, I will be very happy to test everything related to k8s integration for Nextflow. Please let me know what's needed.

I have also just created this issue #549, do you have any suggestions on it?

Many thanks!
Vlad

Hi Vlad, any contribution to the Kubernetes support is more than welcome, but maybe Friday night is not the best time to discuss it :) We can catch up early next week if you agree.

Sure Paolo, thanks for your reply anyway! ;-) I will be in touch next week.

@wikiselev the goal of this issue is to enhance the Kubernetes executor so that it uses the Kubernetes API instead of relying on the kubectl command. This will allow much greater flexibility in Kubernetes deployments with NF.

In practical terms I've already implemented a client skeleton. What is missing is to replace the use of kubectl with this client and run the proper tests.

It required some head scratching, but finally there's shiny new built-in support for Kubernetes that greatly improves on the previous implementation and provides a much more flexible integration with Nextflow.

What it includes

  • Kubernetes API client
  • Kubernetes native executor
  • Support for K8s persistent volume claims
  • Workflow deployment integrated in the standard run command; it only requires an extra command line option.

Caveats

  • the workflow to be executed must be published in a GitHub (or equivalent service) repository.
  • it requires shared storage supported by K8s and exposed as a persistent volume claim.

How does it work

1) Declare one or more persistent volume claims in your nextflow.config file by adding a snippet similar to the following one:

    k8s {
       volumeClaims {
          'vol-pvc' { mountPath = '/nextflow' }
       }
    }

Replace vol-pvc with the name of your persistent volume claim. See the K8s documentation for details on how to configure persistent volume claims in your cluster.

2) Launch a workflow execution (using the latest snapshot) with the -with-k8s command line option, e.g.:

    NXF_VER=0.28.0-SNAPSHOT nextflow run rnaseq-nf -with-k8s

Nextflow creates and submits a pod in the K8s cluster that acts as the workflow driver application. This pod then creates and schedules a worker pod for each task executed by your pipeline.
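
To make the driver/worker relationship more concrete, the sketch below builds the kind of Pod manifest a driver could submit for a single task, mounting the claim declared in step 1. It is illustrative only: `worker_pod_manifest`, the pod name, the image and the command are hypothetical, and the manifests Nextflow actually generates may well differ.

```python
import json


def worker_pod_manifest(name, image, command, claim_name, mount_path):
    """Return a Pod manifest (as a dict) that runs one task on the shared volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",  # a task pod runs once and then stops
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "command": command,
                    # Mount the shared storage inside the task container
                    "volumeMounts": [{"name": "shared", "mountPath": mount_path}],
                }
            ],
            # Bind the volume to the persistent volume claim from nextflow.config
            "volumes": [
                {"name": "shared", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }


manifest = worker_pod_manifest(
    name="nf-task-1",                      # hypothetical pod name
    image="example/rnaseq:latest",         # hypothetical image
    command=["bash", "-c", "echo hello"],  # hypothetical task command
    claim_name="vol-pvc",
    mount_path="/nextflow",
)
print(json.dumps(manifest, indent=2))
```

Because every worker pod mounts the same claim at the same path, the driver and the tasks can exchange files through the shared volume, which is why the shared storage caveat above matters.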

3) Once the workflow execution is submitted, the foreground nextflow instance waits for the application to start and then prints the workflow output log, making it easier for the user to follow the execution.

What to do next

There are still some details to polish, such as:

  • delete pods once the workflow completes
  • better define the user/application directory structure
  • node affinity/selection

However, before continuing the implementation I need feedback from the people interested in this feature: comments, problems and proposed improvements.

Hi Paolo, many thanks for the update! This week and maybe next week are a bit crazy for me, but I will get to this asap and will provide you with the feedback. Thanks again for all your cool work. Vlad.

Hi Paolo, do you have an example of a working persistent volume claim? I tried the one from the official Kubernetes page:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 8Gi
  storageClassName: slow
  selector:
    matchLabels:
      release: "stable"
    matchExpressions:
      - {key: environment, operator: In, values: [dev]}

However, it looks like storageClassName: slow is not recognised by my system... At the moment I'm testing on a local Kubernetes cluster (the one from Docker you mentioned before, thanks btw!), but I will also test on OpenStack pretty soon.

Hi Vlad, for local testing I've used a PV/PVC definition like the one below:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: vol-local
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  storageClassName: shared  
  hostPath:
    path: /Users/pditommaso/Sites 
---    
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: vol-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: shared  
  resources:
    requests:
      storage: 1Gi

Thanks Paolo, the claim worked for me!

What I've done:

  1. Kubernetes cluster is running with the persistent volume defined.
  2. Cloned the rnaseq-nf pipeline
  3. Added this to the nextflow.config:
      k8s {
          volumeClaims {
             'vol-pvc' { mountPath = '/Users/vk6/k8s-pers-vol' }
          }
      }
  4. I cd'ed to the rnaseq-nf folder and ran this:
      NXF_VER=0.28.0-SNAPSHOT nextflow run main.nf -with-k8s

Nextflow complains:

Not a valid project name: main.nf

The .nextflow.log:

Feb-19 11:33:33.966 [main] DEBUG nextflow.cli.Launcher - $> /Users/vk6/bin/nextflow run main.nf -with-k8s
Feb-19 11:33:34.139 [main] DEBUG nextflow.cli.Launcher - Operation aborted
nextflow.exception.AbortOperationException: Not a valid project name: main.nf
    at nextflow.scm.AssetManager.resolveName(AssetManager.groovy:246)
    at nextflow.scm.AssetManager.build(AssetManager.groovy:128)
    at nextflow.scm.AssetManager.<init>(AssetManager.groovy:112)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:488)
    at org.codehaus.groovy.reflection.CachedConstructor.invoke(CachedConstructor.java:83)
    at org.codehaus.groovy.runtime.callsite.ConstructorSite$ConstructorSiteNoUnwrapNoCoerce.callConstructor(ConstructorSite.java:105)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallConstructor(CallSiteArray.java:60)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:235)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:255)
    at nextflow.k8s.K8sDriverLauncher.makeConfig(K8sDriverLauncher.groovy:177)
    at nextflow.k8s.K8sDriverLauncher.run(K8sDriverLauncher.groovy:116)
    at nextflow.cli.CmdRun.run(CmdRun.groovy:201)
    at nextflow.cli.Launcher.run(Launcher.groovy:427)
    at nextflow.cli.Launcher.main(Launcher.groovy:581)

Is there anything I am missing?

Don't forget that when using K8s the pipeline runs inside the Kubernetes cluster, therefore it cannot run a local script. Currently it needs to pull a pipeline project from GitHub.

If you use this command it should work:

NXF_VER=0.28.0-SNAPSHOT nextflow run rnaseq-nf -with-k8s

Thanks Paolo, ok, I've forked the rnaseq-nf repo, added the k8s configuration and pushed it to my group's GitHub (cellgeni). Then I ran:

NXF_VER=0.28.0-SNAPSHOT nextflow run cellgeni/rnaseq-nf -with-k8s

and everything worked like a charm! Really cool stuff. Thanks a lot for working on this!

Is there anything else I can test locally while we are still working on k8s implementation on OpenStack?

Well, the main point is to understand how it works in a real production scenario and to smooth the deployment based on your feedback.

Ok, will be back soon.

Nice. I'm going to upload an updated snapshot tomorrow and some K8s documentation during the week.

Just committed some improvements here. Now there's a specific command to launch the execution in a k8s cluster. For example:

nextflow kuberun <pipeline-name> -v vol-claim:/mount/path

The <pipeline-name> argument can be a project hosted in a Git repository or the absolute path, in the K8s cluster, of an already deployed project.

The -v command line option allows the specification of the volume claim and the mount path.
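
As an illustration of the `claim:/mount/path` format, the following Python sketch splits such a value into its two parts. The thread does not describe how kuberun parses or validates the option, so the function `parse_volume_option` and its validation rules are assumptions, not Nextflow's behaviour.

```python
def parse_volume_option(value):
    """Split 'claim-name:/mount/path' into (claim_name, mount_path)."""
    claim_name, sep, mount_path = value.partition(":")
    # Assumed rules: both parts are required and the mount path is absolute
    if not sep or not claim_name or not mount_path.startswith("/"):
        raise ValueError(f"Expected <claim-name>:</absolute/path>, got: {value!r}")
    return claim_name, mount_path


claim, path = parse_volume_option("vol-claim:/mount/path")
print(claim, path)
```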

BONUS: specifying login as the pipeline name launches an interactive Bash session in the preconfigured K8s pod.

More details here.
