Nextflow: Exposing Kubernetes Secrets to 'worker' pods.

Created on 10 Apr 2018 · 34 Comments · Source: nextflow-io/nextflow

Is there any way to expose kubernetes secrets to a pod running a NF pipeline?

One step of our pipeline pulls files from our internal file system which requires a password. At the moment we keep a file with secrets in the volume attached to k8s via persistent volume claim. This is not particularly secure, nor flexible.

It would be nice to be able to instruct NF to expose an API key or user credential to a pod, either by using secrets that already exist within k8s or through some argument passed to NF when the pipeline is run with Kubernetes.

kind/feature platform/k8s pr/high

All 34 comments

Currently it's not possible. But this can be a nice improvement. Basically you want a way for a NF process to mount one or more K8s secrets in the task pod.

I think the relevant information to be specified would be the secret and the mount path. Is that right?

Hi Paolo,

After setting up a k8s cluster, we looked in a NF pod and saw this:

ubuntu@k8s-master:~/kubespray$ ./nextflow kuberun login -v testpvc:/mnt/gluster/
Pod started: grave-edison
grave-edison:/mnt/gluster/ubuntu# df -h
Filesystem          Size  Used Avail Use% Mounted on
none                125G  5.9G  120G   5% /
tmpfs                35G     0   35G   0% /dev
tmpfs                35G     0   35G   0% /sys/fs/cgroup
10.0.0.16:/gluster  500G   45M  500G   1% /mnt/gluster
/dev/vda1           125G  5.9G  120G   5% /etc/hosts
shm                  64M     0   64M   0% /dev/shm
tmpfs                35G   12K   35G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs                35G     0   35G   0% /sys/firmware

Where you can see that there is a mounted volume /run/secrets/kubernetes.io/serviceaccount and we thought that this can be used inside any NF pod.

Hope this helps, but maybe @theobarberbany can provide more details?

That path is used by K8s to expose cluster certificate and other auth related information in a pod, but it's not meant to be used for user secrets AFAIK.

My understanding is that user secrets can be consumed from a pod either as a file or as an env var, therefore what you need is a way to specify how secrets need to be mapped in NF-created pods.

Have a look at the K8s docs.

@pditommaso Yes, that's exactly what I was thinking about :) Secret and mount path, or env var will do although I have a preference for a mounted file as I think it's a little nicer.

Ok, yes, I was wrong, what you said @pditommaso is exactly what we want, thanks to @theobarberbany !

And probably you have already done a similar thing for the Amazon integration? I suppose on the batch clusters the user's environment is exported to all of the jobs and there is no problem with secrets, however, on the cloud it does not work.

Nope, the AWS Batch does not provide a specific api for secrets.

I think something like mapping a secrets.yaml file that took secret names (e.g default-token-tz6ks) should do for testing?

(I know my default-token-tz6ks secret would ultimately be useless for any Nextflow pipelines, but I don't have any others in my namespace right now!)

$ kubectl describe secret default-token-tz6ks
Name:         default-token-tz6ks
Namespace:    default
Labels:       <none>
Annotations:  kubernetes.io/service-account.name=default
              kubernetes.io/service-account.uid=cfef93a3-42f8-11e8-8b4d-fa163e4c8cae

Type:  kubernetes.io/service-account-token

Data
====
namespace:  7 bytes
token:      XXXX
ca.crt:     1090 bytes

I was thinking to add the ability to specify the secrets in the NF config file. For example:

k8s {
  secrets {
    'my-secret' { mount = '/mount/path' }
    'another-secret' { env = 'FOO' }
  }
} 

Then NF will take care of defining the pod spec accordingly. An important question is whether the secrets should be defined for all pods in the same manner, or whether it should be possible to configure them at task level.

This is nice, I can see why one might want to do it that way, but what about secrets already centrally managed by kubernetes? Would this define them there too?

https://godoc.org/k8s.io/api/core/v1#EnvFromSource.SecretRef

https://godoc.org/k8s.io/api/core/v1#VolumeSource.Secret

My understanding is that in any case secrets can be consumed by exposing them as files or env vars (except for imagePulls), is that right?

If so, that would allow exposing any secret in a container just by knowing its name. Does that make sense?

Sorry, I appear to have misread / misunderstood what you wrote above (brain isn't working today.. mondays!).

I'd initially interpreted it as defining secrets as plain text secret:secretData key value pairs in the nextflow config file! (Ultimately side stepping any kubernetes secrets tooling provided by the cluster)

What I now take from this is you expose them as secretName:{mountPath|envVar} where you are defining where to put the value of the secret in the pod at runtime, which is exactly what I was on about above 😆

No, no, secrets must be defined either by the user or the system; NF would just expose them in the pod.

Yes, sounds good!

Could you provide an example of the definition of a secret you need to consume (want to make sure to cover your use case with the proposed solution).

The secret we need right now is a password (fairly short ASCII); I have it as a Kubernetes secret. We could use it as an environment variable most easily at the moment. We then need to feed it to a shell script that authenticates to an object store, I know how to do this once I have an environment variable.
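As a rough sketch of the consuming side, a task's shell script could accept the password from either source discussed in this thread: an environment variable or a mounted secret file. The function name `read_secret`, the variable `IRODS_PASSWORD` and the path `/secret/password` are hypothetical and only serve as illustration.

```shell
# Print a secret: prefer the value passed in (e.g. from an env var),
# otherwise read it from a mounted secret file.
# Hypothetical helper, not part of Nextflow or Kubernetes.
read_secret() {
  env_value="$1"
  file_path="$2"
  if [ -n "$env_value" ]; then
    printf '%s' "$env_value"
  elif [ -r "$file_path" ]; then
    # Drop any newline so the value can be passed on as-is.
    tr -d '\n' < "$file_path"
  else
    echo "secret not found (no env value, no readable file at $file_path)" >&2
    return 1
  fi
}

# Example usage (names are assumptions):
#   password=$(read_secret "${IRODS_PASSWORD:-}" /secret/password)
```

Passing the environment value as an argument keeps the helper in plain POSIX sh, so it should behave the same in minimal container images.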

I've uploaded a snapshot that includes the ability to handle secret files. This can be done using the pod directive, either in the main script file, for example:

process someName {
  pod secret: 'foo/bar', mountPath: '/some/path'
  '''
  your_command --this --that /some/path
  '''
}

or in the nextflow.config file as shown below:

process {
  pod = [secret: 'foo/bar', mountPath: '/some/path']
}

The notation foo/bar specifies the secret foo and the key bar. The key part can be omitted.
Note that when the key is specified, mountPath represents the target path of the secret file. In the example above, it will create a file named path in the /some directory.
When the key is omitted, all secret files are mounted in the specified path. In the example above, this means that all secret files would be available in the directory /some/path.
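To see which of the two cases applies inside a running task, a small check like the following could help. It is only a debugging sketch; the function name `describe_secret_mount` is hypothetical.

```shell
# Report how a secret appears at the given path inside a task pod:
# a single file (key given) or a directory with one file per key (key omitted).
# Hypothetical helper for debugging only.
describe_secret_mount() {
  path="$1"
  if [ -f "$path" ]; then
    echo "file: $path"
  elif [ -d "$path" ]; then
    echo "directory: $path"
    for entry in "$path"/*; do
      # Skip the unexpanded glob when the directory is empty.
      [ -e "$entry" ] && echo "  key: $(basename "$entry")"
    done
    return 0
  else
    echo "missing: $path" >&2
    return 1
  fi
}

# Example usage, with the mount path from the example above:
#   describe_secret_mount /some/path
```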

You can test it using the following command:

NXF_VER=0.30.0-SNAPSHOT nextflow kuberun .. etc 

I plan to test it tomorrow with @HelenCousins.

We have victory! As an additional request, we would like to run the same pipeline on LSF and K8s, which require different setup scripts, so we need a way of detecting whether K8s is in use or not.

Yes, Paolo, many thanks, all worked amazingly well! So, as Helen mentioned above, at the moment because of the use of kuberun there is no need to define K8s profile in config and therefore the pipeline does not know whether it is run on K8s, so e.g. workflow.profile can still show standard (for Phil's RNAseq pipeline). We need to run some conditional script in beforeScript if we are on K8s cluster, that's what we want.

Having pipeline code depend on the infrastructure sounds like a bad pattern. The whole effort of NF is to keep the pipeline platform independent.

Couldn't the setup be managed externally?

Hi Paolo, well, at the moment we add this to the process definition, e.g.:

    def output_d = new File( "/root/.irods/.irodsA" )
    if( !output_d.exists() ) {
        beforeScript "sh create_irods_file_using_credentials.sh"
    }

The thing here is that on the LSF farm /root/.irods/.irodsA file is created once and forever for every user and therefore iRods is always happy (this file is required for authentication). On K8s cluster we need to create it using create_irods_file_using_credentials.sh in every pod, therefore checking the file existence is probably the easiest option. Do you think it's a good workaround? Or can you recommend anything else?

I would do something like:

beforeScript "[[ ! -f /root/.irods/.irodsA ]] && sh create_irods_file_using_credentials.sh"

But in any case I would put that in the config file to decouple the configuration from the pipeline, e.g.

process.beforeScript = "[[ ! -f /root/.irods/.irodsA ]] && sh create_irods_file_using_credentials.sh"

Even better using different config profiles.
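The guard discussed above (run the setup script only when the iRODS auth file is missing) can also be written as a small function. This is a sketch under the assumption that the setup command takes no input; `run_unless_exists` is a hypothetical name.

```shell
# Run a setup command only when the marker file does not exist yet.
# Returns 0 when the marker is already present, so the guard itself
# never reports a failure in that case.
run_unless_exists() {
  marker="$1"
  shift
  if [ -f "$marker" ]; then
    return 0
  fi
  "$@"
}

# Example usage, with the file and script named in this thread:
#   run_unless_exists /root/.irods/.irodsA sh create_irods_file_using_credentials.sh
```

Compared with a bare `test && command` one-liner, this form exits with status 0 when the file already exists, which avoids a non-zero status leaking out of the setup step.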

Great, many thanks Paolo!

I still need to master the profiles, but I think this is what I'm going to do based on your previous comment. In nextflow.config:

profiles {
  standard {
    includeConfig 'conf/base.config'
  }
  k8s {
    includeConfig 'conf/k8s.config'
  }
  lsf {
    includeConfig 'conf/lsf.config'
  }
}

and then in conf/k8s.config:

process {
  $irods {
    pod = [secret: 'irods-secret', mountPath: '/secret']
    beforeScript = "sh create_irods_file_using_credentials.sh"
  }
}

And then invoke NF like this:

nextflow kuberun cellgeni/rnaseq -v testpvc:/home -profile k8s

And in this case we don't even need to test for the presence of the iRODS file ([[ ! -f /root/.irods/.irodsA ]]).

Yes, tho I think I have found a problem with profiles when using kuberun.

Well I get a lot of

Caused by:
Process get terminated with an error exit status (127)

Command executed:

env > /home/ubuntu/ils

Command exit status:
127

When I try that with a script that exists and is executable in bin; um, if I put the script in the script section it works just fine.

Maybe beforeScript does not see the bin folder? But this issue is probably unrelated to this thread.

Weird, I was sure I had replied to this. Anyhow no, beforeScript does not automatically see the scripts in the bin folder. You can still use them by giving the full path:

process.beforeScript = "sh $baseDir/bin/create_irods_file_using_credentials.sh"

wheeee, that works brilliantly, and it goes in the config, so it is a k8s-specific beforeScript. And now we can use our object store. Thank you.

process.beforeScript = "sh $baseDir/bin/create_irods_file_using_credentials.sh"

I'm sure this was working last week, it is not working now, indeed at this point baseDir is set to : and I can't see how it ever worked.

(also: local changes to nextflow.config don't work (it pulls again every time, slowing development by a round of git add/commit/push), which took me ages to spot; also, the errors when beforeScript is bad are astonishingly unhelpful (nextflow appears to say that the script block is wrong, when it isn't))

Please, let's try to keep the thread consistent. For a different topic, open a separate issue, otherwise it will be lost.

The pod process directive now allows the definition of:

  • env variables
  • config maps
  • secrets
  • volume claims

The same settings can be defined in the k8s configuration scope.

Finally the k8s.volumeClaims setting has been deprecated and replaced by k8s.storageClaimName and k8s.storageMountPath.

For more details check the documentation.

You can test it using version 0.30.0-BETA2, as usual:

NXF_VER=0.30.0-BETA2 nextflow kuberun .. etc

This is really good! Should make Nextflow's k8s capabilities much more powerful!
