Nextflow: Exposing Kubernetes Secrets to 'worker' pods.

Created on 10 Apr 2018 · 34 Comments · Source: nextflow-io/nextflow

Is there any way to expose kubernetes secrets to a pod running a NF pipeline?

One step of our pipeline pulls files from our internal file system which requires a password. At the moment we keep a file with secrets in the volume attached to k8s via persistent volume claim. This is not particularly secure, nor flexible.

It would be nice to be able to instruct NF to expose an API key or user credential to a pod, either by using secrets that already exist within k8s or through some argument passed to NF when the pipeline is run with Kubernetes.

kind/feature platform/k8s pr/high

theobarberbany

👍2

All 34 comments

Currently it's not possible, but this could be a nice improvement. Basically you want a way for a NF process to mount one or more K8s secrets in the task pod.

I think the relevant information to be specified would be the secret and the mount path. Is that right?

pditommaso on 11 Apr 2018

👍2

Hi Paolo,

After setting up a k8s cluster, we looked in a NF pod and saw this:

ubuntu@k8s-master:~/kubespray$ ./nextflow kuberun login -v testpvc:/mnt/gluster/
Pod started: grave-edison
grave-edison:/mnt/gluster/ubuntu# df -h
Filesystem          Size  Used Avail Use% Mounted on
none                125G  5.9G  120G   5% /
tmpfs                35G     0   35G   0% /dev
tmpfs                35G     0   35G   0% /sys/fs/cgroup
10.0.0.16:/gluster  500G   45M  500G   1% /mnt/gluster
/dev/vda1           125G  5.9G  120G   5% /etc/hosts
shm                  64M     0   64M   0% /dev/shm
tmpfs                35G   12K   35G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs                35G     0   35G   0% /sys/firmware

Here you can see that there is a mounted volume /run/secrets/kubernetes.io/serviceaccount, and we thought that this could be used inside any NF pod.

Hope this helps, but maybe @theobarberbany can provide more details?

wikiselev on 12 Apr 2018

That path is used by K8s to expose cluster certificate and other auth related information in a pod, but it's not meant to be used for user secrets AFAIK.

My understanding is that user secrets can be consumed from a pod either as a file or as an env var; therefore what you need is a way to specify how secrets should be mapped in NF-created pods.

pditommaso on 12 Apr 2018

👍2

Have a look at the K8s docs.

pditommaso on 12 Apr 2018

@pditommaso Yes, that's exactly what I was thinking about :) Secret and mount path, or env var will do although I have a preference for a mounted file as I think it's a little nicer.

theobarberbany on 12 Apr 2018

👍1

Ok, yes, I was wrong, what you said @pditommaso is exactly what we want, thanks to @theobarberbany !

wikiselev on 12 Apr 2018

And probably you have already done a similar thing for the Amazon integration? I suppose on the batch clusters the user's environment is exported to all of the jobs and there is no problem with secrets, however, on the cloud it does not work.

wikiselev on 12 Apr 2018

Nope, AWS Batch does not provide a specific API for secrets.

pditommaso on 23 Apr 2018

I think something like mapping a secrets.yaml file that took secret names (e.g. default-token-tz6ks) should do for testing?

(I know my default-token-tz6ks secret would ultimately be useless for any nextflow pipelines, but I don't have any others in my namespace right now!)

$ kubectl describe secret default-token-tz6ks
Name:         default-token-tz6ks
Namespace:    default
Labels:       <none>
Annotations:  kubernetes.io/service-account.name=default
              kubernetes.io/service-account.uid=cfef93a3-42f8-11e8-8b4d-fa163e4c8cae

Type:  kubernetes.io/service-account-token

Data
====
namespace:  7 bytes
token:      XXXX
ca.crt:     1090 bytes

theobarberbany on 23 Apr 2018

I was thinking of adding the ability to specify the secrets in the NF config file. For example:

k8s {
  secrets {
    'my-secret' { mount = '/mount/path' }
    'another-secret' { env = 'FOO' }
  }
}

NF would then take care of defining the pod spec accordingly. An important question is whether the secrets should be defined for all pods in the same manner, or whether it should be possible to configure them at task level.

pditommaso on 23 Apr 2018

This is nice, I can see why one might want to do it that way, but what about secrets already centrally managed by kubernetes? Would this define them there too?

https://godoc.org/k8s.io/api/core/v1#EnvFromSource.SecretRef

https://godoc.org/k8s.io/api/core/v1#VolumeSource.Secret

theobarberbany on 23 Apr 2018

My understanding is that in any case secrets can be consumed by exposing them as files or env vars (except for imagePulls), is that right?

If so, that would make it possible to expose any secret in a container just by knowing its name. Does that make sense?

pditommaso on 23 Apr 2018

👍1

Sorry, I appear to have misread / misunderstood what you wrote above (brain isn't working today.. mondays!).

I'd initially interpreted it as defining secrets as plain text secret:secretData key value pairs in the nextflow config file! (Ultimately side stepping any kubernetes secrets tooling provided by the cluster)

What I now take from this is you expose them as secretName:{mountPath|envVar} where you are defining where to put the value of the secret in the pod at runtime, which is exactly what I was on about above 😆

theobarberbany on 23 Apr 2018

No, no, secrets must be defined either by the user or by the system; NF would just expose them in the pod.

pditommaso on 23 Apr 2018

👍1

Yes, sounds good!

theobarberbany on 23 Apr 2018

Could you provide an example of the definition of a secret you need to consume? (I want to make sure the proposed solution covers your use case.)

pditommaso on 9 May 2018

The secret we need right now is a password (fairly short ASCII); I have it as a Kubernetes secret. We could use it as an environment variable most easily at the moment. We then need to feed it to a shell script that authenticates to an object store; I know how to do this once I have an environment variable.

HelenCousins on 9 May 2018

I've uploaded a snapshot that includes the ability to handle secret files. This can be done using the pod directive, either in the main script file, for example:

process someName {
  pod secret: 'foo/bar', mountPath: '/some/path'
  '''
  your_command --this --that /some/path
  '''
}

or in the nextflow.config file as shown below:

process {
  pod = [secret: 'foo/bar', mountPath: '/some/path']
}

The notation foo/bar specifies the secret foo and the key bar. The key part can be omitted.
Note that when the key is specified, the mountPath represents the target secret file path. In the above example, it will create a file named path in the /some directory.
When the key is omitted, all secret files are mounted in the specified path. In the above example, this means that all secret files would be available in the directory /some/path.
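
The rule above can be paraphrased in a few lines of shell. This only illustrates the described behaviour of the secret/key notation; it is not Nextflow's implementation, and describe_secret_mount is a hypothetical helper name.

```shell
# describe_secret_mount REF MOUNT_PATH
# REF is "secret" or "secret/key", as in the pod directive examples.
describe_secret_mount() {
  ref=$1
  mount_path=$2
  case "$ref" in
    */*)
      secret=${ref%%/*}   # part before the first slash
      key=${ref#*/}       # part after the first slash
      # With a key, the mount path is the path of the single secret file.
      echo "key '$key' of secret '$secret' -> file $mount_path"
      ;;
    *)
      # Without a key, every key of the secret becomes a file in the directory.
      echo "all keys of secret '$ref' -> files under directory $mount_path/"
      ;;
  esac
}

describe_secret_mount foo/bar /some/path
describe_secret_mount foo /some/path
```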

You can test it using the following command:

NXF_VER=0.30.0-SNAPSHOT nextflow kuberun .. etc

pditommaso on 10 May 2018

🎉1

I plan to test it tomorrow with @HelenCousins.

wikiselev on 10 May 2018

We have victory! As an additional request, we would like to run the same pipeline on LSF and K8s, which require different setup scripts, so we need a way of detecting whether k8s is in use or not.

HelenCousins on 11 May 2018

Yes, Paolo, many thanks, all worked amazingly well! As Helen mentioned above, because we use kuberun there is currently no need to define a K8s profile in the config, so the pipeline does not know whether it is running on K8s; e.g. workflow.profile can still show standard (for Phil's RNAseq pipeline). What we want is to run a conditional script in beforeScript when we are on a K8s cluster.
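
One possible check from inside the task's shell, offered only as a sketch: Kubernetes sets the KUBERNETES_SERVICE_HOST environment variable in the containers it starts, so its presence is a reasonable hint that the script is running in a pod. The function name running_in_k8s is hypothetical and the echo lines stand in for the real setup.

```shell
# Returns 0 when the current shell appears to be running inside a Kubernetes pod.
running_in_k8s() {
  [ -n "${KUBERNETES_SERVICE_HOST:-}" ]
}

if running_in_k8s; then
  echo "on Kubernetes: run the K8s-specific setup here"
else
  echo "not on Kubernetes: skip the K8s-specific setup"
fi
```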

wikiselev on 11 May 2018

Having pipeline code depend on the infrastructure sounds like a bad pattern. The whole effort of NF is to keep the pipeline platform independent.

Could the setup not be managed externally?

pditommaso on 11 May 2018

Hi Paolo, well, at the moment we add this to the process definition, e.g.:

    def output_d = new File( "/root/.irods/.irodsA" )
    if( !output_d.exists() ) {
        beforeScript "sh create_irods_file_using_credentials.sh"
    }

The thing here is that on the LSF farm /root/.irods/.irodsA file is created once and forever for every user and therefore iRods is always happy (this file is required for authentication). On K8s cluster we need to create it using create_irods_file_using_credentials.sh in every pod, therefore checking the file existence is probably the easiest option. Do you think it's a good workaround? Or can you recommend anything else?

wikiselev on 11 May 2018

I would do something like:

beforeScript "[[ ! -f /root/.irods/.irodsA ]] && sh create_irods_file_using_credentials.sh"

But in any case I would put that in the config file to decouple the configuration from the pipeline, e.g.

process.beforeScript = "[[ ! -f /root/.irods/.irodsA ]] && sh create_irods_file_using_credentials.sh"

Even better, use different config profiles.
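
For reference, a small sketch of the file-existence guard in plain shell. Inside [ ... ] and [[ ... ]] a bare path is treated as a non-empty string, so the -f operator is needed to ask whether the file exists. The POSIX [ ... ] form is used here so it also runs under plain sh; the temporary directory and the placeholder setup step are assumptions for illustration only.

```shell
# Use a throwaway directory so the sketch does not touch real credentials.
workdir=$(mktemp -d)
cred_file="$workdir/.irodsA"   # stands in for /root/.irods/.irodsA

# Run the (placeholder) setup step only when the credentials file is missing.
ensure_credentials() {
  if [ ! -f "$cred_file" ]; then
    echo "creating $cred_file"
    : > "$cred_file"            # placeholder for the real setup script
  else
    echo "$cred_file already present, nothing to do"
  fi
}

ensure_credentials   # first call creates the file
ensure_credentials   # second call finds it and does nothing
```

The if form also finishes with a zero exit status when the file already exists, whereas a "test && command" one-liner returns non-zero in that case.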

pditommaso on 11 May 2018

👍1

Great, many thanks Paolo!

I still need to master the profiles, but I think this is what I'm going to do based on your previous comment. In nextflow.config:

profiles {
  standard {
    includeConfig 'conf/base.config'
  }
  k8s {
    includeConfig 'conf/k8s.config'
  }
  lsf {
    includeConfig 'conf/lsf.config'
  }
}

and then in conf/k8s.config:

process {
  $irods {
    pod = [secret: 'irods-secret', mountPath: '/secret']
    beforeScript = "sh create_irods_file_using_credentials.sh"
  }
}

And then invoke NF like this:

nextflow kuberun cellgeni/rnaseq -v testpvc:/home -profile k8s

And in this case we don't even need to test for the presence of the iRods file ([[ ! -f /root/.irods/.irodsA ]]).

wikiselev on 11 May 2018

Yes, tho I think I have found a problem with profiles when using kuberun.

pditommaso on 11 May 2018

Well I get a lot of

Caused by:
Process get terminated with an error exit status (127)

Command executed:

env > /home/ubuntu/ils

Command exit status:
127

That is what I get when I try it with a script in bin that exists and is executable. If I put the script in the script section, it works just fine.

HelenCousins on 11 May 2018

Maybe beforeScript does not see the bin folder? But this issue is probably unrelated to this thread.

wikiselev on 11 May 2018

Weird, I was sure I had replied to this. Anyhow, no, beforeScript does not automatically see the scripts in the bin folder. You can still use it with:

process.beforeScript = "sh $baseDir/bin/create_irods_file_using_credentials.sh"
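
Exit status 127 in the error reported earlier is the shell's status for a command that cannot be found, which fits a script living in a directory that is not on the PATH. A minimal sketch with a throwaway script (the name hello_helper.sh is made up for the example):

```shell
# Create a throwaway directory holding a hypothetical helper script.
tmpdir=$(mktemp -d)
cat > "$tmpdir/hello_helper.sh" <<'EOF'
#!/bin/sh
echo "helper ran"
EOF
chmod +x "$tmpdir/hello_helper.sh"

# Bare name: the directory is not on PATH, so the shell cannot find the script.
bare_status=0
sh -c 'hello_helper.sh' 2>/dev/null || bare_status=$?

# Full path: the same script runs fine.
full_status=0
full_output=$(sh "$tmpdir/hello_helper.sh") || full_status=$?

echo "bare name exit status: $bare_status"
echo "full path exit status: $full_status ($full_output)"
rm -rf "$tmpdir"
```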

pditommaso on 14 May 2018

wheeee, that works brilliantly, and it goes in the config, so it is a k8s specific beforeScript. And now we can use our object store. Thank you.

HelenCousins on 14 May 2018

🎉1 👍1

process.beforeScript = "sh $baseDir/bin/create_irods_file_using_credentials.sh"

I'm sure this was working last week, but it is not working now; indeed, at this point baseDir is set to : and I can't see how it ever worked.

(Also, local changes to nextflow.config don't work: it pulls again every time, slowing development by a round of git add/commit/push, which took me ages to spot. Also, the errors when beforeScript is bad are astonishingly unhelpful: nextflow appears to say that the script block is wrong when it isn't.)

HelenCousins on 16 May 2018

(Also, local changes to nextflow.config don't work: it pulls again every time, slowing development by a round of git add/commit/push, which took me ages to spot. Also, the errors when beforeScript is bad are astonishingly unhelpful: nextflow appears to say that the script block is wrong when it isn't.)

Please, let's try to keep the thread consistent. For a different topic, open a separate issue, otherwise it will be lost.

pditommaso on 21 May 2018

The pod process directive now allows the definition of:

environment variables
config maps
secrets
volume claims

The same settings can be defined in the k8s configuration scope.

Finally the k8s.volumeClaims setting has been deprecated and replaced by k8s.storageClaimName and k8s.storageMountPath.

For more details check the documentation.

You can test it using version 0.30.0-BETA2, as usual:

NXF_VER=0.30.0-BETA2 nextflow kuberun .. etc

pditommaso on 29 May 2018

😄1

This is really good! Should make Nextflow's k8s capabilities much more powerful!

theobarberbany on 29 May 2018
