Prefect: Allow use of KubernetesJobEnvironment + S3 storage + KubernetesAgent

Created on 19 Jun 2020 · 8 Comments · Source: PrefectHQ/prefect

Current behavior

As of #2666, it's now possible to use non-Docker storage with dockerized agents such as KubernetesAgent. This is a super exciting feature!

However, if you are using S3 storage and KubernetesAgent, it seems that it's not possible to customize the jobs created for flow runs using KubernetesJobEnvironment.

I've observed that no errors are generated today (prefect 0.12.0) when you use this combination:

S3 storage
KubernetesJobEnvironment
KubernetesAgent

The agent I've created is happily creating jobs in the cluster for flow runs, but the manifests for those jobs are all default values and ignore anything I customize in KubernetesJobEnvironment.

After digging for a bit, I found the root cause. From "Kubernetes Job Environment":

The KubernetesJobEnvironment accepts an argument job_spec_file which is a string representation of a path to a Kubernetes Job YAML file. On initialization that Job spec file is loaded and stored in the Environment. It will never be sent to Prefect Cloud and will only exist inside your Flow's Docker storage.

I can also see this in the code of KubernetesJobEnvironment.create_flow_run_job(), where the environment explicitly expects a Docker image containing the job_spec_file:

https://github.com/PrefectHQ/prefect/blob/26518173cf722c742eefcae970260edb5e385f54/src/prefect/environments/execution/k8s/job.py#L116-L149

Proposed behavior

I'd like to be able to store the job_spec_file from KubernetesJobEnvironment in S3 storage, so that the job details of a flow run using S3 storage and run by KubernetesAgent can be customized.

Example

The benefits of non-docker Storage are explained in https://docs.prefect.io/orchestration/execution/storage_options.html#non-docker-storage-for-containerized-environments. Adding this proposed behavior would allow flows using KubernetesAgent to take advantage of that storage without sacrificing the ability to customize the jobs using KubernetesJobEnvironment.

Without this proposed behavior, I think users of KubernetesAgent have to choose between non-docker storage and customizing their jobs.

enhancement

jameslamb

All 8 comments

@jcrist Is this something you should address in your environment refactor? I can see where the custom spec overlaps with the metadata image.

joshmeek on 19 Jun 2020

I've been thinking more about this... I think it could be accomplished by changing S3 storage to upload a .tar.gz instead of a single file containing the cloudpickle-ed flow.

Backwards compatibility could be preserved by changing S3.get_flow(). It could inspect the key of the object in S3 and say:

if that key ends in .tar.gz:
- download the tarball and untar it
else:
- current behavior (assume the object is a flow, read it straight into a stream and cloudpickle.load() it)

That would open the opportunity to bundle the job spec .yaml for KubernetesJobEnvironment and the cloudpickle-ed flow together, which I think would be enough to make it possible to use S3 storage and KubernetesJobEnvironment together :grinning:
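
A minimal sketch of that key-based dispatch, under stated assumptions: the member names flow.pkl and job_spec.yaml and the helpers load_bundle and make_bundle are hypothetical, the object body is passed in as bytes rather than fetched from S3, and the standard-library pickle stands in for cloudpickle. It is not how Prefect's S3.get_flow() is implemented.

```python
import io
import pickle
import tarfile
from typing import Any, Optional, Tuple

# Hypothetical member names inside the bundle; not part of Prefect.
FLOW_MEMBER = "flow.pkl"
SPEC_MEMBER = "job_spec.yaml"


def make_bundle(flow: Any, job_spec_text: str) -> bytes:
    """Pack a pickled flow and a job spec into an in-memory .tar.gz."""
    members = (
        (FLOW_MEMBER, pickle.dumps(flow)),
        (SPEC_MEMBER, job_spec_text.encode("utf-8")),
    )
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, payload in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def load_bundle(key: str, body: bytes) -> Tuple[Any, Optional[str]]:
    """Return (flow, job_spec_text) for an object downloaded from storage."""
    if key.endswith(".tar.gz"):
        # New behavior: the object is a tarball holding the flow and the spec.
        with tarfile.open(fileobj=io.BytesIO(body), mode="r:gz") as tar:
            flow = pickle.loads(tar.extractfile(FLOW_MEMBER).read())
            spec = None
            if SPEC_MEMBER in tar.getnames():
                spec = tar.extractfile(SPEC_MEMBER).read().decode("utf-8")
        return flow, spec
    # Current behavior: the object is the pickled flow itself.
    return pickle.loads(body), None
```

Keys that do not end in .tar.gz still go through the plain unpickling branch, which is what would keep previously uploaded flows loadable.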

jameslamb on 1 Jul 2020

Alternatively, could have the KubernetesJobEnvironment load the spec.yaml at build time rather than run time. Then the yaml file would be stored in the pickled flow, just like everything else. I can't think of any downsides of this behavior, and it should be simple enough to do.
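
A rough sketch of the idea, assuming a hypothetical JobEnvironmentSketch class rather than Prefect's actual KubernetesJobEnvironment: the spec file is read once in the constructor and kept as an ordinary attribute, so nothing needs to read the path again at run time.

```python
import os
import tempfile


class JobEnvironmentSketch:
    """Hypothetical stand-in for an environment; not Prefect's class."""

    def __init__(self, job_spec_file: str):
        self.job_spec_file = job_spec_file
        # Read the spec at build time and keep its contents on the object.
        with open(job_spec_file) as f:
            self.job_spec_text = f.read()

    def create_job_manifest(self) -> str:
        # Run time: use the stored contents, never the original path.
        return self.job_spec_text


def demo() -> str:
    """Build the environment, delete the file, then read the manifest."""
    fd, path = tempfile.mkstemp(suffix=".yaml")
    with os.fdopen(fd, "w") as f:
        f.write("kind: Job\n")
    env = JobEnvironmentSketch(path)
    os.remove(path)  # simulate a run-time container without the file
    return env.create_job_manifest()
```

Because the contents live in a plain attribute, they would be serialized along with the rest of the object when the flow is pickled, whichever storage is used.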

jcrist on 1 Jul 2020

Alternatively, could have the KubernetesJobEnvironment load the spec.yaml at build time rather than run time. Then the yaml file would be stored in the pickled flow, just like everything else. I can't think of any downsides of this behavior, and it should be simple enough to do.

Oh yeah, that's way cleaner, I like that! It kind of looks like the file is already stored on the environment in the flow:

https://github.com/PrefectHQ/prefect/blob/ffe03678f39d585bdddbb8908a2ce57438fc0f02/src/prefect/environments/execution/k8s/job.py#L84

jameslamb on 1 Jul 2020

Hi, I have a similar request: KubernetesAgent, KubernetesJobEnvironment and GitHub storage.

I expected the file specified in KubernetesJobEnvironment to be picked up, and prefect register flow validates that it can read the YAML file, but when running from KubernetesAgent I get: Failed to load and execute Flow's environment: FileNotFoundError(2, 'No such file or directory')

espenairmine on 10 Jul 2020

Alternatively, could have the KubernetesJobEnvironment load the spec.yaml at build time rather than run time. Then the yaml file would be stored in the pickled flow, just like everything else. I can't think of any downsides of this behavior, and it should be simple enough to do.

I just attempted this in #2950

jameslamb on 12 Jul 2020

Just wondering, is this closed by #2950?

joshmeek on 14 Aug 2020

@joshmeek I think it can be closed, yes.

I haven't tried with S3 storage since #2950 was merged, but I have been using another non-Docker storage (Webhook) successfully with KubernetesJobEnvironment + a custom spec file for a few days and it's been working exactly as expected.

jameslamb on 14 Aug 2020
