Serving: Run Knative service on specific machine type

Created on 11 Sep 2018 · 3 Comments · Source: knative/serving

Copied from: https://issuetracker.google.com/issues/114402172

Question:

I'm evaluating https://cloud.google.com/knative/ for deploying a (non-tensorflow) machine learning inference script as a serverless function. Ideally it would be deployed on a GPU node, though a node with many CPU cores could work too. I understand it is currently impossible to use node pools or node selectors with Knative. I would like to request that feature.

(Originally asked on Stack Overflow: https://stackoverflow.com/questions/52142219/knative-run-service-on-specific-machine-type)

Clarification:
I have a processing-intensive operation that is run a few times per day, on demand, exposed as an HTTP endpoint. To prevent wasting money, I'd like to turn this machine on only when it's called, and turn it off again if it hasn't been called for a few minutes. That sounds like it could be solved quite elegantly using Knative Serving's scale-from/to-zero feature.

The problem is, my processing-intensive operation needs a lot of CPU cores, or even better, a GPU. If I understand correctly, Knative currently expects a homogeneous cluster, where the Knative controller, autoscaler, etc. need to run on the same node type as the actual workload. To scale a GPU cluster from zero, the controller would also need to run on a GPU machine, which would nullify any cost savings. Is that correct?

Labels: area/API, area/autoscale, kind/question

evankanderson

Most helpful comment

If you are using GPUs, you should be able to use resource requests to indicate that your knative workers need a GPU, i.e. fill out "resources" in your container spec:

kind: Configuration
# ... (other fields elided)
spec:
  revisionTemplate:
    spec:
      container:
        resources:
          limits:
            nvidia.com/gpu: 1  # ask the scheduler for one NVIDIA GPU

https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#v1-8-onwards

This would work for the GPU case, but would not work for e.g. cpu=ARM, which I don't think is exposed as a resource type.

One caution (based on your comments): Knative assumes that all Pods backing a Revision can do an equal amount of work (currently scaled off CPU usage). If your nodes differ substantially (some with GPUs and some without), or your workload mostly uses non-CPU resources (e.g. 100% GPU but 10% CPU), the autoscaling probably won't work properly.
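To make the caution above concrete, here is a minimal sketch of a utilization-proportional scaling rule. It has the same shape as the Kubernetes Horizontal Pod Autoscaler formula; it is not Knative's actual autoscaler code. The function name, the 70% target, and the pod counts are all hypothetical, chosen only to illustrate the "100% GPU but 10% CPU" case.

```python
import math


def desired_pods(current_pods: int, observed_utilization: float,
                 target_utilization: float) -> int:
    """Generic proportional rule: scale pod count by observed/target utilization.

    Illustrative only; this is an assumption, not Knative's implementation.
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    return math.ceil(current_pods * observed_utilization / target_utilization)


# Hypothetical workload: 4 pods, each saturating its GPU while using 10% CPU.
cpu_view = desired_pods(current_pods=4, observed_utilization=0.10,
                        target_utilization=0.70)
gpu_view = desired_pods(current_pods=4, observed_utilization=1.00,
                        target_utilization=0.70)

print("pods wanted when scaling on CPU:", cpu_view)
print("pods wanted if GPU load were the signal:", gpu_view)
```

Under these assumed numbers, a CPU-driven rule would shrink the deployment even though every GPU is already saturated, which is the mismatch described above.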

Note that Knative will only scale Pods within a cluster and will not automatically add or remove Nodes; you'll need a separate autoscaler for that. Depending on the duration of your requests, some may time out before Node autoscaling completes if you send more traffic than the cluster can currently handle. We're exploring options to improve the speed of Node provisioning, but I expect the initial version may not work with custom resources like GPUs.
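The timeout risk can be estimated with simple arithmetic. The sketch below assumes a request that arrives when no suitable Node exists must wait for Node provisioning, then Pod startup, then the work itself. The function name and every duration are hypothetical placeholders, not measured values.

```python
def finishes_before_timeout(timeout_s: float, node_provision_s: float,
                            pod_start_s: float, work_s: float) -> bool:
    """Return True if provisioning + startup + processing fits in the timeout."""
    total_s = node_provision_s + pod_start_s + work_s
    return total_s <= timeout_s


# Hypothetical durations in seconds.
# A Node is already available, so only Pod startup and the work count.
node_ready = finishes_before_timeout(timeout_s=60, node_provision_s=0,
                                     pod_start_s=5, work_s=30)
# A new GPU Node has to be provisioned before the Pod can be scheduled.
node_missing = finishes_before_timeout(timeout_s=60, node_provision_s=120,
                                       pod_start_s=5, work_s=30)

print("finishes with a Node already available:", node_ready)
print("finishes when a Node must be provisioned:", node_missing)
```

If the second case fails for your real numbers, keeping a minimum number of Nodes available or raising the request timeout are the obvious levers to consider.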

evankanderson on 11 Sep 2018

👍2

All 3 comments

This is a great use-case for why we should allow resources in the ContainerSpec of Revisions.

bbrowning on 27 Sep 2018

We actually have an issue (and PR) tracking (and adding) support for the resources block.

So I'm going to close this in favor of that tracking issue: https://github.com/knative/serving/issues/2099

Feel free to reopen if you disagree.

mattmoor on 31 Oct 2018
