Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): No
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): jobs, activeDeadlineSeconds, restartPolicy
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
Kubernetes version (use kubectl version): Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.6", GitCommit:"e569a27d02001e343cb68086bc06d47804f62af6", GitTreeState:"clean", BuildDate:"2016-11-12T05:22:15Z", GoVersion:"go1.7.1", Compiler:"gc", Platform:"darwin/amd64"}
Environment:
uname -a): darwin

What happened:
Created a job and specified activeDeadlineSeconds as per the documentation, which explicitly says:
However, if you prefer not to retry forever, you can set a deadline on the job. Do this by setting the spec.activeDeadlineSeconds field of the job to a number of seconds. The job will have status with reason: DeadlineExceeded. No more pods will be created, and existing pods will be deleted.
What happened instead:
The job kept creating new pods and retrying the failing container indefinitely, despite activeDeadlineSeconds.
What you expected to happen:
The job not to restart once the deadline was exceeded (reason: DeadlineExceeded), as per the documentation.
How to reproduce it (as minimally and precisely as possible):
Create a job whose container exits with a non-zero exit code.
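A minimal reproduction might look like the following (a sketch; the name, image, and command are placeholders for any container that exits non-zero):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: fail-repro          # hypothetical name
spec:
  activeDeadlineSeconds: 1  # deadline on the Job spec
  template:
    spec:
      containers:
      - name: fail
        image: busybox
        command: ["sh", "-c", "exit 1"]  # always exits non-zero
      restartPolicy: Never
```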
Anything else we need to know:
Related issues:
https://github.com/kubernetes/kubernetes/issues/24533
https://github.com/kubernetes/kubernetes/issues/30243
More...
My YAML was:
apiVersion: batch/v1
kind: Job
metadata:
  name: sphela-letsencrypt
spec:
  template:
    metadata:
      labels:
        run: sphela-letsencrypt
        app: sphela
    spec:
      activeDeadlineSeconds: 1
      imagePullSecrets:
      - name: docker-registry-secret
      containers:
      - name: letsencrypt
        image: gcr.io/sphela-153202/sphela-letsencrypt:v6
        ports:
        - containerPort: 80
        imagePullPolicy: Always
        command: [
          "/usr/local/bin/encrypt-script.sh"
        ]
      restartPolicy: Never
I set it to 1 just to see what would happen. I would have expected it to fail and never retry based on the documentation.
/usr/local/bin/encrypt-script.sh is a script that exits with an exit code of greater than 0.
The error was my fault, but it resulted in me exhausting my letsencrypt cert creation attempts for a week.
Never mind, I noticed this:
Note that both the Job Spec and the Pod Template Spec within the Job have a field with the same name. Set the one on the Job.
And I had set it on the Pod Template Spec instead of the Job spec. Why are there two of them?
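For anyone hitting the same thing: moving the field up one level, onto the Job's spec rather than the Pod template's spec, gives the documented behavior. A sketch based on the manifest above:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: sphela-letsencrypt
spec:
  activeDeadlineSeconds: 1   # Job-level deadline: stop creating pods after this
  template:
    spec:
      containers:
      - name: letsencrypt
        image: gcr.io/sphela-153202/sphela-letsencrypt:v6
        command: ["/usr/local/bin/encrypt-script.sh"]
      restartPolicy: Never   # pod-level: don't restart the container in place
```

The pod-level activeDeadlineSeconds only bounds how long an individual pod may run; it does not stop the Job controller from creating replacement pods.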
I am still seeing this issue in minikube version v0.15.0. I am trying to create a job for functional/perf tests, but if it returns a non-zero exit code, it just keeps restarting.
Yes @Shubham-Sakhuja-Bose, this is by design. I think it's a bad design, but it's intentional. I've been told that if I don't want something to "run to completion", I shouldn't use a job. Jobs can only run perfectly written, bug-free code that always succeeds. ;P
@btipling thanks for the quick response! That's unfortunate, but pods it is!
Yeah, my thing is to just run a pod that uses cron and avoid Jobs and Kubernetes crons altogether. I don't understand the Kubernetes cron and job use case, but I love Kubernetes and it's made by smart people, so I'm sure a lot of thought went into it, snark aside.