Jx: TLS seems to be broken on applications when using boot

Created on 3 Sep 2019 · 32 Comments · Source: jenkins-x/jx

Summary

  • Enable TLS when installing with boot via the cert-manager and external DNS
  • Create an application with either create spring or create quickstart and deploy it with Jenkins X
  • The certificate for the application's public endpoint in the staging environment seems to be invalid

The application ingress resource seems to still have the expose controller annotations:

apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    annotations:
      fabric8.io/generated-by: exposecontroller
      kubernetes.io/ingress.class: nginx
      kubernetes.io/tls-acme: "true"
    creationTimestamp: 2019-09-03T08:13:40Z
    generation: 1
    labels:
      provider: fabric8
    name: bdd-spring-1567497978
    namespace: jx-staging
    ownerReferences:
    - apiVersion: v1
      kind: Service
      name: bdd-spring-1567497978
      uid: ba295232-ce22-11e9-bb9b-42010a84003c
    resourceVersion: "8133"
    selfLink: /apis/extensions/v1beta1/namespaces/jx-staging/ingresses/bdd-spring-1567497978
    uid: bcc72597-ce22-11e9-bb9b-42010a84003c
  spec:
    rules:
    - host: bdd-spring-1567497978.jx-staging.boot.bdd.jenkins-x.rocks
      http:
        paths:
        - backend:
            serviceName: bdd-spring-1567497978
            servicePort: 80
    tls:
    - hosts:
      - bdd-spring-1567497978.jx-staging.boot.bdd.jenkins-x.rocks
      secretName: tls-bdd-spring-1567497978
  status:
    loadBalancer:
      ingress:
      - ip: 
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

No cert-manager issuer seems to be installed in the staging namespace. cert-manager fails with the following error when trying to acquire the certificate for the newly deployed application:

I0903 08:13:40.282576       1 base_controller.go:193] cert-manager/controller/ingress-shim "level"=0 "msg"="finished processing work item" "key"="jx-staging/bdd-spring-1567497978"
I0903 08:14:04.218156       1 base_controller.go:187] cert-manager/controller/ingress-shim "level"=0 "msg"="syncing item" "key"="jx-staging/bdd-spring-1567497978"
I0903 08:14:04.218406       1 sync.go:77] cert-manager/controller/ingress-shim "level"=0 "msg"="failed to determine issuer to be used for ingress resource" "resource_kind"="Ingress" "resource_name"="bdd-spring-1567497978" "resource_namespace"="jx-staging"

Steps to reproduce the behavior

Expected behavior

A valid certificate should be acquired for an application deployed in the staging or production environments.

Actual behavior

Jx version

The output of jx version is:

COPY OUTPUT HERE

Jenkins type

  • [x] Serverless Jenkins X Pipelines (Tekton + Prow)
  • [ ] Classic Jenkins

Kubernetes cluster

Operating system / Environment

area/applications area/boot area/fox area/ingress area/security estimate/M kind/discovery lifecycle/rotten priority/critical

All 32 comments

@ccojocar is there a workaround?

There are a few workarounds:
1) disable TLS in the environment git repository after boot installation by setting http: "true" and tlsacme: "false", e.g. for the staging environment https://github.com/jenkins-x/environment-tekton-weasel-staging/blob/07a534f7d30cb47317c7dcded628f0e28fbbad31/env/values.yaml#L18
2) use something like https://github.com/jenkins-x-charts/kubernetes-replicator to copy the wildcard certificate from the jx (dev environment) namespace into the jx-staging and jx-production namespaces, and then modify the ingress annotation to use this secret as the certificate. This approach is not really recommended from a security point of view.
3) create a cert-manager issuer and certificate in the staging/production git repository after boot installation, following the examples https://github.com/jenkins-x/jenkins-x-boot-config/blob/master/systems/acme/templates/cert-manager-prod-issuer.yaml and https://github.com/jenkins-x/jenkins-x-boot-config/blob/master/systems/acme/templates/cert-manager-prod-certificate.yaml respectively, and add an ingress-config ConfigMap in the same repository, similar to https://github.com/jenkins-x/environment-tekton-weasel-dev/blob/master/env/templates/ingress-config-configmap.yaml, with proper values for domain, issuer name, etc.

I would recommend option 1, leaving users responsible for securing their applications deployed with Jenkins X until we have a fix on our side and everything works automatically.

I'll also extend the environment configuration in the jx-requirements file so that it allows defining a custom ingress configuration per environment, and then we can generate the cert-manager resources from templates per environment.

See also #4096 - @ccojocar @cagiti can you check if they are dupes and close one out?

I think option 3 is best, but we'd need to progress on operating the staging and production environments to use boot. We'd also need to provide an ingress template for the preview, staging and production environments. For the interim we should go with option 1 and quickly follow up with option 3.

Hello, is there any update on this?

Hi! We are also eagerly awaiting any progress on this one!

Hi! I'm also looking forward to the fix.

Got stopped by this also, default TLS for all permanent and preview environments would save us a lot of configuration!

It would also be nice if the default staging and production keys in the requirements file were optional, just in case you want different environments in your workflow. Right now it always tries to use the staging and production keys during jx boot.

Bumped priority.

I am also having this problem.

Is there any idea when this issue will be resolved? jx upgrade ingress is deprecated. I am following option 3 from the workarounds above, but it would be good to know when this won't be necessary.

After a bit of looking, it seems kubernetes-replicator is partly set up in the secrets in the staging and production envs with this:

{{- if .Values.expose.production }}
    replicator.v1.mittwald.de/replicate-from: jx/tls-{{ .Values.expose.config.domain | replace "." "-" }}-p
{{- else }}
    replicator.v1.mittwald.de/replicate-from: jx/tls-{{ .Values.expose.config.domain | replace "." "-" }}-s
{{- end }}
{{- if .Values.expose.production }}
  name: "tls-{{ .Values.expose.config.domain | replace "." "-" }}-p"
{{- else }}
  name: "tls-{{ .Values.expose.config.domain | replace "." "-" }}-s"
{{- end }}

but it's not finding the secret in the jx namespace. Is this because cert-manager doesn't allow adding annotations to the source secret?

I would just like to try and understand what the problem is so when I get it working with a workaround I don't break things with future upgrades.
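
One way to check whether the source secret exists is to first work out the name the template above would produce. The following is a minimal sketch, assuming the name is built exactly as in that template (dots in the domain replaced with dashes, then a -p or -s suffix depending on the production flag). The function name tls_secret_name and the domain example.com are hypothetical and only for illustration.

```shell
# Hypothetical helper: mirrors the template's `replace "." "-"` step and its
# -p / -s suffix. It only computes a name; it does not talk to the cluster.
tls_secret_name() {
  domain="$1"       # e.g. example.com (illustrative value)
  production="$2"   # "true" selects the -p suffix, anything else selects -s
  dashed=$(printf '%s' "$domain" | tr '.' '-')
  if [ "$production" = "true" ]; then
    printf 'tls-%s-p\n' "$dashed"
  else
    printf 'tls-%s-s\n' "$dashed"
  fi
}

# Print the namespace/name that the replicate-from annotation would point at.
printf 'jx/%s\n' "$(tls_secret_name example.com true)"
```

The resulting name can then be passed to kubectl get secret -n jx to see whether the secret the annotation refers to is actually present.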

@deanesmith Any idea when this will be prioritized?

Not sure if this is of any use, but I did a bit of work last week to get this working for a user on Slack. I added some steps folks can run to get automated TLS in the staging and prod namespaces when using a DNS challenge: https://kubernetes.slack.com/archives/C9MBGQJRH/p1578574067137500

tl;dr steps from comment above

jx create cluster gke --skip-installation
jx boot # making the git repos public as I can't use private on a free account
jx create domain gke -d rawlings-demo.com
# updated ingress section in jx-requirements.yml
jx boot
jx add app jx-app-replicator -v 1.0.16 --repository https://storage.googleapis.com/chartmuseum.jenkins-x.io
jx step replicate secret "tls-*" -r jx-staging -r jx-production --create-namespace
jx create quickstart

Will that work for previews and devpods?

I'm currently using

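DOMAIN="your_domain"  # assumed to be the dashed form used in the secret name, as in the template above

# The patch is double-quoted so that the shell expands ${DOMAIN}.
kubectl patch deployment -n kube-system jxing-nginx-ingress-controller --type='json' -p="[{\"op\": \"add\", \"path\": \"/spec/template/spec/containers/0/args/-\", \"value\": \"--default-ssl-certificate=jx/tls-${DOMAIN}-p\"}]"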

which seems to work quite well and catch anything with a broken/missing cert. I wasn't able to get it stable and working for previews and devpods when I was trying replicator a while back.
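
Quoting a JSON patch by hand is easy to get wrong, since the shell does not expand variables inside single quotes. As a minimal sketch, a small helper can build the patch from the secret reference and print it for inspection before anything is applied. The function name default_cert_patch and the secret jx/tls-example-com-p are hypothetical; the deployment name and namespace in the commented command are taken from the comment above.

```shell
# Hypothetical helper: prints a JSON patch that appends the
# --default-ssl-certificate argument to the first container's args.
default_cert_patch() {
  secret_ref="$1"   # namespace/name of the TLS secret, e.g. jx/tls-example-com-p
  printf '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--default-ssl-certificate=%s"}]\n' "$secret_ref"
}

# Show the patch for an illustrative secret; nothing is applied here.
default_cert_patch "jx/tls-example-com-p"

# To apply it (assumes kubectl access to the cluster):
# kubectl patch deployment -n kube-system jxing-nginx-ingress-controller \
#   --type=json -p "$(default_cert_patch "jx/tls-example-com-p")"
```

Printing the patch first makes it easy to confirm that the secret name was substituted rather than passed through literally.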

@rawlingsj I tried that approach manually and found that certificates were not being finalised because there was no Issuer in the namespace. I added an Issuer and have now hit no configured challenge solvers can be used for this challenge.

Tried recreating the certificates manually, but get the same error.

@tdcox @rawlingsj same here on my application's Ingress: Could not determine issuer for ingress due to bad annotations: failed to determine issuer name to be used for ingress resource

Originally from Slack thread

I ended up getting this working, but I had to manually copy the keys over. For some reason, Secret replication is not functioning on my cluster.

I ended up adding my own CM Issuers in both _Staging_ and _Production_. Currently, these are just based on letsencrypt-prod.

Note: I changed the Issuer's name from letsencrypt-prod to letsencrypt-prod-stg while debugging, but as everything is silo'd in Kubernetes namespaces it is likely unnecessary.

apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  annotations:
    jenkins.io/chart: acme
    jenkins.io/chart-app-version: 0.0.12
  labels:
    jenkins.io/chart-release: acme
    jenkins.io/namespace: jx-staging
    jenkins.io/version: "12"
  name: letsencrypt-prod-stg
  namespace: jx-staging
  selfLink: /apis/cert-manager.io/v1alpha2/namespaces/jx-staging/issuers/letsencrypt-prod-stg
spec:
  acme:
    email: TLS_EMAIL
    privateKeySecretRef:
      name: letsencrypt-prod-stg
    server: https://acme-v02.api.letsencrypt.org/directory
    solvers:
    - dns01:
        clouddns:
          project: GCP_PROJECT
          serviceAccountSecretRef:
            key: credentials.json
            name: external-dns-gcp-sa
      selector:
        dnsNames:
        - '*.MYDOMAIN.com'
        - MYDOMAIN.com

Because I am using _CloudDNS_ on _GCP_, I also ended up needing to replicate (manually) the external-dns-gcp-sa Secret from _jx_ to _jx-staging_ and _jx-production_, so that CM could fulfil Orders. Here is the error to look out for:

cert-manager/controller/challenges "msg"="re-queuing item due to error processing" "error"="error getting clouddns service account: secret \"external-dns-gcp-sa\" not found" "key"="jx-staging/tls-PROJECT-NAMESPACE-DOMAIN-p-RAND-RAND-RAND"
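
A manual copy like the one described can be sketched as a pipeline: export the Secret, drop the fields that tie it to its source namespace, and apply it in the target namespace. This is only an illustration, not necessarily how the copy was done here. The function name strip_namespace_metadata is hypothetical, and the filter assumes the two-space indentation that kubectl get -o yaml normally emits for metadata fields.

```shell
# Hypothetical helper: removes metadata fields that bind a manifest to its
# original namespace and object identity. Reads YAML on stdin, writes to stdout.
strip_namespace_metadata() {
  grep -Ev '^  (namespace|uid|resourceVersion|creationTimestamp|selfLink):'
}

# Usage (assumes kubectl access; secret and namespace names are from the thread):
# kubectl get secret external-dns-gcp-sa -n jx -o yaml \
#   | strip_namespace_metadata \
#   | kubectl apply -n jx-staging -f -
```

The filter is line-based and deliberately simple, so multi-line metadata such as ownerReferences would need separate handling if it is present on the source Secret.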

Last of all, it's good to ramp up on Cert Manager, in addition to following the CertManager namespace with _kail_. This post was super handy, as a lot of important debugging information is only available if you kubectl describe [Object] at the object level.

You can actually find error messages in each of these, like so:
kubectl get certificaterequest
kubectl describe certificaterequest X
kubectl get order
kubectl describe order X
kubectl get challenge
kubectl describe challenge X
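
The commands above can be wrapped in a small loop that describes every resource of each kind in one namespace. This is a sketch under a few assumptions: the function name describe_cert_chain and the KUBECTL override are hypothetical, and the certificate kind is added in front of the three kinds listed above.

```shell
KUBECTL="${KUBECTL:-kubectl}"   # set KUBECTL=echo to print the commands instead

# Hypothetical helper: describes all cert-manager resources in a namespace,
# in the order certificate, certificaterequest, order, challenge.
describe_cert_chain() {
  ns="$1"
  for kind in certificate certificaterequest order challenge; do
    printf '== %s ==\n' "$kind"
    "$KUBECTL" describe "$kind" -n "$ns"
  done
}

# Example (assumes kubectl access): describe_cert_chain jx-staging
```

Reading the output top to bottom follows a certificate from its request through the ACME order to the individual challenges, which is usually where the underlying error message appears.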

Looks like jx upgrade ingress is being decommissioned soon, which breaks the workaround I had been using: https://github.com/jenkins-x/jx/issues/6107

After jx boot with TLS enabled in jx-requirements.yml, the certificate has not become ready on dev, even after several days:
Status:
  Conditions:
    Last Transition Time:  2020-03-02T15:53:00Z
    Message:               Waiting for CertificateRequest "tls-jenkinsx-dev-xxx-xxx-xx-p-2276452093" to complete
    Reason:                InProgress
    Status:                False
    Type:                  Ready
Events:                    <none>

I'll also extend the environment configuration in the jx-requirements file so that it allows defining a custom ingress configuration per environment, and then we can generate the cert-manager resources from templates per environment.

@ccojocar is that done? We need it urgently, as we are currently blocked because our domain and subdomain are with different providers. Jenkins X without security is not a viable option for many companies.

@srehmanproov You can check with @daveconde or @deanesmith on Slack. I don't think it is done.

thanks @ccojocar

Has anyone gotten a chance to confirm that https://github.com/jenkins-x-charts/jxboot-resources/pull/32 fixes this issue?

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://jenkins-x.io/community.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://jenkins-x.io/community.
/lifecycle rotten

The --default-ssl-certificate argument from the kubectl patch workaround earlier in this thread could instead be added to the boot config in systems/jxing/values.tmpl.yaml, as below:

nginx-ingress:
  controller:
    extraArgs:
      default-ssl-certificate: jx/tls-${your_domain_name}-p

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Provide feedback via https://jenkins-x.io/community.
/close

@jenkins-x-bot: Closing this issue.
