Cert-manager: Ship NetworkPolicy in the helm chart

Created on 27 Feb 2018 · 14 comments · Source: jetstack/cert-manager

/kind feature

Just had this thought: what if someone has a NetworkPolicy that denies all egress traffic by default? In that case, public DNS verification via Google DNS will fail. Similarly, they may have ingress rules that block all incoming traffic unless it's explicitly allowed.

I'm talking about policies like
https://github.com/ahmetb/kubernetes-network-policy-recipes/blob/712388dccdf9f9cbb6b9683c4e7564b6569d5060/12-deny-all-non-whitelisted-traffic-from-the-namespace.md
and
https://github.com/ahmetb/kubernetes-network-policy-recipes/blob/712388dccdf9f9cbb6b9683c4e7564b6569d5060/03-deny-all-non-whitelisted-traffic-in-the-namespace.md
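For reference, a default-deny policy in the spirit of those recipes looks roughly like this (a combined ingress+egress sketch; the namespace is just an example):

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-deny-all
  namespace: cert-manager   # example namespace, wherever cert-manager (or the solver) runs
spec:
  podSelector: {}           # selects every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
  # no ingress/egress rules are listed, so all traffic to and from these pods is denied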

Maybe this is not a problem today, because nobody is writing NetworkPolicies in kube-system (and the helm chart deploys to kube-system). But if the user chooses to deploy to another namespace, and that namespace has default-deny NetworkPolicy rules like the ones above, that will be a problem.

Shipping network policies does no harm:

  • they can target cert-manager pods specifically and won't apply to other workloads
  • if there's no network policy plugin in the cluster, the NetworkPolicy object is still accepted by the API (but has no effect).

Food for thought.

area/deploy help wanted kind/feature lifecycle/rotten priority/backlog

All 14 comments

You're right that network policies are not common in charts yet. But if you can write some and open a PR, they could be included in the chart behind a switch that controls whether they are created.

To stay consistent with the rbac and serviceAccount conventions, that would be:

networkPolicy:
  create: true
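A rough sketch of what the gated template could look like, keeping with that convention (the helper names, labels and the allow-all egress rule are illustrative, not the chart's actual definitions):

{{- if .Values.networkPolicy.create }}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: {{ template "cert-manager.fullname" . }}    # illustrative helper name
  labels:
    app: {{ template "cert-manager.name" . }}
spec:
  podSelector:
    matchLabels:
      app: {{ template "cert-manager.name" . }}     # select only cert-manager pods
  policyTypes:
  - Egress
  egress:
  - {}   # start permissive: allow all egress, tighten later if desired
{{- end }}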

@munnerz I am not sure much network access is needed, beyond egress to the k8s API (which I think you don't need to permit in a network policy?), egress to the internet, and ingress from any ingress controllers in any namespace.

You might need configuration to supply the labels used by ingress controllers in a particular cluster. Or else keep the policy really simple and just allow the port to the cluster.

This is a low-priority item and probably doesn't need any action right now. Most users won't have NetworkPolicies in the namespace cert-manager is deployed to.

You'd probably need quite a few rules: egress to the k8s API, to kube-dns (and/or other external DNS resolvers), and the self-check might require some egress too. For ingress traffic, we'd probably want to expose the challenge endpoint to the public.
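For illustration, an egress policy covering those cases might look roughly like this (the app: cert-manager pod label is an assumption about the chart's labels, and the kube-dns selector assumes the standard k8s-app: kube-dns label):

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-cert-manager-egress   # illustrative name
spec:
  policyTypes:
  - Egress
  podSelector:
    matchLabels:
      app: cert-manager             # assumed cert-manager pod label
  egress:
  - to:                             # DNS lookups via kube-dns, in any namespace
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  - to:                             # ACME directory, DNS-01 provider APIs and other HTTPS targets
    - ipBlock:                      # (the k8s API port may differ per cluster)
        cidr: 0.0.0.0/0
    ports:
    - protocol: TCP
      port: 443
  - to:                             # HTTP-01 self-check against the solver's public URL
    - ipBlock:
        cidr: 0.0.0.0/0
    ports:
    - protocol: TCP
      port: 80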

Apologies for bumping an old issue; I'd like to share what I did to get around a particular NetworkPolicy challenge in case it helps someone else:

I have NetworkPolicies set up in my default namespace with a default-deny ingress policy (similar to this).

cert-manager creates a cm-acme-http-solver pod and service in my default namespace. To allow Let's Encrypt to reach it, I created the following whitelist NetworkPolicy:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-certmanager-acme
spec:
  policyTypes:
  - Ingress
  podSelector: # allow the certmanager http server (for ACME protocol) to be reached from LetsEncrypt
    matchExpressions: # Look for the labels but ignore their contents
    - {key: certmanager.k8s.io/acme-http-domain, operator: Exists}
    - {key: certmanager.k8s.io/acme-http-token, operator: Exists}
  ingress: # allow from anywhere inside the cluster or on the internet
  - from: []

This matches the correct pod and allows incoming traffic. Further work could lock this down to LetsEncrypt's IP range(s).
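As an illustration of what that tightening could look like (note that Let's Encrypt does not publish a fixed list of validation IPs, so the CIDR below is purely a placeholder):

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-certmanager-acme
spec:
  policyTypes:
  - Ingress
  podSelector:
    matchExpressions:
    - {key: certmanager.k8s.io/acme-http-domain, operator: Exists}
    - {key: certmanager.k8s.io/acme-http-token, operator: Exists}
  ingress:
  - from:
    - ipBlock:
        cidr: 192.0.2.0/24   # placeholder range (TEST-NET-1), not a real Let's Encrypt CIDR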

From my understanding, the above policy (thanks @philsparrow!) is all that is required for an ingress-restricted NetworkPolicy environment.

I guess one of these resources must be created in each namespace? Should cert-manager create a solver-specific network policy each time a validation is attempted? From what I can see, there is no way for us to blanket allow ingress to those pods, meaning we need a NetworkPolicy resource in every namespace we want to solve challenges in.

Creating NetworkPolicy resources also seems like a fairly privileged operation, reserved for network admins etc.

@ahmetb got any idea how you'd like to see us support this? Is simply creating NetworkPolicy resources (similar to how we create a pod/service/ingress resource) on each validation acceptable/sensible?

I'm proposing to add an "egress" NetworkPolicy to allow cert-manager pods to access the public internet.

If cert-manager is deployed to a namespace that has outbound internet traffic (or any traffic) disabled by default, the cluster admin needs to sit down and write a policy whitelisting cert-manager.

I currently don't see the need for an "ingress" policy?

+1 on this.

how you'd like to see us support this

@munnerz We need an ingress network policy to be created, otherwise the HTTP challenge won't work. Moving the acmesolver to the cert-manager namespace would help, so we can remove the default deny in that namespace - which we cannot do in the ingress namespace.

Moving the acmesolver pod to a new namespace is not simple, as it is not possible to reference a service in another namespace using an Ingress resource.

In the past, kube-lego manually managed Endpoints resources and created a service with no selector to achieve this, but this can cause problems with health checks, as ingress controllers are unable to discover the correct health-check details for the service (e.g. GCLB, which does its own health checking).
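For context, the selector-less Service pattern mentioned above looks roughly like this (names, namespace, port and the pod IP are all illustrative, not kube-lego's actual manifests):

apiVersion: v1
kind: Service
metadata:
  name: acme-solver-proxy        # illustrative name, referenced by the Ingress
  namespace: app-namespace       # the namespace the Ingress lives in
spec:
  ports:                         # no selector, so no Endpoints are generated automatically
  - port: 8080
    targetPort: 8080
---
apiVersion: v1
kind: Endpoints
metadata:
  name: acme-solver-proxy        # must match the Service name
  namespace: app-namespace
subsets:
- addresses:
  - ip: 10.0.0.12                # pod IP of the solver running in another namespace
  ports:
  - port: 8080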

it is not possible to reference a service in another namespace

The idea is to move the cert-manager pod, svc and also the ingress to the same namespace. This could be behind a command-line option, of course.

But for the GCE ingress controller we have to modify an existing resource, so we cannot simply move it.

Because of that, I suggested using a command-line option. We use a bare-metal deployment, so this is an option for us. Anyway, I agree that a better option is to create a NetworkPolicy resource. Without at least one of these options I cannot see cert-manager working in an environment that enforces ingress rules.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle rotten
/remove-lifecycle stale

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close

@retest-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
