Aws-load-balancer-controller: readiness gate conditionType too long

Created on 7 Apr 2020 · 9Comments · Source: kubernetes-sigs/aws-load-balancer-controller

we're starting to rollout the readiness gates on our cluster and have just come across a limitation with the way, readiness gates are set:

kubernetes imposes a 63 character limit on the length of readiness gates. Since the required string consists of target-health.alb.ingress.k8s.aws/<name>_<name>_<port> this is a problem for services, which rely on the name limit of 63 chars for service and ingress names.

from our logs:

Deployment.apps "some-lengthy-service-name-that-is-not-too-long" is invalid: spec.template.spec.readinessGates[0].conditionType: Invalid value: "target-health.alb.ingress.k8s.aws/some-lengthy-service-name-that-is-not-too-long_some-lengthy-service-name-that-is-not-too-long_http": name part must be no more than 63 characters

Source

nirnanaaa

Most helpful comment

@SaranBalaji90 not sure about the prefix, from our observation it's not taken into count somehow

@nirnanaaa Thanks. I updated my comment. Looks like name part in the key can only be upto 63 character.

SaranBalaji90 on 8 Apr 2020

👍2

All 9 comments

As per the documentation https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-gate -> https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set

prefix/name can be upto 316 in total. prefix can be upto 253 and name part can be upto 63 character.

SaranBalaji90 on 7 Apr 2020

@SaranBalaji90 not sure about the prefix, from our observation it's not taken into count somehow

nirnanaaa on 8 Apr 2020

@SaranBalaji90 not sure about the prefix, from our observation it's not taken into count somehow

@nirnanaaa Thanks. I updated my comment. Looks like name part in the key can only be upto 63 character.

SaranBalaji90 on 8 Apr 2020

👍2

How the name is currently being generated (@devkid why do we need the service name part? shouldn't ingress+backend port be enough? Ideally only the ingress name, since that's also capped to a certain number of characters AFAIK):

    return api.PodConditionType(fmt.Sprintf(
        "target-health.alb.ingress.k8s.aws/%s_%s_%s",
        ingress.Name,
        backend.ServiceName,
        backend.ServicePort.String(),
    ))

Some ideas how it might look like:

    return api.PodConditionType(fmt.Sprintf(
        "target-health.alb.ingress.k8s.aws/%s",
        ingress.Name,
    ))

    return api.PodConditionType(fmt.Sprintf(
        "target-health.alb.ingress.k8s.aws/load-balancer-tg-ready",
    ))

I've been wondering what the implications would be if we would just condense the syntax, or even make it a static string. If a pod is registered in multiple TGs it would require some internal logic to wait until the pod is healthy in all TGs before setting it to true (if we were using a static setting)

The folks over at the blue side are also using a simple static string: https://github.com/kubernetes/ingress-gce/blob/f2c8f40bf1968032b6a267632b5a0e767b9d0467/pkg/neg/types/shared/types.go#L20

nirnanaaa on 9 Apr 2020

👍1

Each tuple (ingress name, backend name, port) maps 1:1 to a target group – it was easy to implement this way (only ingress+backend port alone would be ambiguous – what if you have different backends with the same port in the same ingress?) and there is more logic necessary to aggregate the condition status from multiple target groups.

Rough idea for implementation for either static condition type or one that only includes the ingress name:

have a single global go routine that collects "pod has become healthy in target group" and "pod has been removed from target group" events through a chan
have reconcilePodCondition send the "pod has become healthy in target group" event through the chan instead of it directly adding/setting the condition status
have RemovePodConditions send the "pod has been removed from target group" events through the chan instead of directly removing the condition status
the global go routine can react on the events and only make the condition(s) true if the pod is ready in all target groups

devkid on 10 Apr 2020

Ingress Name could longer than 63 characters.
Please ignore this comment.

@devkid

both reconcilePodCondition and RemovePodConditions have a filter by t.Backend.
That means the pods selected should only contains within the t.Backend.

pods, err := c.endpointResolver.ReverseResolve(t.Ingress, t.Backend, targets)
    if err != nil {
        return err
    }

The idea @nirnanaaa suggested should be fine.

    return api.PodConditionType(fmt.Sprintf(
        "target-health.alb.ingress.k8s.aws/%s",
        ingress.Name,
    ))

Do the following filtering in both podHasReadinessGate and podConditionForReadinessGate should have backward compatible. Please correct me if I am wrong.

func podHasReadinessGate(pod *api.Pod, conditionType api.PodConditionType) bool {
    var lenPrefix = len(ConditionTypeTargetHealthPrefix)
    ingressName := ""
    strConditionType := string(conditionType)
    if strings.HasPrefix(strConditionType, ConditionTypeTargetHealthPrefix) {
        ingressName = strConditionType[lenPrefix:]
    }

    for _, rg := range pod.Spec.ReadinessGates {
        if rg.ConditionType == conditionType {
            return true
        }

        //In version 1.1.6, the conditionType is "ingress_service_port"
        //But it hits the 63 characters limitation in k8s, it could be ingress only
        conditionDotType := string(rg.ConditionType)
        if strings.HasPrefix(conditionDotType, ConditionTypeTargetHealthPrefix) {
            suffix := conditionDotType[lenPrefix:]
            if len(suffix) > 0 {
                var array = strings.Split(suffix, "_")
                if len(array) == 3 {
                    if ingressName == array[0] {
                        //Match ingress only
                        return true
                    }
                }
            }
        }
    }
    return false
}

shaoxt on 30 Apr 2020

Ingress name can be upto 253 characters long so the only option is static string option suggested by @nirnanaaa. Could we just use the static string and not worry about aggregating the condition status from multiple target groups?

sandeepbhojwani on 1 May 2020

@devkid @nirnanaaa Static string makes thing easier. One pod in multiple ingresses or target groups are rare cases. Even In this case, the pod becomes ready when it registered into one of the ingress or target group. People can always have a way to split the pod into two different pods when they think attaching one pod on multiple target groups are risky.

Like the idea of static string.

    return api.PodConditionType(fmt.Sprintf(
        "target-health.alb.ingress.k8s.aws/load-balancer-tg-ready",
    ))

shaoxt on 1 May 2020

@shaoxt I don't really mind a static string tbh. It reduces complexity, so all in for that

nirnanaaa on 1 May 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

[Feature Request] Support for IngressClassName

rdubya16 · 4Comments

Possibility to disable periodic rule check/modification

JakubJecminek · 5Comments

Support GRPCs protocol

hieu29791 · 4Comments

alb.ingress.kubernetes.io/connection-idle-timeout has no effect

rootd00d · 4Comments

Configuration for 0 downtime deployments

benwilson512 · 5Comments