Describe the bug
When creating a deployment that consumes service definitions exposing the same port with different protocols, the svclb DaemonSet does not honor the UDP configuration: only the first service works (even if it is the one set to UDP), and the second service stays pending with:
0/3 nodes are available: 2 node(s) didn't match node selector, 3 node(s) didn't have free ports for the requested pod ports.
To Reproduce
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: bind-deployment
  labels:
    app: bind
    category: basic-services
    environment: production
    level: mgmt
    required: "true"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bind
  template:
    metadata:
      labels:
        app: bind
        category: basic-services
        environment: production
        level: mgmt
        required: "true"
    spec:
      volumes:
        - name: bind-nfs-data
          nfs:
            server: 192.168.169.51
            path: /volume1/docker_my_apps/bind_data/
      containers:
        - name: bind
          image: tchellomello/docker-bind:latest
          imagePullPolicy: Always
          resources:
            limits:
              memory: "512Mi"
              cpu: "100m"
          volumeMounts:
            - name: bind-nfs-data
              mountPath: /etc/bind
---
apiVersion: v1
kind: Service
metadata:
  name: bind-udp-service
  labels:
    app: bind
    category: basic-services
    environment: production
    level: mgmt
    required: "true"
  annotations:
    metallb.universe.tf/allow-shared-ip: metal-lb-ip-space
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.168.53 # unifi-svc.tatu.home (metallb)
  ports:
    - name: bind-udp
      port: 53
      protocol: UDP
  selector:
    app: bind
---
apiVersion: v1
kind: Service
metadata:
  name: bind-tcp-service
  labels:
    app: bind
    category: basic-services
    environment: production
    level: mgmt
    required: "true"
  annotations:
    metallb.universe.tf/allow-shared-ip: metal-lb-ip-space
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.168.53 # unifi-svc.tatu.home (metallb)
  ports:
    - name: bind-tcp
      port: 53
      protocol: TCP
    - name: rndc
      port: 953
      protocol: TCP
  selector:
    app: bind
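Apply the three manifests above and watch the svclb pods; the second DaemonSet's pods stay Pending because of the host-port clash (bind.yaml is an assumed filename for the documents above):
kubectl apply -f bind.yaml
kubectl get daemonsets,pods -o wide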
svclb consumed a TCP host port instead of UDP, causing the TCP service to stay pending, since that service is the one actually expected to consume TCP (note the status: 3 Running):
kubectl describe daemonsets svclb-bind-udp-service
Name: svclb-bind-udp-service
Selector: app=svclb-bind-udp-service
Node-Selector: <none>
Labels: cattle.io/creator=norman
objectset.rio.cattle.io/hash=4578976818dfe0e7bdb871965339b5d38c8eb25f
svccontroller.k3s.cattle.io/nodeselector=false
Annotations: deprecated.daemonset.template.generation: 1
objectset.rio.cattle.io/applied:
[....SNIP...]
objectset.rio.cattle.io/id: svccontroller
objectset.rio.cattle.io/inputid: 1f4b7d1c997e87b4a4c97c053a280db9144929ff
objectset.rio.cattle.io/owner-gvk: /v1, Kind=Service
objectset.rio.cattle.io/owner-name: bind-udp-service
objectset.rio.cattle.io/owner-namespace: default
Desired Number of Nodes Scheduled: 3
Current Number of Nodes Scheduled: 3
Number of Nodes Scheduled with Up-to-date Pods: 3
Number of Nodes Scheduled with Available Pods: 3
Number of Nodes Misscheduled: 0
Pods Status: 3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=svclb-bind-udp-service
svccontroller.k3s.cattle.io/svcname=bind-udp-service
Containers:
lb-port-53:
Image: rancher/klipper-lb:v0.1.1
Port: 53/TCP
Host Port: 53/TCP <----- HOST PORT SHOULD BE UDP
Environment:
SRC_PORT: 53
DEST_PROTO: UDP <-- correctly using UDP
DEST_PORT: 53
DEST_IP: 10.43.29.127
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 18m daemonset-controller Created pod: svclb-bind-udp-service-t5xdm
Normal SuccessfulCreate 18m daemonset-controller Created pod: svclb-bind-udp-service-pzqms
Normal SuccessfulCreate 18m daemonset-controller Created pod: svclb-bind-udp-service-gks4t
Describing the svclb-bind-tcp-service DaemonSet, we can observe it is Waiting:
kubectl describe daemonsets svclb-bind-tcp-service
Name: svclb-bind-tcp-service
Selector: app=svclb-bind-tcp-service
Node-Selector: <none>
Labels: cattle.io/creator=norman
objectset.rio.cattle.io/hash=19934d06b31db43bb6d44d6d8985fc1f94c72e7b
svccontroller.k3s.cattle.io/nodeselector=false
Annotations: deprecated.daemonset.template.generation: 1
objectset.rio.cattle.io/applied: [..SNIP..]
objectset.rio.cattle.io/id: svccontroller
objectset.rio.cattle.io/inputid: 07ac51da615fd47e90fdab0304f1a241ed5b73db
objectset.rio.cattle.io/owner-gvk: /v1, Kind=Service
objectset.rio.cattle.io/owner-name: bind-tcp-service
objectset.rio.cattle.io/owner-namespace: default
Desired Number of Nodes Scheduled: 3
Current Number of Nodes Scheduled: 3
Number of Nodes Scheduled with Up-to-date Pods: 3
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status: 0 Running / 3 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=svclb-bind-tcp-service
svccontroller.k3s.cattle.io/svcname=bind-tcp-service
Containers:
lb-port-53:
Image: rancher/klipper-lb:v0.1.1
Port: 53/TCP
Host Port: 53/TCP <--- same TCP, therefore conflict
Environment:
SRC_PORT: 53
DEST_PROTO: TCP
DEST_PORT: 53
DEST_IP: 10.43.40.190
Mounts: <none>
lb-port-953:
Image: rancher/klipper-lb:v0.1.1
Port: 953/TCP
Host Port: 953/TCP
Environment:
SRC_PORT: 953
DEST_PROTO: TCP
DEST_PORT: 953
DEST_IP: 10.43.40.190
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 20m daemonset-controller Created pod: svclb-bind-tcp-service-ldntc
Normal SuccessfulCreate 20m daemonset-controller Created pod: svclb-bind-tcp-service-7qpnp
Normal SuccessfulCreate 20m daemonset-controller Created pod: svclb-bind-tcp-service-twlbv
Expected behavior
The svclb DaemonSet should honor the protocol specified in the service definition.
Additional context
To work around the problem, I edited the svclb-bind-udp-service DaemonSet and adjusted its protocol:
kubectl edit daemonsets svclb-bind-udp-service
daemonset.extensions/svclb-bind-udp-service edited
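Concretely, the edit flips the protocol on the host port in the DaemonSet's pod template; a sketch of the relevant fragment after the change (field paths per the core/v1 pod spec):
        ports:
        - containerPort: 53
          hostPort: 53
          protocol: UDP # was TCP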
kubectl describe daemonsets svclb-bind-udp-service
Name: svclb-bind-udp-service
Selector: app=svclb-bind-udp-service
Node-Selector: <none>
Labels: cattle.io/creator=norman
objectset.rio.cattle.io/hash=4578976818dfe0e7bdb871965339b5d38c8eb25f
svccontroller.k3s.cattle.io/nodeselector=false
Annotations: deprecated.daemonset.template.generation: 2
objectset.rio.cattle.io/applied:
H4sIAAAAAAAA/4xU0W7jNhD8lWKfKcWKTpFFoA/F5R6C9hLBdvpyCIIVuY7ZUKRAUuoZBv+9oJImTntO7tG7M6PlzMAHeFRGAodLpN6aNQVggIP6k5xX1gAHHAZ/NhXAoKeAEgMCP4...
objectset.rio.cattle.io/id: svccontroller
objectset.rio.cattle.io/inputid: 678394ca1bd3453caf2576793215aca371babb0b
objectset.rio.cattle.io/owner-gvk: /v1, Kind=Service
objectset.rio.cattle.io/owner-name: bind-udp-service
objectset.rio.cattle.io/owner-namespace: default
Desired Number of Nodes Scheduled: 3
Current Number of Nodes Scheduled: 3
Number of Nodes Scheduled with Up-to-date Pods: 3
Number of Nodes Scheduled with Available Pods: 3
Number of Nodes Misscheduled: 0
Pods Status: 3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=svclb-bind-udp-service
svccontroller.k3s.cattle.io/svcname=bind-udp-service
Containers:
lb-port-53:
Image: rancher/klipper-lb:v0.1.1
Port: 53/UDP
Host Port: 53/UDP <--- looks good now
Environment:
SRC_PORT: 53
DEST_PROTO: UDP
DEST_PORT: 53
DEST_IP: 10.43.29.127
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 29m daemonset-controller Created pod: svclb-bind-udp-service-t5xdm
Normal SuccessfulCreate 29m daemonset-controller Created pod: svclb-bind-udp-service-pzqms
Normal SuccessfulCreate 29m daemonset-controller Created pod: svclb-bind-udp-service-gks4t
Normal SuccessfulDelete 30s daemonset-controller Deleted pod: svclb-bind-udp-service-pzqms
Normal SuccessfulCreate 28s daemonset-controller Created pod: svclb-bind-udp-service-hc25g
Normal SuccessfulDelete 26s daemonset-controller Deleted pod: svclb-bind-udp-service-t5xdm
Normal SuccessfulCreate 17s daemonset-controller Created pod: svclb-bind-udp-service-qlg4l
Normal SuccessfulDelete 15s daemonset-controller Deleted pod: svclb-bind-udp-service-gks4t
Normal SuccessfulCreate 13s daemonset-controller Created pod: svclb-bind-udp-service-gnqvr
As a consequence, the svclb-bind-tcp-service DaemonSet is scheduled and running as expected:
kubectl describe daemonsets svclb-bind-tcp-service
Name: svclb-bind-tcp-service
Selector: app=svclb-bind-tcp-service
Node-Selector: <none>
Labels: cattle.io/creator=norman
objectset.rio.cattle.io/hash=19934d06b31db43bb6d44d6d8985fc1f94c72e7b
svccontroller.k3s.cattle.io/nodeselector=false
Annotations: deprecated.daemonset.template.generation: 1
objectset.rio.cattle.io/applied:
H4sIAAAAAAAA/8yUz27jNhDGX6WYM6VYsR1HBHookj0E7SaC7e1lEQQjchyzoUiBHKlrGHr3gko2cbZxdoH2sEfNn0/D3zfkHh6M0yDhEqnxbkUMArA1f1KIxjuQgG0bT/oCBDTEqJ...
objectset.rio.cattle.io/id: svccontroller
objectset.rio.cattle.io/inputid: 034b2843473970495d58fccde588d07c2df5e80c
objectset.rio.cattle.io/owner-gvk: /v1, Kind=Service
objectset.rio.cattle.io/owner-name: bind-tcp-service
objectset.rio.cattle.io/owner-namespace: default
Desired Number of Nodes Scheduled: 3
Current Number of Nodes Scheduled: 3
Number of Nodes Scheduled with Up-to-date Pods: 3
Number of Nodes Scheduled with Available Pods: 3
Number of Nodes Misscheduled: 0
Pods Status: 3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=svclb-bind-tcp-service
svccontroller.k3s.cattle.io/svcname=bind-tcp-service
Containers:
lb-port-53:
Image: rancher/klipper-lb:v0.1.1
Port: 53/TCP
Host Port: 53/TCP
Environment:
SRC_PORT: 53
DEST_PROTO: TCP
DEST_PORT: 53
DEST_IP: 10.43.40.190
Mounts: <none>
lb-port-953:
Image: rancher/klipper-lb:v0.1.1
Port: 953/TCP
Host Port: 953/TCP
Environment:
SRC_PORT: 953
DEST_PROTO: TCP
DEST_PORT: 953
DEST_IP: 10.43.40.190
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 30m daemonset-controller Created pod: svclb-bind-tcp-service-ldntc
Normal SuccessfulCreate 30m daemonset-controller Created pod: svclb-bind-tcp-service-7qpnp
Normal SuccessfulCreate 30m daemonset-controller Created pod: svclb-bind-tcp-service-twlbv
Starting the server with --no-deploy=servicelb did the job. It may have been a conflict with the bundled service load balancer, but that does not matter since I want to use metallb instead. Closing it for now.
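For reference, disabling the bundled load balancer at startup (using the --no-deploy flag as it existed at the time of this issue):
k3s server --no-deploy=servicelb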
The problem is unfortunately still present for people who wish to use the bundled klipper lb instead of metallb. Could this issue be reopened?
The problem is in https://github.com/rancher/k3s/blob/master/pkg/servicelb/controller.go#L322 ff.
There should be a line passing port.Protocol to the core.ContainerPort.
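For illustration, a minimal sketch in Go of the missing propagation, assuming the controller builds one core.ContainerPort per service port; newContainerPort is a hypothetical helper here, not the actual function in controller.go:

package servicelb

import (
	"fmt"

	core "k8s.io/api/core/v1"
)

// newContainerPort shows the shape of the fix: without an explicit
// Protocol, core.ContainerPort is defaulted to TCP by the API server,
// so the svclb pod for a UDP service still claims the TCP host port
// and collides with a TCP service on the same port number.
func newContainerPort(port core.ServicePort) core.ContainerPort {
	return core.ContainerPort{
		Name:          fmt.Sprintf("lb-port-%d", port.Port),
		ContainerPort: port.Port,
		HostPort:      port.Port,
		Protocol:      port.Protocol, // the missing line: pass UDP/TCP through
	}
}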
Just ran into this as well, thanks for the fix - hopefully it'll be deployed soon.
I am manually editing the service for now :)
Unless I'm mistaken, this should be fixed as part of https://github.com/rancher/k3s/pull/1185
Issue resolved with v1.17.0+k3s.1