Does this involve preserving the visibility of the Source IP of the TCP connection that hits Pod which is behind a Service with an external IP address? Similar to what's discussed in https://github.com/kubernetes/kubernetes/issues/10921 ?
Yes, this is exactly that feature, sorry I haven't had time to convert the design proposal from Google doc into Markdown.
@girishkalele great to know. Can I ask a couple of questions for clarification?
We're operating Kubernetes for non-HTTP traffic (gRPC and plain TCP) on GCE using L3 load balancing (Network LB). We point our L3 LBs at a set of Frontend Instances running kube-proxy, and we rely on GCE network routes (programmed through Flannel) to direct the traffic to the Pod's IP address on a Worker Instance.
Ideally, what we'd like is for the kube-proxy on the Frontend Instance to substitute the DST IP and port with the Pod's IP and port using DNAT. The GCE network/Flannel routing would then do its magic and the traffic would end up on the Worker Instance at the Pod IP:port. Once our service running in the Pod responds, the kube-proxy on the Worker Instance would SNAT-rewrite the SRC IP and port (which is the private Pod IP:port) to be the IP:port of the external service and drop it on the GCE network for egress. This would mean we get Direct Server Return (DSR) for free.
Is that roughly what you guys are thinking of doing?
@mwitkow
No...what we are planning is to eliminate the double hop by using GCE load balancer health checks. Instances/VMs that host endpoints in that service will respond healthy, and all other instances will respond unhealthy - thus, the GCE LB will split traffic only across nodes that host the service in local pods.
Does that mean that Source IP won't be preserved on non-GCE cloud providers?
Sorry, I should have said "Cloud Load Balancer" instead of GCE.
Source IP preservation will also be possible on other cloud providers; we just need to program the health checks using the provider's API. This will be an update in pkg/cloudprovider/.
Note:
The new feature will be opt-in by adding an annotation in the service spec; the old LB behavior will still be the default. For cloud provider packages that are not updated, the annotation will be ignored. This will allow the contributors to catch up in subsequent releases without requiring a hard change-over for all providers in one release.
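For illustration, a minimal sketch of what the opt-in might look like in a Service spec, using the alpha annotation key that appears later in this thread (the service name, selector, and port are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app                # placeholder name
  annotations:
    # Opt in to source IP preservation; without this annotation the old
    # LB behavior remains the default.
    service.alpha.kubernetes.io/external-traffic: "OnlyLocal"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
```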
@girishkalele ok, that makes sense. But that means that the MASQUERADE step needs to go away (and be replaced with SNAT+DNAT, I presume?), otherwise the service in the Pod won't see the SRC IP.
This is very close to what I am working on for node-local services, in that kube-proxy will have to go from a node-agnostic approach, where its iptables rules look the same on all nodes, to a node-aware one, where the rules and behaviour are tailored to each host.
@girishkalele, how do you plan on plumbing the node's IP through? It can be tricky on some cloud providers, namely AWS. And speaking of AWS, if you talk HTTP and X-Forwarded-For works for you, then you can already get the original source IP through the L7 annotations I implemented: https://github.com/kubernetes/kubernetes.github.io/blob/a135f0a717803b35ec80563768d5abb4b45ac4d1/docs/user-guide/services/index.md#ssl-support-on-aws
That said, removing extra hops is a great thing.
@mwitkow
Yes, we eliminate the SNAT (and masquerade rules shouldn't be hit for traffic with external source IPs).
@therc
I didn't need to pass the node IP to kube-proxy for this; at least on GCE, none of the iptables rules need it for my purposes. Did you have a need for it for the local-service implementation?
I see the following note in that user guide:
TCP and SSL will select layer 4 proxying: the ELB will forward traffic without modifying the headers.
The AWS LB looks like it will do L7 balancing with the source IP transferred to the headers, but for generic TCP services it wouldn't work. BTW, does it do WebSockets and HTTP?
Please have these discussions on the design proposal. Let's try to keep the feature repo meta.
Update: code and docs merged; the code is activated only with the alpha feature-gate flag.
E2Es cannot run until the E2E Jenkins job with the feature gate ON is ready (WIP).
@girishkalele Please add docs PR numbers in the issue description
Ping. Can you add the docs PR to the issue description?
@girishkalele the docs PR question is still open; please add the links.
CCed kubernetes/docs in the Docs PR and also added to tracking spreadsheet.
@girishkalele FYI we've encountered an issue with the iptables rules used for external load balancing: https://github.com/kubernetes/kubernetes/issues/33081
...this is so exciting. thank you all for your hard work :-) making our lives easier
@kubernetes/sig-network @bprashanth can you clarify the actual status of the feature? It's expected for GCP in beta status; are we expecting it for Kubernetes itself?
@dgomezs may I ask you to create an issue in kubernetes/kubernetes repo? This repo is for managing and tracking the features, the actual coding and bug fixing work happens in that repo.
@idvoretskyi Done.
Is this currently available on GCP in alpha? If so, how is it enabled? We spun up an alpha cluster today and the real IP is still being replaced with a k8s IP. We used the "service.alpha.kubernetes.io/external-traffic": "OnlyLocal" annotation per http://kubernetes.io/docs/user-guide/load-balancer/#loss-of-client-source-ip-for-external-traffic with no luck.
@bprashanth can you update the actual status of the feature?
@bgabler As far as I understand, you cannot enable alpha features. You need an alpha cluster.
Ping. Is there a docs PR for this?
Not yet, we have till the end of the week, right (isn't the release only Dec 7-ish)?
As of 1.4 it's alpha
As of 1.5 it's beta (same annotation, s/alpha/beta)
You had until yesterday to write a docs PR :) Can you get me one today? Here's a link to the schedule: https://github.com/kubernetes/features/blob/master/release-1.5/release-1.5.md
@girishkalele can you confirm that this item targets stable in 1.6?
Girish is no longer involved in this - @bprashanth has taken responsibility for it.
This is a huge issue for us. Any update on when we can expect this? In the meantime we're running NGINX on compute to add a custom header to the request as it passes through to a k8s service.
This is alpha in 1.4, and beta in 1.5. Go nuts! :)
@thockin does this only apply to k8s on GCE?
It should work on GCE and Azure for LB VIPs, and it should work on any platform for NodePorts (I think; @bprashanth to confirm).
I am on GKE with a cluster updated to 1.5.1 in euw1d; the forwarded proto and IP work, but the port is 5xxxx instead of 80/443 (in my case).
I have nginx as a proxy. Maybe I need to use a different variable than $remote_port?
@ivanjaros You'll have to be more explicit about what you're hoping to achieve, what you did, and what happened. I don't know how to interpret your message.
I have nginx as proxy. Maybe I need to use different variable than $remote_port?
$server_port
That worked perfectly. Thanks aledbf.
Is it enough to add the "service.beta.kubernetes.io/external-traffic": "OnlyLocal" annotation for services on a 1.5.1 cluster running on bare-metal CentOS? It doesn't seem to work for me; it still sees the cluster IP.
That annotation currently works only in GKE.
What leads you to believe it only works on GKE? There's nothing at all GKE specific in this code path.
@rxwen yes, setting the annotation should be sufficient. Can you tell me whether, after that is set, you get another annotation set (automatically) with the "healthcheck nodeport"?
http://kubernetes.io/docs/user-guide/load-balancer/#loss-of-client-source-ip-for-external-traffic
"However, starting in v1.5, an optional beta feature has been added that will preserve the client Source IP for GCE/GKE environments. This feature will be phased in for other cloud providers in subsequent releases."
I don't know. As a matter of fact, after I saw your previous comment, I believe it should work on a bare-metal cluster with a NodePort-type service (though @bprashanth hasn't confirmed yet).
If I understand correctly, after I set this annotation for a service, in addition to this annotation a new "healthcheck nodeport" annotation should be applied to the service as well?
I'll check it and test again.
Key there is "GCE/GKE" not just GKE :)
I don't know. As a matter of fact, after I saw your previous comment, I believe it should work on a bare-metal cluster, nodeports type service (Though @bprashanth hasn't yet confirmed) .
on BareMetal you have the pieces you need to make it work, mostly (I think the nodeport healthchecking is wrong, @bprashanth we should fix that ASAP for 1.6 tree), but you have to glue it together yourself (since no cloud provider module)
If I understand correctly, after I set this annotation for a service, in addition to this annotation a new "healthcheck nodeport" annotation should be applied to the service as well?
I'll check it and test again.
Correct. That annotation is a signal that your annotation was received and handled.
on BareMetal you have the pieces you need to make it work, mostly (I think the nodeport healthchecking is wrong, @bprashanth we should fix that ASAP for 1.6 tree), but you have to glue it together yourself (since no cloud provider module)
Does that mean that on a 1.5.1 cluster, in addition to setting the annotation for a service, I need to make other configuration changes to make it work? If so, is there any guidance on what other configuration I should make?
On GCE it should just work.
On BareMetal, you need to configure the load balancer to use the new healthcheck-nodeport when routing traffic toward VMs, so it will not send traffic to a VM with no backends.
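For reference, a sketch of what such a Service can look like once the opt-in has been handled; the healthcheck nodeport annotation key and port value shown here are assumptions based on this thread (the value is allocated automatically, not set by hand):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app                                                  # placeholder
  annotations:
    # Set by you to opt in.
    service.beta.kubernetes.io/external-traffic: "OnlyLocal"
    # Expected to be populated automatically once the opt-in is handled;
    # the key name and port number here are illustrative assumptions.
    service.beta.kubernetes.io/healthcheck-nodeport: "31313"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
```

On bare metal the idea is then to point your external load balancer's health check at GET /healthz on that nodeport on every node, so that only nodes with local endpoints receive traffic.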
Does this feature have the nice side effect of fixing the lack of node health checks in GCE L4 LBs created by "type: LoadBalancer" service definitions? This obviously causes service interruptions when upgrading nodes, because the L4 LB keeps sending traffic to down nodes in the absence of any health check.
Is enabling source IP preservation, even when you don't really care about the source IP, the recommended solution to this problem? See my other comment here: https://github.com/kubernetes/kubernetes/issues/32827#issuecomment-271662756
Status update: I'm working on moving this feature to GA in 1.6.
We've tried enabling on GKE, no luck
@bgabler You should be able to use this feature with v1.5 on GKE, or v1.4 with Alpha Cluster. What issue did you encounter enabling this?
Just to confirm, we add the following annotation to a loadbalancer:
service.beta.kubernetes.io/external-traffic: "OnlyLocal"
Example:
```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    name: foobar
    app: foobar
    release: "foobar"
    heritage: "foobar"
  name: "foobar"
  annotations:
    service.beta.kubernetes.io/external-traffic: "OnlyLocal"
  namespace: "foobar"
spec:
  clusterIP:
  ports:
  - name: app
    port: 443
  selector:
    name: "foobar"
  sessionAffinity: None
  type: LoadBalancer
```
@bgabler As this might not be the right place to discuss, could you please open an issue regarding this on kubernetes/kubernetes and provide more details and ping me? Thanks!
Are you on kube v1.5 to use the "beta" annotation?
Not sure what changed, but it's actually working now. We did notice some weird HTTP/2 issues, so we had to tell our Tyk gateway to turn off HTTP/2.
Is this feature supported on AWS?
I tried adding the proposed annotation to the service but it didn't work! I have created a repo with config to create a minimalistic cluster to replicate my issue. (With the same config it's working in GKE just like that.)
@Phanindra48 I'm afraid it is not fully supported on AWS yet. Please use kubernetes/kubernetes#35758 and kubernetes/kubernetes#35187 for tracking.
Oh! Thanks for the info @MrHohn. Is there any workaround for making it work on AWS, as our application depends on the source IP?
I have created a repo that contains config to create a cluster in AWS (our cluster is almost identical to that).
AWS's ELB does not preserve the client IP at all. Whether Kubernetes saves it or not won't matter much.
The Classic AWS ELBs do preserve the remote IP in the way that L7 LBs are expected to, using X-Forwarded-For; see: https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/x-forwarded-headers.html
@Phanindra48 what is your application? Is it HTTP? Because on AWS you have (at least) 3 choices for preserving the source IP:
a) Enable proxy protocol on the ELB using an annotation (any protocol, but requires you to parse the proxy protocol header)
b) Set the ELB into layer 7 mode using an annotation (HTTP/HTTPS only)
c) Use ingress with the nginx controller, enabling proxy protocol and telling nginx to process the proxy protocol for you
I like option c best. Option b works well also, but you are dealing with the limitations of ELB HTTP support at that point. Option a works with non-HTTP/HTTPS, but requires code changes.
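A minimal sketch of option (c), assuming the nginx ingress controller and the AWS proxy-protocol annotation; the resource names are placeholders, and the annotation/ConfigMap keys should be checked against the controller and Kubernetes versions you run:

```yaml
# Service fronting the nginx ingress controller (names are placeholders).
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-lb
  annotations:
    # Ask the AWS cloud provider to enable PROXY protocol on the ELB backends.
    service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
spec:
  type: LoadBalancer
  selector:
    app: nginx-ingress-controller
  ports:
  - name: http
    port: 80
  - name: https
    port: 443
---
# ConfigMap consumed by the nginx ingress controller: tells nginx to parse
# the PROXY protocol header so backends see the original client IP.
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-ingress-controller-conf
data:
  use-proxy-protocol: "true"
```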
Hi @justinsb, we have a microservices architecture with around 10 services running on NodePort behind an nginx-ingress controller of type LoadBalancer (our only external endpoint), and we have an ingress with all the rules to route traffic to the different services.
I tried adding the annotation as shown here https://github.com/Phanindra48/k8s-source-ip-aws/blob/master/nginx-controller.yaml, but it didn't work. It actually removed the instances from the LoadBalancer (they couldn't register because of failing health checks).
So, as I'm using ingress with the nginx controller, option (c) seems like the best approach. Could you please tell me what I should add to the nginx ConfigMap to make it work?
Sorry if I may ask a dumb question: I'm trying to preserve the external IP of the client while accessing a service in my cluster using NodePort. So I set up an nginx pod, configured a service of type NodePort, and set the above-mentioned "OnlyLocal" annotation. As long as the service is of type NodePort, no "healthcheck nodeport" is generated. But I can see the iptables rules dropping the traffic if no pod is on the local node. So far, so good. But if I access the NodePort from outside on the right node, I still see the MASQ address in the NGINX logs.
So I tried to set up the service as type LoadBalancer (but without having a load balancer in place). Then I get the "healthcheck nodeport" annotation right away. I can curl that port and /healthz on the node with local backends and get a count >0. But if I access the opened port from outside, I still see the MASQ address in the NGINX logs.
Did I get something wrong? Shouldn't the IP address in this case be the external address of the client?
I tried this on a 1.5 Kubernetes cluster, and I also reinstalled the whole cluster using the recent 1.6.1 Kubernetes .deb on Debian Jessie, with the same result. Could this be related to the Weave network overlay I use? I installed Weave 1.6 as per the documentation. Is this feature not supported with Weave? Did I get something else wrong? Did I get it wrong at all?
Thank you in advance for any help!
@tomte76 There wouldn't be healthcheck nodeport assigned to OnlyLocal NodePort service. This is the expected behavior.
For the specific issue you encountered (seeing the MASQ address in logs), it would be great if you could provide more details: which cloud provider, what commands you ran, the iptables-save output on the node, and tcpdump results.
Could you please open a new issue on kubernetes/kubernetes and ping me so that we can follow up there? This feature issue is not the right place for bug reports.
Actually I have to admit that I don't know how to "ping" :) I opened issue #44963 ("Unexpected behaviour using "OnlyLocal" annotation on NodePort with 1.6.1 and Weave-Net 1.6") and provided some information. If you need anything more, please let me know.
@MrHohn Should this be listed for GA in 1.7?
@cmluciano Yes, kubernetes/kubernetes#41162 is out for this purpose.
/assign
Trying to get this feature working on v1.7.0-alpha.2 using flanneld with --ipmasq and kube-proxy in iptables mode. I want to receive traffic on a node directly and preserve the source IP as it is routed to the ingress controller. I've tried using a NodePort service with the service.beta.kubernetes.io/external-traffic: OnlyLocal annotation, but echoheaders is still giving me 10.100.35.1 as the real IP (where 10.100.35.x is the flannel/docker subnet for the node). Is it possible to set this up without a cloud provider or a service of type LoadBalancer?
@Malet This source IP preservation feature has issues with some overlay network plugins (like Flannel and Weave); please see kubernetes/kubernetes#44963 for details.
@MrHohn @bprashanth @kubernetes/sig-network-feature-requests please, update the feature description to fit the new template.
@idvoretskyi Could you please copy my previous post onto the top one? I don't yet have the privilege to edit post. Thanks!
@MrHohn thanks!
Quick update on this feature:
I'm now working on updating the existing docs to reference the new first class fields instead of beta annotations.
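For context, a sketch of a Service using the first-class fields instead of the beta annotations; the externalTrafficPolicy field name appears later in this thread, while healthCheckNodePort is my assumption for the corresponding first-class field (it is normally allocated automatically, so it is only shown as a comment):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app                  # placeholder name
spec:
  type: LoadBalancer
  # First-class replacement for the external-traffic annotation.
  externalTrafficPolicy: Local
  # Allocated automatically for Local policy on LoadBalancer services;
  # the field name and value below are illustrative assumptions.
  # healthCheckNodePort: 31313
  selector:
    app: my-app
  ports:
  - port: 80
```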
@MrHohn thanks
I've tried running with NodePort and the OnlyLocal annotation (bare metal/flannel) but receive no traffic on the local pod. When the annotation is applied to the service, packets are not marked for masquerading in iptables but are still always dropped by a rule that states "servicexyz has no local endpoints". This is not the case, however; the service does indeed have local endpoints. My guess is that the health check is failing. Is this likely the case, and if so, what do I need to do so that the health check succeeds?
@liggetm Please open an issue for this on the main kubernetes repository.
@cmluciano will do.
@idvoretskyi This feature is complete for 1.8. Is there any more to do with this issue, or do we just close it?
This feature was promoted to GA in 1.7 and the release has been shipped. I believe we can close this.
/close
I've noticed that on GKE, running a NodePort service + Ingress, this feature causes the backend to be marked as "UNHEALTHY" when I inspect the ingress with kubectl describe ing ingress (presumably because most nodes are failing the health check). Are there any negative side effects to this, or is it safe to ignore?
@unclewizard Is your GET / returning 200 OK for the GCP health check?
@chy168 Yep - here is a simple way to reproduce:
Follow the instructions here: https://cloud.google.com/container-engine/docs/tutorials/http-balancer but when you get to Step 2, use the following yaml to create the nginx service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  annotations:
    service.beta.kubernetes.io/external-traffic: "OnlyLocal"
spec:
  selector:
    run: nginx
  ports:
  - name: nginx
    port: 80
  type: NodePort
```
After you create the Ingress service in Step 3, wait about 10 minutes and then do a kubectl describe ing basic-ingress. The output will contain something like:
...
backends: {"k8s-be-30910--7b4223ab4c1af15d":"UNHEALTHY"}
If you visit the Ingress address you'll see that the page loads fine. I suspect the UNHEALTHY message doesn't actually matter but I wanted to make sure this is expected.
@unclewizard That is expected. When externalTrafficPolicy is set to Local, nodes that don't have backend pods running on them will intentionally fail the LB health check.
Though the backend service could contain multiple backends, the backends annotation only shows the healthiness of one of them, which seems incorrect. Opened https://github.com/kubernetes/ingress/issues/1395.
That is expected. When externalTrafficPolicy is set to Local, nodes that don't have backend pods running on them will intentionally fail the LB health check.
Sorry, I said it wrong. We intentionally fail the health check for the L4 LB but not L7. The backend service shouldn't show as "UNHEALTHY".
@unclewizard Could you open a separate issue on https://github.com/kubernetes/ingress? Thanks.
Sorry for spamming, please ignore what I said on previous post. I need some sleep :(