When I start kubectl port-forward svc/leeroy-app 50053:50051 it works the first time.
If I kill the pod behind the service, Kubernetes restarts the pod, and then port forwarding starts failing:
Handling connection for 50053
Handling connection for 50053
E0722 16:21:00.929687 155541 portforward.go:340] error creating error stream for port 50053 -> 50051: Timeout occured
E0722 16:21:00.969972 155541 portforward.go:362] error creating forwarding stream for port 50053 -> 50051: Timeout occured
E0722 16:21:02.989783 155541 portforward.go:362] error creating forwarding stream for port 50053 -> 50051: Timeout occured
E0722 16:21:03.998054 155541 portforward.go:362] error creating forwarding stream for port 50053 -> 50051: Timeout occured
E0722 16:21:04.598329 155541 portforward.go:340] error creating error stream for port 50053 -> 50051: Timeout occured
E0722 16:21:05.577799 155541 portforward.go:362] error creating forwarding stream for port 50053 -> 50051: Timeout occured
Handling connection for 50053
E0722 16:21:06.166770 155541 portforward.go:362] error creating forwarding stream for port 50053 -> 50051: Timeout occured
E0722 16:21:35.578937 155541 portforward.go:340] error creating error stream for port 50053 -> 50051: Timeout occured
Handling connection for 50053
Handling connection for 50053
E0722 16:21:40.688533 155541 portforward.go:400] an error occurred forwarding 50053 -> 50051: error forwarding port 50051 to pod 6b8250b5be8d3e65ed5d9c900cb87966bed006b57cc81617d27b6ba271742815, uid : Error: No such container: 6b8250b5be8d3e65ed5d9c900cb87966bed006b57cc81617d27b6ba271742815
E0722 16:22:10.606373 155541 portforward.go:340] error creating error stream for port 50053 -> 50051: Timeout occured
Handling connection for 50053
Handling connection for 50053
E0722 16:22:40.712581 155541 portforward.go:340] error creating error stream for port 50053 -> 50051: Timeout occured
E0722 16:22:40.712668 155541 portforward.go:340] error creating error stream for port 50053 -> 50051: Timeout occured
If I manually kill kubectl port-forward and restart it, it works.
I would love to see it recover automatically instead of having to parse the output and restart manually.
We are building port forwarding into our application through kubectl, and this would help a lot with the integration.
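Until kubectl recovers on its own, one workaround is a small wrapper that watches the port-forward output and restarts the process when it looks stuck. Below is a minimal Go sketch of just the detection part, assuming the error messages stay as quoted in the log above. `brokenMarkers`, `looksBroken`, and `shouldRestart` are hypothetical names, not part of kubectl or client-go, and the threshold is arbitrary.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// brokenMarkers are substrings copied from the error lines quoted in this
// issue. Assumption: kubectl keeps emitting these exact messages.
var brokenMarkers = []string{
	"error creating error stream for port",
	"error creating forwarding stream for port",
	"an error occurred forwarding",
}

// looksBroken reports whether a single output line matches one of the
// known error messages.
func looksBroken(line string) bool {
	for _, m := range brokenMarkers {
		if strings.Contains(line, m) {
			return true
		}
	}
	return false
}

// shouldRestart scans captured output line by line and returns true once
// at least `threshold` lines look like forwarding errors.
func shouldRestart(output string, threshold int) bool {
	count := 0
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		if looksBroken(sc.Text()) {
			count++
		}
	}
	return count >= threshold
}

func main() {
	// Sample lines taken from the log in this issue.
	sample := "Handling connection for 50053\n" +
		"E0722 16:21:00.929687 155541 portforward.go:340] error creating error stream for port 50053 -> 50051: Timeout occured\n" +
		"E0722 16:21:00.969972 155541 portforward.go:362] error creating forwarding stream for port 50053 -> 50051: Timeout occured\n"
	fmt.Println("restart needed:", shouldRestart(sample, 2))
}
```

A wrapper could start `kubectl port-forward` with `os/exec`, feed its stderr to this check, and kill and restart the process once it returns true. Matching on log text is brittle: these messages can also show up for transient failures and may change between kubectl versions.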
I also see this behavior: my port forwards start failing after I restart the pod that was being forwarded to.
Is there any suggested way to implement this automatic recovery?
I was taking a look at this a little bit today and I think this is a legitimate issue.
The problem seems to be that port forwarding enters some sort of unrecoverable state after it is no longer able to communicate with the pod it was connected to, and yet it does not fail with an exit code either.
terminal 1
kubectl run sysinfo --image=brianpursley/system-info
terminal 2
kubectl port-forward sysinfo 8080:80
Open a browser or curl to make some requests to http://localhost:8080 and verify that port forwarding is working
terminal 1
kubectl delete pod sysinfo
Open a browser or curl to make some requests to http://localhost:8080 and verify that port forwarding is no longer working
terminal 2
You will see some errors like these:
Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
E1002 15:12:34.808176 125749 portforward.go:400] an error occurred forwarding 8080 -> 80: error forwarding port 80 to pod e2cb7d04631d95df43a87ad38952a027074a146da9ff85c43866c4e2b2806009, uid : exit status 1: 2020/10/02 15:12:34 socat[2905824] E connect(5, AF=2 127.0.0.1:80, 16): Connection refused
Handling connection for 8080
E1002 15:12:34.822191 125749 portforward.go:400] an error occurred forwarding 8080 -> 80: error forwarding port 80 to pod e2cb7d04631d95df43a87ad38952a027074a146da9ff85c43866c4e2b2806009, uid : exit status 1: 2020/10/02 15:12:34 socat[2905825] E connect(5, AF=2 127.0.0.1:80, 16): Connection refused
Handling connection for 8080
E1002 15:12:34.835750 125749 portforward.go:400] an error occurred forwarding 8080 -> 80: error forwarding port 80 to pod e2cb7d04631d95df43a87ad38952a027074a146da9ff85c43866c4e2b2806009, uid : exit status 1: 2020/10/02 15:12:34 socat[2905826] E connect(5, AF=2 127.0.0.1:80, 16): Connection refused
The problem is that kubectl port-forward never exits, and even if I do kubectl run sysinfo --image=brianpursley/system-info it is not able to reestablish a connection, so it is sort of stuck in some invalid state.
NOTE: My example above is for a single pod, but you can also port-forward to a service or deployment, in which case it selects a single pod within the deployment and forwards to that pod only. You can follow similar steps to reproduce the issue with a deployment, but you have to find the pod it is connected to and delete that pod to see the effect.
/remove-lifecycle rotten
Let me try to reproduce this report and work on it.
/assign
Hey @soltysh, I am wondering if we can discuss this one in the SIG meeting. Would os.Exit(1) be enough for this one? I just tested a local patch and it works.
/priority backlog
/kind bug
@dougsland just open a PR and please ping me on Slack with it; I'll review.
Spoke on Slack. We don't exit from library code. The library code in client-go starts a server to forward requests, so its behavior is like ListenAndServe. We don't expect it to exit on failures, the same way we don't expect an HTTP server to exit on errors. Anything that gets added would need coordination in a higher layer of logic in kubectl.
If someone actively pursues it, I think it will be important to write down the conditions for behavior changes in kubectl and then provide a way to expose the information from the port-forwarding server back to kubectl. I don't expect it to be a small fix, since there are many different reasons for failures.