**Describe the bug**
We are experiencing problems with the argo-cd repo server. The application opens HTTPS connections to our git server and does not close them as expected. We have now traced the problem to the worker node where the pod is running. By attaching to the $pid of the pod we can watch the number of open sockets rise continuously.
watch "ls -lah /proc/$pid/fd/ | wc"
Using nsenter we can also get a more detailed overview of the open connections. Here is the output when we run netstat in the network namespace of the argo-repo-server process:
nsenter -t 3659754 -n netstat -pantu
```
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 172.27.4.81:51880 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:60360 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:45428 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:36380 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:34510 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:58658 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:36830 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:35722 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:55286 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:40218 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:52948 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:45712 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp 0 0 172.27.4.81:59698 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
[....... 14k times ......]
tcp 0 0 172.27.4.81:38964 censored-by-our-it-sec:443 ESTABLISHED 3659754/argocd-repo
tcp6 0 0 :::8081 :::* LISTEN 3659754/argocd-repo
tcp6 0 0 :::8084 :::* LISTEN 3659754/argocd-repo
tcp6 0 0 172.27.4.81:8081 172.25.2.52:44604 TIME_WAIT -
```
**To Reproduce**
- Connect an Argo CD application to a repo using HTTPS
- Watch the TCP connections for the $pid of the repo-server process (nsenter -t $pid -n netstat -pantu)
**Expected behavior**
The repo server opens connections to the git repository and closes the connections after processing.
**Version**
```shell
argocd: v1.5.2+c2c19f4
BuildDate: 2020-04-15T16:41:59Z
GitCommit: c2c19f42ad78ed7a6fb70e86aed117be484feb50
GitTreeState: clean
GoVersion: go1.14
Compiler: gc
Platform: linux/amd64
argocd-server: v1.5.2+c2c19f4
BuildDate: 2020-04-15T16:43:12Z
GitCommit: c2c19f42ad78ed7a6fb70e86aed117be484feb50
GitTreeState: clean
GoVersion: go1.14
Compiler: gc
Platform: linux/amd64
Ksonnet Version: v0.13.1
Kustomize Version: Version: {Version:kustomize/v3.2.1 GitCommit:d89b448c745937f0cf1936162f26a5aac688f840 BuildDate:2019-09-27T00:10:52Z GoOs:linux GoArch:amd64}
Helm Version: version.BuildInfo{Version:"v3.1.1", GitCommit:"afe70585407b420d0097d07b21c47dc511525ac8", GitTreeState:"clean", GoVersion:"go1.13.8"}
Kubectl Version: v1.14.0
```
**Our Workaround**
We switched to SSH instead of HTTPS. Using SSH, the connections to our git server are closed correctly, as we would expect.
Maybe the open connections are caused by the "customHTTPClient" from https://github.com/argoproj/argo-cd/blob/9e81c38c13be708cb7f1d280ae93ddeb59131305/util/git/client.go
Thanks for your brilliant bug report, @Jaydee94!
I could reproduce the issue reliably and can confirm it. The repo server is indeed leaking TCP connections for HTTPS repository requests. I'd consider this bug as severe.
I think I found the root cause, and it was indeed the custom HTTP client implementation that was keeping HTTP connections alive.
I just submitted a PR with the fix (#3531) - with this fix applied, TCP connections to Git repositories connected via HTTPS should be closed correctly and in time now, at least they do in my reproduction environment.
Hi,
as we are also running into this issue, I was just curious when this bugfix will be released? Thanks for the great work!
@szEvEz As far as I know, the bugfix is already part of the newest release, 1.5.4.
@szEvEz @Jaydee94 Sorry, this patch didn't make it into the 1.5.x branch yet. We wanted to give it a thorough test before merging it into one of the fix releases.
@alexmt what is the plan for this fix? Could we collect a few metrics on latency or performance issues?
@jannfis @Jaydee94 Thanks for reaching out so quickly! And really no need to apologise, just wanted to check the current status :) . Thanks