Argo-cd: Sync: ssh: handshake failed: EOF

Created on 29 Sep 2020 · 7 Comments · Source: argoproj/argo-cd

If you are trying to resolve an environment-specific issue or have a one-off question about the edge case that does not require a feature then please consider asking a question in argocd slack channel.

Checklist:

  • [x] I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • [x] I've included steps to reproduce the bug.
  • [x] I've pasted the output of argocd version.

Describe the bug

Every 10-15 minutes I get "ssh: handshake failed: EOF" when running a sync. I also see it in the repo server logs when it auto-syncs. It seems more likely to happen when I invalidate the cache on the clusters. I wonder whether Argo CD makes too many requests, which then causes the SSH interruptions.

To Reproduce

ArgoCD server with 100+ apps and 4500+ resources. Running 5 AWS EKS clusters and sourcing repos from Bitbucket.

Expected behavior

Sync without ssh EOF failure.

Version

v1.7.6+b04c25e

Logs

rpc error: code = Unknown desc = ssh: handshake failed: EOF

All 7 comments

I don't know whether Bitbucket rate-limits Git requests via SSH, but with 100+ apps the Git requests can become quite numerous with the default settings (especially the default refresh interval of 3m). The number of Git requests is exported as a metric from the repo-server, so this should give you a good indication of how many there actually are.

You could try to decrease the number of SSH requests by increasing the refresh-check interval to something more sensible, and set up webhooks on your Bitbucket server to refresh your applications whenever they change.
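
As a rough back-of-the-envelope check of how the refresh interval drives request volume, the sketch below assumes, pessimistically, one Git request per application per refresh. The function name `git_requests_per_hour` is made up for illustration, and the real number depends on caching and on how many apps share a repository.

```shell
#!/bin/sh
# Rough upper bound: each app triggers one Git request per refresh interval.
# Usage: git_requests_per_hour <number_of_apps> <refresh_interval_seconds>
git_requests_per_hour() {
  apps=$1
  interval=$2
  echo $(( apps * 3600 / interval ))
}

git_requests_per_hour 100 180   # 100 apps, default 3m refresh -> 2000
git_requests_per_hour 100 600   # same apps, 10m refresh       -> 600
```

Under that assumption, moving from a 3m to a 10m refresh cuts the estimate from 2000 to 600 requests per hour, which is why a longer interval combined with webhooks is worth trying.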

@jannfis thanks, I'll also raise a ticket with Bitbucket about this and post the details here if any are relevant to ArgoCD.

So far, I've run a continuous git clone with verbose output on the same instance the ArgoCD repo server runs on, but even when the ssh EOF error showed up in ArgoCD, I saw no interruptions in the ssh git clone.
I ran it as follows to catch any issues, but no luck:

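# Clone repeatedly with Git and SSH tracing enabled; stop at the first failed clone.
while true; do
  GIT_TRACE_PACKET=1 GIT_TRACE=1 GIT_SSH_COMMAND="ssh -v" GIT_CURL_VERBOSE=1 git clone [email protected]:org/repo.git || break
  # Remove the working copy so the next iteration clones from scratch.
  rm -rf repo
done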

The hourly clone limit for Bitbucket is 30,000, and judging by the repo-server metrics I am nowhere near hitting it.

> The hourly clone limit for Bitbucket is 30,000, and judging by the repo-server metrics I am nowhere near hitting it.

You should also take a look at the ls-remote metrics from the repo-server, because I imagine they count toward Bitbucket's limits as well.

However, an EOF error with SSH almost always indicates that the peer (or some intermediate component in between) has abruptly dropped the connection. Maybe this could also be related to #3994.
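
To see how many Git requests the repo-server really makes, one option is to scrape its metrics endpoint and add up the counters per request type. The sketch below assumes the metric is named `argocd_git_request_total` and carries a `request_type` label (`fetch` or `ls-remote`), and that the repo-server metrics are reachable on port 8084, for instance through a port-forward; check both against your version. The helper name `sum_git_requests` is made up for illustration.

```shell
#!/bin/sh
# Sum argocd_git_request_total samples per request_type.
# Reads Prometheus text-format metrics on stdin, prints "<type> <total>" lines.
sum_git_requests() {
  awk '
    /^argocd_git_request_total[{]/ {
      # Pull the request_type label value out of the label set.
      if (match($0, /request_type="[^"]*"/)) {
        type = substr($0, RSTART + 14, RLENGTH - 15)
        total[type] += $NF   # the sample value is the last field
      }
    }
    END { for (t in total) printf "%s %d\n", t, total[t] }
  ' | sort
}

# Assumed usage against a port-forwarded repo-server:
#   curl -s http://localhost:8084/metrics | sum_git_requests
```

Comparing two readings taken an hour apart gives the hourly rate for clones/fetches and for ls-remote calls separately, which can then be set against the provider's documented limits.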

Same problem with GitLab.
GitLab says the rate limit for downloading archives is 5 requests per minute per user, whether through the UI or the API.

Hi, I run ArgoCD 1.7.5, but I don't see any SSH-related metrics on either argocd-metrics or argocd-server-metrics.

I found them in argocd-repo-server, which is not enabled in the kustomize installation.

We have the same issue with ArgoCD and GitLab (FOSS, on-premises).
