Describe the bug
The context times out in getConnectionState, where the client timeout is hard-coded:
config := cluster.RESTConfig()
config.Timeout = time.Second
The hard-coded 1-second timeout is too short: getConnectionState succeeds only sporadically.
To Reproduce
Argo CD runs in an EKS cluster in eu-central-1 and checks the state of clusters in sa-east-1 and ap-northeast-1. Each cluster uses aws-iam-authenticator.
Expected behavior
Higher or configurable timeout.
Version
argocd: v1.3.6+89be1c9
BuildDate: 2019-12-10T22:46:45Z
GitCommit: 89be1c9ce6db0f727c81277c1cfdfb1e385bf248
GitTreeState: clean
GoVersion: go1.12.6
Compiler: gc
Platform: linux/amd64
argocd-server: v1.3.6+89be1c9
BuildDate: 2019-12-10T22:47:48Z
GitCommit: 89be1c9ce6db0f727c81277c1cfdfb1e385bf248
GitTreeState: clean
GoVersion: go1.12.6
Compiler: gc
Platform: linux/amd64
Ksonnet Version: v0.13.1
Kustomize Version: Version: {Version:kustomize/v3.2.1 GitCommit:d89b448c745937f0cf1936162f26a5aac688f840 BuildDate:2019-09-27T00:10:52Z GoOs:linux GoArch:amd64}
Helm Version: v2.15.2
Kubectl Version: v1.14.0
Logs
$ argocd cluster list
SERVER NAME VERSION STATUS MESSAGE
https://xxxxxxx.yl4.sa-east-1.eks.amazonaws.com eks-dev2-fm-sa-east-1 Failed Unable to connect to cluster: Get https://xxxxxxx.yl4.sa-east-1.eks.amazonaws.com/version?timeout=1s: context deadline exceeded
https://yyyyyyy.yl4.ap-northeast-1.eks.amazonaws.com eks-dev3-fm-ap-northeast-1 Failed Unable to connect to cluster: Get https://yyyyyyy.yl4.ap-northeast-1.eks.amazonaws.com/version?timeout=1s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Same issue here. The status is reported as Failed, but the connection and deployments are actually working.
This is a good chance to also fix https://github.com/argoproj/argo-cd/issues/2764 and https://github.com/argoproj/argo-cd/issues/1885.
This should be easier to fix after the 1.4 release. We've introduced a Prometheus metrics collector that periodically collects and exports information about managed clusters.
The same code can save cluster information in the annotations of the secret that stores the cluster credentials. The UI can then use that information to show an accurate connectivity status and more stats (number of resources, events, etc.).
I'm also having trouble with this. Please fix it.