I recently added the Istio charts, and I already have Prometheus, the cluster autoscaler, External Secrets, Flux, and Fluentd CloudWatch installed, so roughly 30 pods in total.
2020/05/14 12:15:14 [TRACE] dag/walk: vertex "provider.helm (close)" is waiting for "module.eks_addons.helm_release.cluster_autoscaler"
2020/05/14 12:15:19 [TRACE] dag/walk: vertex "root" is waiting for "provider.helm (close)"
2020/05/14 12:15:19 [TRACE] dag/walk: vertex "provider.helm (close)" is waiting for "module.eks_addons.helm_release.cluster_autoscaler"
2020/05/14 12:15:24 [TRACE] dag/walk: vertex "root" is waiting for "provider.helm (close)"
2020/05/14 12:15:24 [TRACE] dag/walk: vertex "provider.helm (close)" is waiting for "module.eks_addons.helm_release.cluster_autoscaler"
2020/05/14 12:15:29 [TRACE] dag/walk: vertex "root" is waiting for "provider.helm (close)"
2020/05/14 12:15:29 [TRACE] dag/walk: vertex "provider.helm (close)" is waiting for "module.eks_addons.helm_release.cluster_autoscaler"
2020/05/14 12:15:34 [TRACE] dag/walk: vertex "root" is waiting for "provider.helm (close)"
2020/05/14 12:15:34 [TRACE] dag/walk: vertex "provider.helm (close)" is waiting for "module.eks_addons.helm_release.cluster_autoscaler"
2020/05/14 12:15:39 [TRACE] dag/walk: vertex "root" is waiting for "provider.helm (close)"
2020/05/14 12:15:39 [TRACE] dag/walk: vertex "provider.helm (close)" is waiting for "module.eks_addons.helm_release.cluster_autoscaler"
2020/05/14 12:15:44 [TRACE] dag/walk: vertex "root" is waiting for "provider.helm (close)"
2020/05/14 12:15:44 [TRACE] dag/walk: vertex "provider.helm (close)" is waiting for "module.eks_addons.helm_release.cluster_autoscaler"
terraform apply
Terraform v0.12.25 was released recently and contains a fix for a concurrency bug (https://github.com/hashicorp/terraform/blob/v0.12.25/CHANGELOG.md). If you happen to be on that version, can you try downgrading to v0.12.24 or lower?
I experienced random crashes before (https://github.com/terraform-providers/terraform-provider-helm/issues/494) and also ran into the issue you describe after upgrading to the new Terraform.
... can you try downgrading to v0.12.24...
@krzysztof-miemiec I can consistently reproduce with 0.12.24 and 0.12.25.
Also, @amitsehgal, is this a dup of #458?
Having zero experience with Go, I began to try to debug this issue. I installed GoLand, learned how to use fmt.Printf, and built a dumb yet working "test" pipeline that overrides the helm provider used by my TF module and rewrites the checksum in the lockfile.
And I found out that it hangs in this place (sorry for no line numbers or specific stack trace):
if err := g.cfg.KubeClient.IsReachable(); err != nil {
@ vendor/helm.sh/helm/v3/pkg/action/get.go: func (g *Get) Run(name string)
res, err := get.Run(name)
@ helm/resource_release.go: func getRelease(cfg *action.Configuration, name string)
I noticed in this comment (https://github.com/terraform-providers/terraform-provider-helm/issues/458#issuecomment-622148508) that there's a problem with IAM Authenticator (I also use it with my EKS setup). I will try to find out more.
Edit:
I traced it down to this specific line (os.Stderr.Write() hangs inside klog):
klog.Warningf("constructing many client instances from the same exec auth config can cause performance problems during cert rotation and can exhaust available network connections; %d clients constructed calling %q", onRotateListLength, a.cmd)
when loading client for schedulingv1 (somewhere in kubernetes client initialization)
in k8s.io/client-go/plugin/pkg/client/auth/exec.(*Authenticator).UpdateTransportConfig
What's weird is that when I comment out this line, or even change it to log.Printf, the whole helm provider does not hang.
I can confirm what @krzysztof-miemiec reported here: the blocking goes away if either the klog call gets removed or if I manually disable the stderr/stdout redirection in the plugin:
https://github.com/hashicorp/go-plugin/blob/master/server.go#L461
I tried to understand how the stderr/stdout pipes are consumed over the gRPC connection, but I have not managed to work it out from the code so far.
This is resolved in the latest release
This is how I fixed a similar issue:
Run export TF_LOG=TRACE, which is the most verbose logging level.
Run terraform plan .... and look for lines like this in the log:
dag/walk: vertex "module.kubernetes_apps.provider.helmfile (close)" is waiting for "module.kubernetes_apps.helmfile_release_set.metrics_server"
From the logs, I identified the resource whose state was causing the issue: module.kubernetes_apps.helmfile_release_set.metrics_server.
I deleted its state:
terraform state rm module.kubernetes_apps.helmfile_release_set.metrics_server
Running terraform plan again should then work.
This is not the best solution, which is why I contacted the owner of this provider to fix the issue without this workaround.