Rancher: [helm] [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory

Created on 3 Mar 2019 · 42 Comments · Source: rancher/rancher

What kind of request is this (question/bug/enhancement/feature request):
bug

Steps to reproduce (least amount of steps as possible):

  1. add helm repo via rancher UI
    repo: https://raw.githubusercontent.com/jasonsoft/helm/master/
    name: jasonsoft
    level: global
  2. launch app from category. Search "whitemos" and click "view detail" button
  3. just click "launch" button

Result:
got an error message on the page.
Helm template failed. Error: stat template-dir: no such file or directory : exit status 1

Other details that may be helpful:
If we use helm cli to install the package via console, it works.

error message from rancher

2019/03/03 06:19:02 [ERROR] AppController p-xrqw7/whitemos-md4q2 [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory

Environment information

  • Rancher version (rancher/rancher/rancher/server image tag or shown bottom left in the UI):
  • Installation option (single install/HA):

Rancher/Rancher: v2.2.0-rc2, single node

Cluster information

  • Cluster type (Hosted/Infrastructure Provider/Custom/Imported):
  • Machine type (cloud/VM/metal) and specifications (CPU/memory):
  • Kubernetes version (use kubectl version):

Cluster type: imported ( cluster created via kubeadm)
Machine type: vm

Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.6", GitCommit:"ab91afd7062d4240e95e51ac00a18bd58fddd365", GitTreeState:"clean", BuildDate:"2019-02-26T12:59:46Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.6", GitCommit:"ab91afd7062d4240e95e51ac00a18bd58fddd365", GitTreeState:"clean", BuildDate:"2019-02-26T12:49:28Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}

  • Docker version (use docker version):
Client:
 Version:           18.06.3-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        d7080c1
 Built:             Wed Feb 20 02:26:51 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.3-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       d7080c1
  Built:            Wed Feb 20 02:28:17 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Done area/catalog internal kind/bug

All 42 comments

+1

Same issue with 2.2.0 (latest). Is there a way to remove the failed deployment?

2.2.0 can't install anything

2019/03/28 06:50:10 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:11 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:11 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:11 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:11 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:11 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:11 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:12 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:12 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:12 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/03/28 06:50:12 [ERROR] AppController p-rslbq/rabbitmq [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1

Same issues in 2.2.0, working in console with the helm-cli but not with the catalog

+1

Even after removing the app from the UI days ago, my Rancher (standalone) logs are still flooded with dozens of those error messages per second (!). Is there any way to stop this?

+1

Got the same issues in 2.2.1

Helm template failed. Error: stat template-dir: no such file or directory : exit status 1

I have the problem at 2.2.1

Same issue on rancher v2.2.1.

2019/04/16 03:22:08 [ERROR] AppController p-tsb5h/nfs-client-provisioner [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1
2019/04/16 03:22:08 [ERROR] AppController p-tsb5h/nfs-client-provisioner [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory
: exit status 1

same issue in v2.2.2

2019/04/17 22:11:24 [ERROR] AppController p-mqwf9/external-dns [helm-controller] failed with : helm template failed. Error: stat template-dir: no such file or directory

same v2.2.2
"message": "helm template failed. Error: stat template-dir: no such file or directory\n: exit status 1",

Confirming on v2.2.2. I'm not even able to show previews.

Do you know if it worked on some previous version? I originally thought that the issue is in me making wrong chart structure but I tried also with https://github.com/rancher/charts/tree/dev/charts/mariadb/latest and it did not work either.

This issue seems to occur when the url to the packaged chart is relative. A workaround here is to pass --url in when building the index.yaml.

Same issue on v2.2.3

helm template failed. Error: stat template-dir: no such file or directory\n: exit status 1

This issue seems to occur when the url to the packaged chart is relative. A workaround here is to pass --url in when building the index.yaml.
@drpebcak
SGTM.
But how about the kubernetes-charts?

This issue seems to occur when the url to the packaged chart is relative. A workaround here is to pass --url in when building the index.yaml.

Example to get it working on 10.0.0.100:

docker run --rm -it \
  -p 8080:8080 \
  -e STORAGE=local \
  -e STORAGE_LOCAL_ROOTDIR=/charts \
  chartmuseum/chartmuseum:latest --chart-url http://10.0.0.100:8080

Tested with rancher-2.2.3.

I am getting a similar thing
Wait helm template failed. Error: stat /cert-manager: no such file or directory : exit status 1 from https://charts.jetstack.io

Wait helm template failed. Error: stat /harbor: no such file or directory : exit status 1 from https://helm.goharbor.io

etc..

This is the same issue as https://github.com/rancher/rancher/issues/14905. If you check the index.yaml file of the charts repo, you will find that the URL path is relative. It should be fixed in 2.3.

Is this fixed in 2.3.0-alpha5?

v2.2.4 +1

v2.2.6 +1

same issue here with rancher/rancher:v2.2.6

v2.2.7 +1

Just to elaborate on bernard-wagner's reply for a workaround.
Set up a local ChartMuseum instance and either mount a directory or use the API (the API had permission issues with my config).

On a Docker host

docker run --rm -it -d -v $PWD:/charts -p 8090:8080 -e STORAGE=local -e STORAGE_LOCAL_ROOTDIR=/charts chartmuseum/chartmuseum:latest --chart-url http://10.76.0.3:8090

Clone the Helm repo that's causing issues, e.g. Harbor

git clone https://github.com/goharbor/harbor-helm.git

Change the dir to match the chart

mv harbor-helm harbor
cd harbor

Package with Version

helm package . --version 1.1.1

Move the resulting .tgz file to the directory that was mounted for ChartMuseum.

Then in Rancher, add the ChartMuseum instance as a catalog, e.g. http://10.76.0.3:8090.

Installs will then work.

@daxmc99 before fixing, can you investigate and then summarize the bug for me?

By any chance, does your catalog URL have a trailing /?
In the post you mentioned

https://raw.githubusercontent.com/jasonsoft/helm/master/

Can you try with the repo url as

https://raw.githubusercontent.com/jasonsoft/helm/master

That worked for me. When I removed the trailing / my apps were able to deploy

We see this too, but our biggest concern is the tight loop of error messages that are generated. There are many per second (sometimes tens per second), which not only floods the log; we've also seen it do wonky things to the etcd database, causing it to exceed the 2 GB threshold. Bad things happen at that point.

[helm-controller] failed with : wait helm template failed. Error: stat /spring-hello: no such file or directory
At the very least, is there any way to throttle the frequency of this error?

We also saw a huge flow of logs when an app is in an error state, but it was another error message, so I don't think it is linked to this one specifically.
Maybe we should address this in another issue?

Steps to test

  1. use a chart repository with relative paths such as https://helm.goharbor.io (you can confirm it has a relative path by going to https://helm.goharbor.io/index.yaml and inspecting the urls section for entries that are not fully qualified URLs).
  2. launch app from catalog chosen above
  3. Install the chart
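The inspection in step 1 can be roughed out in a few lines of Python. This is only a sketch: it scans the text of an index.yaml line by line rather than parsing YAML properly, the function name `find_relative_chart_urls` is made up for illustration, and the sample text is a hand-written fragment that reuses two URLs quoted in this thread.

```python
import re

# Matches strings that start with a URL scheme such as "https://".
ABSOLUTE_URL = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def find_relative_chart_urls(index_text):
    """Return the entries under any `urls:` key that lack a scheme.

    Line-based heuristic, not a real YAML parser (hypothetical helper).
    """
    relative = []
    in_urls = False
    for line in index_text.splitlines():
        stripped = line.strip()
        # A `urls:` key may also be the first key of a list item ("- urls:").
        if stripped.lstrip("- ") == "urls:":
            in_urls = True
            continue
        if in_urls and stripped.startswith("- "):
            url = stripped[2:].strip().strip("'\"")
            if not ABSOLUTE_URL.match(url):
                relative.append(url)
            continue
        # Any other line ends the current urls list.
        in_urls = False
    return relative


# Hand-written sample fragment; only the two URLs come from this thread.
sample = """\
entries:
  whitemos:
  - name: whitemos
    urls:
    - https://raw.githubusercontent.com/jasonsoft/helm/master/whitemos-0.1.0.tgz
    version: 0.1.0
  cert-manager:
  - name: cert-manager
    urls:
      - charts/cert-manager-v0.12.0.tgz
    version: v0.12.0
"""

print(find_relative_chart_urls(sample))
```

Any entry this prints is a relative chart URL, which is the condition described above as triggering the error on affected Rancher versions.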

The bug is reproduced in Rancher:v2.2.8

Steps:

  • use a catalog whose index.yaml has a relative chart URL, i.e. change

    urls:
    - https://raw.githubusercontent.com/jasonsoft/helm/master/whitemos-0.1.0.tgz

to

    urls:
    - whitemos-0.1.0.tgz

  • run rancher:v2.2.8
  • add a cluster
  • add the above catalog
  • deploy the app whitemos from the catalog

Result:

  • the app fails to install, and the following error shows in Rancher's UI

Wait helm template failed. Error: stat /whitemos: no such file or directory : exit status 1

  • the following error shows repeatedly in Rancher's log

2019/08/29 21:03:35 [ERROR] AppController p-bbksk/whitemos [helm-controller] failed with : wait helm template failed. Error: stat /whitemos: no such file or directory
: exit status 1
2019/08/29 21:03:35 [ERROR] AppController p-bbksk/whitemos [helm-controller] failed with : wait helm template failed. Error: stat /whitemos: no such file or directory
: exit status 1
2019/08/29 21:03:35 [ERROR] AppController p-bbksk/whitemos [helm-controller] failed with : wait helm template failed. Error: stat /whitemos: no such file or directory
: exit status 1
2019/08/29 21:03:35 [ERROR] AppController p-bbksk/whitemos [helm-controller] failed with : wait helm template failed. Error: stat /whitemos: no such file or directory
: exit status 1
2019/08/29 21:03:35 [ERROR] AppController p-bbksk/whitemos [helm-controller] failed with : wait helm template failed. Error: stat /whitemos: no such file or directory
: exit status 1

The bug fix is validated in Rancher:master-head d7b425da7

Steps:
the same as my previous comment

Results:
The app is deployed successfully and there are no error messages in either the Rancher UI or the logs.

鏈汉瑙e喅鏂规硶
https://github.com/rancher/rancher/issues/23346#issuecomment-540931296

Experiencing this issue as of today running rancher 2.2.8

Confirmed fixed in 2.3.1 here with same symptoms presented (and using Harbor 1.8.2).

I still see this on 2.2.9 with cert-manager from jetstack

And that's correct because it wasn't addressed until 2.3.x, so you need to upgrade.

I still see this on 2.2.9 with cert-manager from jetstack

After upgrading to Rancher v2.3.2 it still does not work. But the error message has changed to:
Wait helm template failed. Error: stat /cert-manager: no such file or directory : exit status 1

Can confirm it's still an issue with the cert-manager chart from Jetstack, where the URL in index.yaml looks like this:

    urls:
      - charts/cert-manager-v0.12.0.tgz
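For reference, here is a minimal Python sketch of how a relative entry like this one is conventionally resolved against the repository URL. It is not Rancher's actual code. The helper name `resolve_chart_url` is hypothetical, and forcing a trailing slash on the repository URL before joining is an assumption made so that the last path segment of the repo URL is kept.

```python
from urllib.parse import urljoin, urlparse


def resolve_chart_url(repo_url, chart_url):
    """Return an absolute download URL for a chart entry (hypothetical helper)."""
    # Entries that already carry a scheme are used unchanged.
    if urlparse(chart_url).scheme:
        return chart_url
    # Assumption: treat the repo URL as a directory, so add a trailing slash;
    # otherwise urljoin would replace its last path segment.
    base = repo_url if repo_url.endswith("/") else repo_url + "/"
    return urljoin(base, chart_url)


print(resolve_chart_url("https://charts.jetstack.io",
                        "charts/cert-manager-v0.12.0.tgz"))
# prints https://charts.jetstack.io/charts/cert-manager-v0.12.0.tgz

print(resolve_chart_url("https://raw.githubusercontent.com/jasonsoft/helm/master",
                        "whitemos-0.1.0.tgz"))
# prints https://raw.githubusercontent.com/jasonsoft/helm/master/whitemos-0.1.0.tgz
```

With this normalization, the catalog URL gives the same result with or without a trailing slash.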

Still seeing this myself in Harbor 1.8 and Rancher 2.3. Contents of index.yaml still show relative path to chart.

See also: https://github.com/goharbor/harbor/pull/7719

By my findings: if you faced this problem once, with a specific chart version, you'll face it again, even after the upgrade to 2.3+ which does handle relative URLs better.

What's going on is that, on the previous failures, Rancher will have cached an empty directory in /var/lib/rancher/management-state/catalog-cache/SOMESHASUM/yourapp/x.y.z/. It does not ever try to overwrite that for that x.y.z. The catalog-cache even appears to survive a delete and re-install of the catalog... at least, if you name the catalog the same thing you did before.

If you have the ability to get into your Rancher container and delete that empty release directory, Rancher will then try and re-pull the .tgz and expand it, but just upgrading Rancher is not enough. Nor is restarting the Rancher container (assuming management-state is persisted with your etcd state, as is default).
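To locate those empty release directories before deleting anything, something like the following Python sketch could be run inside the Rancher container, assuming Python is available there (which may not be the case). The cache path is the one quoted above; the function name `find_empty_version_dirs` is made up for illustration, and the script only lists candidates, it does not remove them.

```python
import os


def find_empty_version_dirs(cache_root):
    """List directories under cache_root that contain no files or subdirectories."""
    empty = []
    for dirpath, dirnames, filenames in os.walk(cache_root):
        # An empty leaf directory has neither files nor subdirectories.
        if dirpath != cache_root and not dirnames and not filenames:
            empty.append(dirpath)
    return sorted(empty)


# Path as quoted in the comment above; adjust if your install differs.
CACHE_ROOT = "/var/lib/rancher/management-state/catalog-cache"

for path in find_empty_version_dirs(CACHE_ROOT):
    print(path)
```

Reviewing the printed paths by hand before removing them keeps the cleanup limited to the chart versions that actually failed.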

Mine is 2.2.10 and I am still seeing this. What should I do?
