I am trying to run this as a distributed deployment in Kubernetes. Most endpoints work, but some return a 500 server error stating that no instance was found with the current cluster host id.
Kubernetes, latest container build from awx
RuntimeError: No instance found with the current cluster host id
I do not understand why some endpoints work and some do not. My best guess is this has something to do with the container host names. My task container is registered as an instance in the tower group with the proper name.
Certain features are available and certain ones are not. It appears I can do these things:
I cannot do these things
Any help would be appreciated. Just trying to understand general configuration and how to run this in Kubernetes.
Thanks
I scaled up the task agents in Kubernetes; they all connected and were available to use, and that didn't seem to cause any problems.
Multiple container instances in a cluster are not currently supported in AWX... I'm working on it though, see: https://github.com/ansible/awx/issues/74
@matburt I am running exactly what is being run in docker compose though. I guess I'm having a hard time understanding where the difference would lie. Everything else seems to work so far.
There are about 5 or 6 things that I need to tweak and improve for this to work, starting with automatic provisioning and deprovisioning, as well as better dynamic work queue configuration. That's why you are running into strange errors.
Get in touch with me directly on IRC and we can talk more. I built out the OpenShift/Minishift work as a starting point.
@matburt Cool, sounds good. Thanks for the reply. I would love to chat about it more, and if I can help, I would be interested too.
I had this problem when using a custom docker-compose.yml, with a single task container. I had:
services:
  task:
    hostname: task
    ...
Changing to:
services:
  task:
    hostname: awx
solved this issue. But it won't help for scaling up as @matburt indicated.
By the way, the hostname is configured in /etc/tower/settings.py:
CLUSTER_HOST_ID = "awx"
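To illustrate why the hostname change helps, here is a minimal sketch of the kind of lookup that could produce this error. It assumes the check is a plain match between CLUSTER_HOST_ID and the hostnames registered as instances. The function find_instance and the set of registered hostnames are hypothetical and are not taken from the AWX code base; only the setting name and the error message come from this thread.

```python
# Illustrative only: a stand-in for the instance lookup, NOT AWX's actual code.
# find_instance and the registered-hostname set are hypothetical names.

def find_instance(registered_hostnames, cluster_host_id):
    """Return the matching hostname, or raise the error seen in this issue."""
    if cluster_host_id not in registered_hostnames:
        raise RuntimeError("No instance found with the current cluster host id")
    return cluster_host_id


# Value from /etc/tower/settings.py, as quoted above.
CLUSTER_HOST_ID = "awx"

# Assumption: the task container registers itself under its own hostname.
for container_hostname in ("awx", "task"):
    registered = {container_hostname}
    try:
        match = find_instance(registered, CLUSTER_HOST_ID)
        print(f"hostname={container_hostname}: found instance {match}")
    except RuntimeError as exc:
        print(f"hostname={container_hostname}: {exc}")
```

Under that assumption, a container named task never matches the configured id, while one named awx does. It also suggests why a single fixed id would not cover several scaled-up task containers with different hostnames.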