AWX: Cluster Host ID Error

Created on 9 Sep 2017 · 5 Comments · Source: ansible/awx

Summary

Trying to run this as a distributed deployment in Kubernetes. Most endpoints work, but some come back with a 500 server error stating that no instance was found with the current cluster host id.

Environment

Kubernetes, latest container build from awx

Steps To Reproduce:

  • Deploy this repo to a k8s cluster: http://github.com/rossedman/ansible-awx-kubernetes
  • Wait for system to boot up
  • Log in and change the password, then change the default inventory (everything works fine)
  • Try to create a new project using Git as the SCM; this error is thrown:
RuntimeError: No instance found with the current cluster host id

Expected Results:

I do not understand why some endpoints work and some do not. My best guess is this has something to do with the container host names. My task container is registered as an instance in the tower group with the proper name.

Actual Results:

Certain features are available and certain ones are not. It appears I can do these things:

  • Create new credentials
  • Update user, create user, delete user
  • Update organization
  • Create/update inventory
  • Run management jobs

I cannot do these things:

  • Create new project
  • Update project

Any help would be appreciated. Just trying to understand general configuration and how to run this in Kubernetes.

Thanks

Other Information

I scaled up the task agents in Kubernetes; they all connected and were available to use, and that didn't seem to cause any problems.

All 5 comments

Multiple container instances in a cluster are not currently supported in AWX. I'm working on it, though; see: https://github.com/ansible/awx/issues/74

@matburt I am running exactly what is being run in docker compose though. I guess I'm having a hard time understanding where the difference would lie. Everything else seems to work so far.

There are about 5 or 6 things that I need to tweak and improve for this to work, starting with automatic provisioning and deprovisioning, as well as better dynamic work queue configuration. That's why you are running into strange errors.

Get in touch with me directly on IRC and we can talk more. I built out the OpenShift/minishift work as a starting point.

@matburt Cool, sounds good. Thanks for the reply. I would love to chat about it more, and if I can help, I would be interested too.

I had this problem when using a custom docker-compose.yml, with a single task container. I had:

services:
  task:
    hostname: task
    ...

Changing to:

services:
  task:
    hostname: awx

solved this issue. It won't help with scaling up, though, as @matburt indicated.

By the way, the expected hostname is configured in /etc/tower/settings.py:

CLUSTER_HOST_ID = "awx"
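
To illustrate why the hostname change helps, here is a minimal, self-contained sketch of the kind of lookup that could produce this error. It is a simplified model and not AWX's actual code: the Instance class and the find_current_instance and check helpers are hypothetical names, and it assumes that each container registers itself under its own hostname and that the lookup compares that name against CLUSTER_HOST_ID.

```python
# Simplified model only; NOT AWX's real implementation.
# Assumption: a container registers an instance under its own hostname,
# and the lookup matches that hostname against CLUSTER_HOST_ID.
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical stand-in for a registered cluster instance."""
    hostname: str


def find_current_instance(registered, cluster_host_id):
    """Return the registered instance whose hostname equals cluster_host_id."""
    for instance in registered:
        if instance.hostname == cluster_host_id:
            return instance
    raise RuntimeError("No instance found with the current cluster host id")


# Value quoted from /etc/tower/settings.py in the comment above.
CLUSTER_HOST_ID = "awx"

# A container started with `hostname: task` would register as "task".
mismatched = [Instance(hostname="task")]
# A container started with `hostname: awx` would register as "awx".
matched = [Instance(hostname="awx")]


def check(registered):
    """Return the matching hostname, or the error text if the lookup fails."""
    try:
        return find_current_instance(registered, CLUSTER_HOST_ID).hostname
    except RuntimeError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    print(check(mismatched))
    print(check(matched))
```

Under these assumptions, the mismatched case reproduces the error message from the issue, while the matched case returns the instance. This is also consistent with the caveat about scaling: a single fixed host id can only match one hostname.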