AWX: Sliced Job Template creates as many jobs as the slice count even when limited to fewer hosts

Created on 8 Dec 2018 · 6 Comments · Source: ansible/awx

ISSUE TYPE
  • Bug Report
COMPONENT NAME

  • API
  • UI
SUMMARY

A Sliced Job Template (SJT) creates as many jobs as its slice count, even when the limit restricts the run to fewer hosts than that.

ENVIRONMENT
  • AWX version: X.Y.Z
  • AWX install method: docker on linux
  • Ansible version: 2.7.1
  • Operating System: CentOS
  • Web Browser: Google Chrome (latest)
STEPS TO REPRODUCE


  1. Create an Inventory with multiple hosts.
  2. Create an SJT with multiple slices and select the Inventory created above.
  3. Limit the SJT to one of the hosts in that Inventory.
  4. Launch the SJT.

EXPECTED RESULTS


Only one job is created for the limited hosts even if the Job Slicing value is >1.

ACTUAL RESULTS


Multiple jobs are created for the limited hosts as per the Job Slicing count.

ADDITIONAL INFORMATION


Although multiple jobs are created, only one succeeds. All the others fail with: ERROR! Specified hosts and/or --limit does not match any hosts.

api medium needs_devel bug

All 6 comments

This is very similar to a problem I've raised with Red Hat support on behalf of my client - although I'm not sure what SJT is.
We find that when re-doing failed hosts there may be fewer hosts than the number of job slices.

I suggested that AWX be modified to either:

  1. Only open as many slices as there are hosts (up to the total number of slices/instances)
    OR
  2. Ensure that instances with no hosts are marked as successful - nothing to do. It is confusing/wrong that they are marked as failed.

SJT might mean Sliced Job Template.

The reason this happens is that ansible-playbook doesn't like being told to run for zero hosts; ansible-runner doesn't detect that situation (and is unwilling to change that behavior) and passes the failure upwards.
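To make the failure mode concrete, here is a minimal sketch of how a limit can leave some slices with no hosts. It assumes that hosts are dealt out to slices with a stepped split and that the limit is applied inside each slice afterwards. The function names `slice_hosts` and `apply_limit` are hypothetical and are not AWX or ansible-runner APIs.

```python
def slice_hosts(hosts, slice_count):
    """Deal hosts out to slice_count slices with a stepped split (assumed scheme)."""
    return [hosts[offset::slice_count] for offset in range(slice_count)]


def apply_limit(hosts_in_slice, limit):
    """Keep only the hosts in this slice that the limit allows."""
    return [host for host in hosts_in_slice if host in limit]


inventory = ["host1", "host2", "host3", "host4", "host5", "host6"]
limit = {"host2"}  # the job template is limited to a single host

slices = slice_hosts(inventory, 3)
matched = [apply_limit(hosts, limit) for hosts in slices]

# Slices whose host list is empty after the limit: these are the jobs
# that would hit "Specified hosts and/or --limit does not match any hosts".
empty_slices = [number for number, hosts in enumerate(matched, start=1) if not hosts]

print(matched)       # [[], ['host2'], []]
print(empty_slices)  # [1, 3]
```

With three slices and a one-host limit, only the slice that happens to contain the host has anything to run, which matches the one success and two failures reported above.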

Possible approaches for a fix include:

  • ask the ansible-runner team to reconsider
  • make it so the circumstance doesn't happen, as suggested in this bug's description, i.e. cap the parallelism to the actual number of hosts in the host limit set by AWX
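The capping idea could be sketched roughly as follows. This is only an illustration of the arithmetic, assuming the number of hosts matching the limit is known before the slices are created. `effective_slice_count` is a hypothetical helper, not an existing AWX function.

```python
def effective_slice_count(requested_slices, matching_hosts):
    """Cap the slice count at the number of hosts that match the limit.

    Always returns at least 1, so a run with zero matching hosts still
    produces a single job (which would report the "no hosts" error once
    instead of once per slice).
    """
    return max(1, min(requested_slices, matching_hosts))


# Template configured for 3 slices, limit matches a single host.
print(effective_slice_count(3, 1))   # 1

# Re-running on 2 failed hosts with a 3-slice template.
print(effective_slice_count(3, 2))   # 2

# Plenty of hosts: the configured slice count is kept.
print(effective_slice_count(3, 10))  # 3
```

A design like this keeps the configured slice count as an upper bound, so templates behave as before whenever the limit matches at least as many hosts as there are slices.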

Another possible idea is to allow the number to be selected with "prompt on launch." Of course that would only help with known quantities.

> Another possible idea is to allow the number to be selected with "prompt on launch." Of course that would only help with known quantities.

It would also require unnecessary manual action on behalf of the user.

EDIT: That is to say, if a user just clicks to re-run on failed hosts, they may not even be aware of how many there are.

Sure, I'm thinking of the case where you would normally want it split across 3 nodes but then need to override that to 1. It seems silly to have separate workflows to control that single variable, or to change and save the template each time. The same goes for the middle of a workflow where you know you want it on fewer nodes than normal. Not the full solution for sure, but it would be handy.
