When a workflow runs, it hangs after a couple of jobs with the following error, and no new jobs are spawned.
I run my workflow with more than 5 job templates. After 5 job templates, no new jobs are started and I receive the following error in the task container log.
see https://pastebin.com/kBsKWtAw
Jobs should run fine without any errors, as they did before we changed to the new AWX version.
The errors above appear, the job hangs, and any job spawned after this point, even outside of the workflow, won't start.
@oekotaco this should be resolved by https://github.com/ansible/awx/pull/2645 (in the next version of awx we release).
Thanks for reporting it.
Thanks, I've tried your patch. Now I get the following error.
Thanks for your quick response.
@oekotaco yep, it looks like there's another bug here related to a recent change we merged. Looking into it...
Is there a new issue open for the second bug yet?
@oekotaco could you give me any more information to reproduce the 2nd error you gave in the pastebin text? Was it by the same steps, where you had a workflow with 5 JTs?
Did the Workflow job template have variables set? Did the JTs prompt for variables? Were they all root nodes? Did you launch it manually?
I can add some information here. I'm seeing the same issue after hitting the original issue and applying the linked patch.
Did the Workflow job template have variables set? - Yes
Did the JTs prompt for variables? - No
Were they all root nodes? - No
Did you launch it manually? - Yes and No. Looks like every job 'hangs' with this error now no matter how it was started.
I tried to replicate this, no luck so far. I am certainly staring at this piece of code, trying to think of any way it might have gone wrong.
I just can't figure out how this can happen right now.
    for field_name, value in kwargs.items():
        if field_name not in valid_fields:
            raise Exception('Unrecognized launch config field {}.'.format(field_name))
So obviously there is a field name in items that is not viewed as a valid field. I threw in some print statements -
    for field_name, value in kwargs.items():
        if field_name not in valid_fields:
            print("Field Names are {} - Valid Fields are {}".format(field_name, valid_fields))
            raise Exception('Unrecognized launch config field {}.'.format(field_name))
Here is what I get (tl;dr: it looks like the list of valid fields ends up empty) -
This failed in a workflow that ends with an Inventory sync.
Field Names are extra_vars - Valid Fields are []
2018-11-20 21:19:37,192 WARNING awx.main.models.unified_jobs Fields extra_vars are not allowed as overrides to spawn from ECS_Inventory_Source_Script-32.
2018-11-20 21:19:38,264 ERROR awx.main.dispatch Worker failed to run task awx.main.scheduler.tasks.run_task_manager([], {}
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/awx/main/dispatch/worker/task.py", line 84, in perform_work
    result = self.run_callable(body)
  File "/usr/lib/python2.7/site-packages/awx/main/dispatch/worker/task.py", line 56, in run_callable
    return _call(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/awx/main/scheduler/tasks.py", line 25, in run_task_manager
    TaskManager().schedule()
  File "/usr/lib/python2.7/site-packages/awx/main/scheduler/task_manager.py", line 543, in schedule
    finished_wfjs = self._schedule()
  File "/usr/lib/python2.7/site-packages/awx/main/scheduler/task_manager.py", line 529, in _schedule
    self.spawn_workflow_graph_jobs(running_workflow_tasks)
  File "/usr/lib/python2.7/site-packages/awx/main/scheduler/task_manager.py", line 119, in spawn_workflow_graph_jobs
    job = spawn_node.unified_job_template.create_unified_job(**kv)
  File "/usr/lib/python2.7/site-packages/awx/main/models/inventory.py", line 1536, in create_unified_job
    return super(InventorySource, self).create_unified_job(**kwargs)
  File "/usr/lib/python2.7/site-packages/awx/main/models/unified_jobs.py", line 392, in create_unified_job
    unified_job.create_config_from_prompts(kwargs, parent=self)
  File "/usr/lib/python2.7/site-packages/awx/main/models/unified_jobs.py", line 897, in create_config_from_prompts
    raise Exception('Unrecognized launch config field {}.'.format(field_name))
Exception: Unrecognized launch config field extra_vars.
2018-11-20 21:19:37,192 WARNING awx.main.models.unified_jobs Fields extra_vars are not allowed as overrides to spawn from ECS_Inventory_Source_Script-32.
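To make the failure mode in the trace above concrete, here is a minimal sketch of the same check in isolation. It assumes (a simplification, not the actual AWX code) that each template type exposes a list of fields it accepts at launch and that an inventory source exposes none. The names `validate_launch_fields` and `try_spawn` are hypothetical.

```python
def validate_launch_fields(kwargs, valid_fields):
    """Hypothetical stand-in for the validation loop quoted above."""
    for field_name, value in kwargs.items():
        if field_name not in valid_fields:
            raise Exception('Unrecognized launch config field {}.'.format(field_name))
    return dict(kwargs)


def try_spawn(kwargs, valid_fields):
    """Return 'ok' if the fields validate, otherwise the error message."""
    try:
        validate_launch_fields(kwargs, valid_fields)
        return 'ok'
    except Exception as exc:
        return str(exc)


# A job template that accepts extra_vars passes the check.
print(try_spawn({'extra_vars': {'a': 1}}, ['extra_vars', 'limit']))

# An empty valid_fields list (assumed here for an inventory source)
# rejects the very same kwargs.
print(try_spawn({'extra_vars': {'a': 1}}, []))
```

If the exception is not caught by the caller, as in the trace, the scheduler run that tried to spawn the node fails as a whole, which would be consistent with later jobs never starting.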
Well that's interesting. I'll have to try again some permutations with inventory sources in workflows.
Thanks for digging into this.
Sorry for the late response. Same Workflow as described by Mhurron at https://github.com/ansible/awx/issues/2642#issuecomment-440031086
Also, the node right after the last successfully running job is an inventory sync.
        job2 -> inventory sync -> job4
job1 ->
        job3 -> inventory sync -> job5
Jobs 1-3 run fine, but it fails at the inventory sync, and the failure persists for all other newly launched jobs.
If you have something new to test, let me know and I will test it for you.
I tried this network, still not reproducing with that

Hi, the error still exists.
I created a new workflow and template with variables. I used the latest Docker images.
2018-11-27 08:10:03,285 ERROR awx.main.dispatch Worker failed to run task awx.main.scheduler.tasks.run_task_manager([], {}
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/awx/main/dispatch/worker/task.py", line 84, in perform_work
    result = self.run_callable(body)
  File "/usr/lib/python2.7/site-packages/awx/main/dispatch/worker/task.py", line 56, in run_callable
    return _call(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/awx/main/scheduler/tasks.py", line 25, in run_task_manager
    TaskManager().schedule()
  File "/usr/lib/python2.7/site-packages/awx/main/scheduler/task_manager.py", line 543, in schedule
    finished_wfjs = self._schedule()
  File "/usr/lib/python2.7/site-packages/awx/main/scheduler/task_manager.py", line 529, in _schedule
    self.spawn_workflow_graph_jobs(running_workflow_tasks)
  File "/usr/lib/python2.7/site-packages/awx/main/scheduler/task_manager.py", line 119, in spawn_workflow_graph_jobs
    job = spawn_node.unified_job_template.create_unified_job(**kv)
  File "/usr/lib/python2.7/site-packages/awx/main/models/unified_jobs.py", line 347, in create_unified_job
    six.text_type(', ').join(unallowed_fields), unified_job, self
UnboundLocalError: local variable 'unified_job' referenced before assignment
2018-11-27 08:10:03,287 DEBUG awx.main.dispatch task 84503261-6c50-4a32-97bf-17c18d4ec0aa is finished
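The `UnboundLocalError` above is the generic Python error for reading a local variable before it has been assigned. A minimal sketch of that pattern follows, assuming (as the trace suggests, but not copied from AWX) that a warning message references `unified_job` before the object is created. The function and its arguments are hypothetical.

```python
def create_unified_job_sketch(allowed_fields, **kwargs):
    """Hypothetical reduction of the pattern suggested by the traceback."""
    unallowed_fields = sorted(set(kwargs) - set(allowed_fields))
    if unallowed_fields:
        # Illustrative bug: unified_job is assigned only further down, so
        # Python treats it as a local name that has no value yet here.
        message = 'Fields {} are not allowed as overrides to spawn {}.'.format(
            ', '.join(unallowed_fields), unified_job)
        print(message)
    unified_job = {'fields': dict(kwargs)}
    return unified_job


# With only allowed fields the buggy branch is never reached.
print(create_unified_job_sketch(['limit'], limit='web'))

# With an unallowed field the warning path itself raises.
try:
    create_unified_job_sketch([], extra_vars={'a': 1})
except UnboundLocalError as exc:
    print('UnboundLocalError:', exc)
```

This would explain why the problem only shows up when a node receives fields it does not accept: the normal path never touches the broken line.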
In our case the workflow is a bit more complex, but overall it just hangs before it starts the inventory sync. The inventory comes from an OpenStack source.

Hope this helps. If not let me know.
Is it possible that #2642 (comment) is not yet included in the docker image of ansible/awx_task ?
Yes. It's not in the release yet. I applied this patch manually, but it doesn't fix the underlying issue.
@oekotaco Thanks very much for following along with me debugging this.
I have reproduced the issue that you reported. I did this by using set_stats in the jobs that ran upstream of the inventory updates, which is compatible with the information you have given about your own setup.
The primary subject of this issue is another bug which was resolved separately, so I filed a new issue for your report at https://github.com/ansible/awx/issues/2806
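For readers following along, here is a rough sketch of why set_stats matters. It assumes that data recorded by upstream jobs is merged and handed to downstream nodes as `extra_vars`, and that a node type such as an inventory update accepts no such field. The helper `build_spawn_kwargs` and its filtering step are hypothetical and only illustrate one defensive approach. They are not the fix that was merged.

```python
def build_spawn_kwargs(ancestor_artifacts, accepted_fields):
    """Hypothetical helper: merge artifacts from upstream jobs into
    extra_vars, then keep only the fields the target node accepts."""
    merged = {}
    for artifacts in ancestor_artifacts:
        merged.update(artifacts)

    kwargs = {}
    if merged:
        kwargs['extra_vars'] = merged

    accepted = {k: v for k, v in kwargs.items() if k in accepted_fields}
    dropped = sorted(set(kwargs) - set(accepted))
    return accepted, dropped


upstream = [{'build_id': 42}, {'region': 'eu'}]

# A job template that takes extra_vars receives the merged artifacts.
print(build_spawn_kwargs(upstream, ['extra_vars']))

# A node that accepts nothing gets empty kwargs, and the dropped
# field is reported instead of raising.
print(build_spawn_kwargs(upstream, []))
```

Without upstream set_stats data there is nothing to pass down, which may be why workflows without it did not reproduce the error.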