Awx: Enabling External Logging Breaks SCM Updates

Created on 3 Apr 2018  路  10Comments  路  Source: ansible/awx

ISSUE TYPE
  • Bug Report
COMPONENT NAME
  • API
  • UI
SUMMARY

After configuring external logging settings SCM updates no longer work.

ENVIRONMENT
  • AWX version: 1.0.5.18
  • AWX install method: docker on linux
  • Ansible version: 2.5.0.0
STEPS TO REPRODUCE

1) Setup a new project (I used git as an SCM source) and confirm an SCM update works
2) Configure external logging
3) Run an SCM update

EXPECTED RESULTS

SCM updates continue to work with external logging enabled

ACTUAL RESULTS

SCM updates stop working

ADDITIONAL INFORMATION
[2018-04-03 08:37:50,491: INFO/MainProcess] Received task: awx.main.scheduler.tasks.run_job_launch[b7a8b43f-5984-4e21-ab55-76a0d9cc9d5d]
[2018-04-03 08:37:50,491: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x23c2b18> (args:(u'awx.main.scheduler.tasks.run_job_launch', u'b7a8b43f-5984-4e21-ab55-76a0d9cc9d5d', [14], {}, {u'utc': True, u'is_eager': False, u'chord': None, u'group': None, u'args': [14], u'retries': 0, u'delivery_info': {u'priority': 0, u'redelivered': False, u'routing_key': u'tower', u'exchange': u'tower'}, u'expires': None, u'hostname': 'celery@awx', u'task': u'awx.main.scheduler.tasks.run_job_launch', u'callbacks': None, u'correlation_id': u'b7a8b43f-5984-4e21-ab55-76a0d9cc9d5d', u'errbacks': None, u'timelimit': [None, None], u'taskset': None, u'kwargs': {}, u'eta': None, u'reply_to': u'0383f788-0b18-31a4-b456-5196350e2ba7', u'id': u'b7a8b43f-5984-4e21-ab55-76a0d9cc9d5d', u'headers': {}}) kwargs:{})
[2018-04-03 08:37:50,493: DEBUG/MainProcess] Task accepted: awx.main.scheduler.tasks.run_job_launch[b7a8b43f-5984-4e21-ab55-76a0d9cc9d5d] pid:231
2018-04-03 08:37:50,511 DEBUG    awx.main.scheduler Starting Scheduler
2018-04-03 08:37:50,558 DEBUG    awx.main.scheduler Starting project_update 14 (pending) in group tower (remaining_capacity=37)
2018-04-03 08:37:50,561 INFO     awx.main.scheduler Submitting project_update 14 (waiting) to instance group 1.
2018-04-03 08:37:50,595 DEBUG    awx.main.scheduler project_update 14 (waiting) consumed 1 capacity units from tower with prior total of 0
[2018-04-03 08:37:50,616: DEBUG/Worker-6] Start from server, version: 0.9, properties: {u'information': u'Licensed under the MPL.  See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2018 Pivotal Software, Inc.', u'capabilities': {u'exchange_exchange_bindings': True, u'connection.blocked': True, u'authentication_failure_close': True, u'direct_reply_to': True, u'basic.nack': True, u'per_consumer_qos': True, u'consumer_priorities': True, u'consumer_cancel_notify': True, u'publisher_confirms': True}, u'cluster_name': u'rabbit@58b137aa44ea', u'platform': u'Erlang/OTP 20.1.7', u'version': u'3.7.4'}, mechanisms: [u'AMQPLAIN', u'PLAIN'], locales: [u'en_US']
[2018-04-03 08:37:50,620: DEBUG/Worker-6] Open OK!
[2018-04-03 08:37:50,620: DEBUG/Worker-6] using channel_id: 1
[2018-04-03 08:37:50,622: DEBUG/Worker-6] Channel open
[2018-04-03 08:37:50,625: INFO/MainProcess] Received task: awx.main.tasks.run_project_update[d1e01b36-4405-486d-8c69-ca54bbe01417]
[2018-04-03 08:37:50,626: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x23c2b18> (args:(u'awx.main.tasks.run_project_update', u'd1e01b36-4405-486d-8c69-ca54bbe01417', [14], {}, {u'utc': True, u'is_eager': False, u'chord': None, u'group': None, u'args': [14], u'retries': 0, u'delivery_info': {u'priority': None, u'redelivered': False, u'routing_key': u'tower', u'exchange': u'tower'}, u'expires': None, u'hostname': 'celery@awx', u'task': u'awx.main.tasks.run_project_update', u'callbacks': [{u'chord_size': None, u'task': u'awx.main.tasks.handle_work_success', u'args': [], u'immutable': False, u'subtask_type': None, u'kwargs': {u'task_actual': {u'type': u'project_update', u'id': 14}}, u'options': {}}], u'correlation_id': u'd1e01b36-4405-486d-8c69-ca54bbe01417', u'errbacks': [{u'chord_size': None, u'task': u'awx.main.tasks.handle_work_error', u'args': [], u'immutable': False, u'subtask_type': None, u'kwargs': {u'subtasks': [{u'type': u'project_update', u'id': 14}]}, u'options': {}}], u'timelimit': [None, None], u'taskset': None, u'kwargs': {}, u'eta': None, u'reply_to':... kwargs:{})
[2018-04-03 08:37:50,628: DEBUG/MainProcess] Task accepted: awx.main.tasks.run_project_update[d1e01b36-4405-486d-8c69-ca54bbe01417] pid:233
[2018-04-03 08:37:50,630: DEBUG/Worker-6] Closed channel #1
[2018-04-03 08:37:50,652: INFO/MainProcess] Task awx.main.scheduler.tasks.run_job_launch[b7a8b43f-5984-4e21-ab55-76a0d9cc9d5d] succeeded in 0.159662134014s: None
2018-04-03 08:37:50,756 ERROR    awx.main.tasks Error flushing job stdout and saving event count.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/awx/main/tasks.py", line 975, in run
    stdout_handle.flush()
UnboundLocalError: local variable 'stdout_handle' referenced before assignment
2018-04-03 08:37:50,757 INFO     awx.main.tasks project_update 14 (running) Could not find scm revision in check
2018-04-03 08:37:50,843 WARNING  awx.main.tasks project_update 14 (error) encountered an error (rc=None), please see task stdout for details.
[2018-04-03 08:37:50,845: CRITICAL/MainProcess] Task awx.main.tasks.run_project_update[d1e01b36-4405-486d-8c69-ca54bbe01417] INTERNAL ERROR: TypeError('super(type, obj): obj must be an instance or subtype of type',)
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 253, in trace_task
    I, R, state, retval = on_error(task_request, exc, uuid)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 201, in on_error
    R = I.handle_error_state(task, eager=eager)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 85, in handle_error_state
    }[self.state](task, store_errors=store_errors)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/app/trace.py", line 120, in handle_failure
    task.on_failure(exc, req.id, req.args, req.kwargs, einfo)
  File "/usr/lib/python2.7/site-packages/awx/main/tasks.py", line 94, in on_failure
    super(LogErrorsTask, self).on_failure(exc, task_id, args, kwargs, einfo)
TypeError: super(type, obj): obj must be an instance or subtype of type
api high in_progress bug

Most helpful comment

Wanted to pop in and say we haven't forgotten you all - currently testing a fix that _should_ resolve this issue soon.

All 10 comments

@grahamn-gr do you have any additional details on how you have external logging configured?

@ryanpetrello , This is the config used:

{
    "LOG_AGGREGATOR_HOST": "myloggingserver.com/ansible",
    "LOG_AGGREGATOR_PORT": 10800,
    "LOG_AGGREGATOR_TYPE": "other",
    "LOG_AGGREGATOR_USERNAME": "",
    "LOG_AGGREGATOR_PASSWORD": "",
    "LOG_AGGREGATOR_LOGGERS": [
        "awx",
        "activity_stream",
        "system_tracking"
    ],
    "LOG_AGGREGATOR_INDIVIDUAL_FACTS": false,
    "LOG_AGGREGATOR_ENABLED": true,
    "LOG_AGGREGATOR_TOWER_UUID": "",
    "LOG_AGGREGATOR_PROTOCOL": "https",
    "LOG_AGGREGATOR_TCP_TIMEOUT": 5,
    "LOG_AGGREGATOR_VERIFY_CERT": true,
    "LOG_AGGREGATOR_LEVEL": "INFO"
}

I am hitting this as well with 1.0.4.80 and 1.0.4.130 tested in minishift. As soon as I set up the log aggregator, SCM updates fail and any job template ran errors with the same stacktrace as in the OP.

I have configured with "other" (going to fluentd) and "logstash" (going to logstash) with the same results.

In both instances the test log message from hitting the "test" button shows up on the log aggregator (fluentd or logstash respectively), but no other log messages seem to make it through. I've added users, created new job templates, etc but no log messages get sent even with log level set to debug.

It also doesn't matter if LOG_AGGREGATOR_ENABLED is on or off. Just that I have a config different from the default settings.

Once it is in this state, even reverting the log aggregation config doesn't clear this up. I can pretty reliably duplicate it just by doing a deployment into minishift, configuring log aggregation and trying to run even the demo job.

I'm not the original reporter, but willing to provide whatever info is needed.

Also hitting this issue in AWX 1.0.5.23, still broken after disabling external log aggregation.

Just tested this in 1.0.6.0 and seeing the same error.

Wanted to pop in and say we haven't forgotten you all - currently testing a fix that _should_ resolve this issue soon.

Hey everybody, we've just merged in some changes that we think will resolve this issue:

https://github.com/ansible/awx/commit/ac20aa954ad9471d2b67a9c55c7e85f3f47c540a

Mind giving the latest awx a try?

I have just tested with 1.0.6.8 and this does indeed seem to be fixed. Thanks!

Tested against a logstash endpoint and a fluentd endpoint, no issues.

Thanks for the check! Closing.

@jlmcknight how did you configure it to send to fluentd?

Was this page helpful?
0 / 5 - 0 ratings