When launching a Job Template, the Ansible output is usually shown and updated live.
After two days of not using the system (the pod was running the whole time), I came back to AWX.
Now the output no longer appears when launching a Job Template; the Ansible output shows up only after reloading the web page.
Restarting the pod does not help.
Could not
Expected: The Ansible output is shown live in the output panel.
Actual: The output shows up only after reloading the page, and even then it does not update live.
Output web container:
[pid: 29|app: 0|req: 32/121] 10.1.0.1 () {54 vars in 2589 bytes} [Fri Sep 22 12:29:03 2017] GET /api/v2/jobs/149/job_events/?order_by=start_line&or__event__in=playbook_on_start,playbook_on_play_start,playbook_on_task_start,playbook_on_stats => generated 52 bytes in 125 msecs (HTTP/1.1 200) 8 headers in 242 bytes (1 switches on core 0)
10.1.0.1 - - [22/Sep/2017:12:29:03 +0000] "GET /api/v2/jobs/149/job_events/?order_by=start_line&or__event__in=playbook_on_start,playbook_on_play_start,playbook_on_task_start,playbook_on_stats HTTP/1.1" 200 63 "https://awx.appagile.io/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36" "100.125.192.61"
10.1.0.1 - - [22/Sep/2017:12:29:03 +0000] "OPTIONS /api/v2/jobs/149/job_events/ HTTP/1.1" 200 10754 "https://awx.appagile.io/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36" "100.125.192.63"
[pid: 28|app: 0|req: 48/122] 10.1.0.1 () {60 vars in 2494 bytes} [Fri Sep 22 12:29:03 2017] OPTIONS /api/v2/jobs/149/job_events/ => generated 10741 bytes in 200 msecs (HTTP/1.1 200) 8 headers in 242 bytes (1 switches on core 0)
[pid: 28|app: 0|req: 49/123] 127.0.0.1 () {38 vars in 568 bytes} [Fri Sep 22 12:29:16 2017] GET /api/v1/inventories/2/script/?hostvars=1 => generated 2328 bytes in 135 msecs (HTTP/1.1 200) 6 headers in 193 bytes (1 switches on core 0)
127.0.0.1 - - [22/Sep/2017:12:29:17 +0000] "GET /api/v1/inventories/2/script/?hostvars=1 HTTP/1.1" 200 2340 "-" "python-requests/2.14.2" "-"
2017-09-22 12:29:29,452 DEBUG awx.api.authentication User admin performed a POST to /api/v1/inventories/ through the API
2017-09-22 12:29:29,468 WARNING awx.api.generics status 400 received by user admin attempting to access /api/v1/inventories/ from 127.0.0.1
[pid: 28|app: 0|req: 50/124] 127.0.0.1 () {40 vars in 567 bytes} [Fri Sep 22 12:29:29 2017] POST /api/v1/inventories/ => generated 73 bytes in 117 msecs (HTTP/1.1 400) 6 headers in 208 bytes (1 switches on core 0)
127.0.0.1 - admin [22/Sep/2017:12:29:29 +0000] "POST /api/v1/inventories/ HTTP/1.1" 400 84 "-" "ansible-httpget" "-"
One error appears in the celery container, but I don't know whether it is related.
Output celery container:
Traceback (most recent call last):
  File "/usr/bin/ansible", line 43, in <module>
    import ansible.constants as C
  File "/usr/lib/python2.7/site-packages/ansible/constants.py", line 202, in <module>
    DEFAULT_LOCAL_TMP = get_config(p, DEFAULTS, 'local_tmp', 'ANSIBLE_LOCAL_TEMP', '~/.ansible/tmp', value_type='tmppath')
  File "/usr/lib/python2.7/site-packages/ansible/constants.py", line 109, in get_config
    makedirs_safe(value, 0o700)
  File "/usr/lib/python2.7/site-packages/ansible/utils/path.py", line 71, in makedirs_safe
    raise AnsibleError("Unable to create local directories(%s): %s" % (to_native(rpath), to_native(e)))
ansible.errors.AnsibleError: Unable to create local directories(/.ansible/tmp): [Errno 13] Permission denied: '/.ansible'
This seems like it could be a websocket service issue.
Websockets are handled by daphne in our setup. Can you check on that service and see whether it has any problems? Is your browser establishing a websocket connection successfully?
Are there any log files I can check? The container logs don't show anything websocket related.
And how can I see if my browser is establishing a websocket connection?
Edit: I took a look in the web developer tools. There seems to be an issue with the websocket, which is in the "pending" state. I disabled all my antivirus and firewall software just to make sure this isn't the cause.

It will typically stay in the pending state while it's connected, since it's a persistent connection. If you click on it, does it provide more details?
cc @jaredevantabor on this possible websocket service issue
@balpert89 Do you get any error messages in the web developer console? Does the websocket ever connect, perhaps after a refresh? "Pending" means that it's connected. If you click on that row in the network tab of the dev console and then go to "Frames", do you see any messages going over the wire?
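One way to check the handshake outside the browser is a small script that sends the WebSocket upgrade request and reports the HTTP status. This is only a sketch: the helper names (`build_handshake`, `parse_status`, `probe`) are made up for illustration, the `/websocket/` path is the one that appears in the logs in this thread, and AWX may additionally expect a session cookie or an Origin header, so a non-101 answer is a hint rather than proof.

```python
import base64
import os
import socket
import ssl


def build_handshake(host_header: str, path: str = "/websocket/") -> bytes:
    """Build a minimal RFC 6455 opening handshake request."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host_header}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status(response: bytes) -> int:
    """Extract the numeric status code from the first response line."""
    if not response:
        raise ConnectionError("server closed the connection without replying")
    status_line = response.split(b"\r\n", 1)[0].decode("latin-1")
    return int(status_line.split(" ", 2)[1])


def probe(host: str, port: int, path: str = "/websocket/",
          use_tls: bool = False, timeout: float = 5.0) -> int:
    """Send the handshake and return the HTTP status code of the reply."""
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        if use_tls:
            # Needed when talking to an HTTPS front end rather than AWX directly.
            context = ssl.create_default_context()
            sock = context.wrap_socket(sock, server_hostname=host)
        sock.sendall(build_handshake(f"{host}:{port}", path))
        return parse_status(sock.recv(4096))
    finally:
        sock.close()
```

Calling `probe("localhost", 8052)` against the AWX web port, or `probe("awx.example.com", 443, use_tls=True)` against a hypothetical HTTPS front end, should return 101 if the upgrade goes through. A 503 or 504 points at the proxy or the websocket backend rather than the browser.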
Having this issue. I observe these messages in the log:
127.0.0.1:49228 - - [06/Oct/2017:01:23:23] "GET /websocket/" 503 470
127.0.0.1:49602 - - [06/Oct/2017:01:23:55] "GET /websocket/" 503 470
127.0.0.1:49864 - - [06/Oct/2017:01:24:29] "GET /websocket/" 503 470
127.0.0.1:52762 - - [06/Oct/2017:01:35:54] "GET /websocket/" 503 470
127.0.0.1:52770 - - [06/Oct/2017:01:35:58] "GET /websocket/" 503 470
127.0.0.1:52780 - - [06/Oct/2017:01:36:04] "GET /websocket/" 503 470
127.0.0.1:52974 - - [06/O
2017/10/06 01:36:08 [error] 31#0: *1874 connect() failed (111: Connection refused) while connecting to upstream, client: 172.17.0.1, server: _, request: "GET /websocket/ HTTP/1.0", upstream: "http://[::1]:8051/websocket/", host: "localhost:8052"
2017/10/06 01:36:08 [warn] 31#0: *1874 upstream server temporarily disabled while connecting to upstream, client: 172.17.0.1, server: _, request: "GET /websocket/ HTTP/1.0", upstream: "http://[::1]:8051/websocket/", host: "localhost:8052"
2017/10/06 01:36:10 [error] 31#0: *1882 connect() failed (111: Connection refused) while connecting to upstream, client: 172.17.0.1, server: _, request: "OPTIONS /api/v2/inventories/2/inventory_sources/ HTTP/1.0", upstream: "uwsgi://[::1]:8050", host: "localhost:8052", referrer: "https://awx.redacted.net/"
2017/10/06 01:36:10 [warn] 31#0: *1882 upstream server temporarily disabled while connecting to upstream, client: 172.17.0.1, server: _, request: "OPTIONS /api/v2/inventories/2/inventory_sources/ HTTP/1.0", upstream: "uwsgi://[::1]:8050", host: "localhost:8052", referrer: "https://awx.redacted.net/"
Other observations:
127.0.0.1:8051
localhost:8051
getent hosts localhost inside the awx_web container yields ::1 localhost ip6-localhost ip6-loopback
telnet localhost 8051 first tries ::1, fails, then tries 127.0.0.1 -- I suspect nginx isn't doing the same, and just trying the IPv6 address.
The easy fix might be to change nginx config to use explicit IPv4 127.0.0.1 instead of localhost for upstreams.
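To see what the resolver hands back for localhost and which address actually accepts connections, something like the following Python sketch can be run inside the container. The function names are hypothetical and port 8051 is the one from the nginx log above. It mimics the telnet fallback behaviour and says nothing definite about what nginx does internally.

```python
import socket
from typing import List, Optional


def resolve_order(host: str, port: int) -> List[str]:
    """Return the addresses for host in the order the resolver offers them."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return [info[4][0] for info in infos]


def first_reachable(host: str, port: int, timeout: float = 2.0) -> Optional[str]:
    """Try each resolved address in turn and return the first one that accepts."""
    for addr in resolve_order(host, port):
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                return addr
        except OSError:
            # Connection refused or timed out: fall through to the next address.
            continue
    return None
```

If `resolve_order("localhost", 8051)` lists `::1` first while `first_reachable("localhost", 8051)` returns `127.0.0.1`, the service is only listening on IPv4, which would fit the connection-refused errors against `[::1]:8051`.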
I changed the nginx file myself, but now I see lines like this in my log:
2017/10/07 01:41:02 [error] 32#0: *91 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.17.0.1, server: _, request: "GET /websocket/ HTTP/1.0", upstream: "http://127.0.0.1:8051/websocket/", host: "localhost:8052"
172.17.0.1 - - [07/Oct/2017:01:41:02 +0000] "GET /websocket/ HTTP/1.0" 504 585 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36" "-"
I think my problem was that I had an additional Nginx proxy in front of AWX providing SSL termination. I had to add additional handling for the websocket.
location / {
    proxy_pass http://localhost:8052;
}
location /websocket {
    proxy_pass http://localhost:8052;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
That definitely would have been good information to have at the beginning.