Awx: Analytics fail to upload if the size is greater than 100MB

Created on 20 Jul 2020  ·  6 Comments  ·  Source: ansible/awx

ISSUE TYPE
  • Bug Report
SUMMARY

A customer's analytics uploads were failing because the bundle was larger than 100MB, which exceeds a size limit in the upload service. Tower should only upload bundles smaller than 100MB: build a bundle and, if it is too big, divide the time period in half and send two bundles instead. Recurse until the bundles are small enough.
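The halving approach described here can be sketched roughly as follows. This is only an illustration, not AWX code: `split_until_small`, the `bundle_size` callback, and the `min_span` safety floor are hypothetical names, and the limit is assumed to be 100 MiB.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Tuple

# Assumption: the "100MB" limit from the issue, taken here as 100 MiB.
MAX_BUNDLE_BYTES = 100 * 1024 * 1024

Window = Tuple[datetime, datetime]


def split_until_small(
    since: datetime,
    until: datetime,
    bundle_size: Callable[[datetime, datetime], int],
    max_bytes: int = MAX_BUNDLE_BYTES,
    min_span: timedelta = timedelta(seconds=1),
) -> List[Window]:
    """Return contiguous time windows whose bundles fit under max_bytes.

    bundle_size is a hypothetical callback that reports how large the
    bundle for a given window would be. min_span stops the recursion if
    a window cannot usefully be halved any further.
    """
    if bundle_size(since, until) <= max_bytes or (until - since) <= min_span:
        return [(since, until)]
    # Too big: divide the time period in half and handle each half.
    mid = since + (until - since) / 2
    return (
        split_until_small(since, mid, bundle_size, max_bytes, min_span)
        + split_until_small(mid, until, bundle_size, max_bytes, min_span)
    )
```

Each returned window would then be gathered and shipped as its own bundle. Note that a real implementation would pay for building each oversized bundle before discarding it, which is one reason the comments below suggest splitting up front instead.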

ENVIRONMENT
  • AWX version: X.Y.Z
  • AWX install method: openshift, minishift, docker on linux, docker for mac, boot2docker
  • Ansible version: X.Y.Z
  • Operating System:
  • Web Browser:
STEPS TO REPRODUCE
EXPECTED RESULTS
ACTUAL RESULTS
ADDITIONAL INFORMATION
api medium bug

All 6 comments

The simplest way to do this is probably split how we gather the data and do uploads.

  • generate full upload minus job events
  • check the last submission date
  • for every day since prior submission, create a daily dump of job events, and send each separately
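The per-day splitting in the last step could look something like the sketch below. `daily_windows` is a hypothetical helper, not an AWX function; it only computes the time ranges, and the query and upload for each range are left out.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def daily_windows(
    last_submission: datetime, now: datetime
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield (start, end) ranges of at most one day each.

    The ranges are contiguous and cover everything from the prior
    submission up to now; the final range may be shorter than a day.
    """
    start = last_submission
    while start < now:
        end = min(start + timedelta(days=1), now)
        yield start, end
        start = end


# Usage: one job-events dump per range, each sent separately.
for start, end in daily_windows(datetime(2020, 7, 17), datetime(2020, 7, 20, 12)):
    print(start.isoformat(), "->", end.isoformat())
```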

That will probably be less CPU- and memory-intensive on the query side. Paginating the events and sending a constant number of them per bundle, across a larger number of bundles, would be better for both the sender and the receiver.
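A minimal sketch of the pagination idea, assuming the events can be read as an iterable. `batched` and the batch size are illustrative names and values, not part of AWX.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(events: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most batch_size events, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(events)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Usage: each batch would become one bundle with a fixed event count.
for page in batched(range(7), 3):
    print(page)
```

Because only one batch is held in memory at a time, the sender's memory use depends on the batch size rather than on the total number of events.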

Functional example of how to split at ship time: https://gist.github.com/jctanner/db5a4d5e6a054ac0468ef09ed42ea276

Tested against a local API and did not see any errors, but I also didn't validate that the data was processed correctly.

fastapi_1           | {"@timestamp": "2020-07-22T19:34:22.438Z", "@version": 1, "source_host": "4c2715477e42", "name": "tower_analytics_report.processor.tower_analytics_processor", "args": [], "levelname": "INFO", "levelno": 20, "pathname": "./tower_analytics_report/processor/tower_analytics_processor.py", "filename": "tower_analytics_processor.py", "module": "tower_analytics_processor", "stack_info": null, "lineno": 840, "funcName": "handle_events_table", "created": 1595446462.438589, "msecs": 438.58909606933594, "relativeCreated": 8097716.229200363, "thread": 140240822705480, "threadName": "MainThread", "processName": "SpawnProcess-1", "process": 32, "message": "handle_events_table.merge_encrypted"}
fastapi_1           | {"@timestamp": "2020-07-22T19:34:22.440Z", "@version": 1, "source_host": "4c2715477e42", "name": "tower_analytics_report.processor.tower_analytics_processor", "args": [], "levelname": "INFO", "levelno": 20, "pathname": "./tower_analytics_report/processor/tower_analytics_processor.py", "filename": "tower_analytics_processor.py", "module": "tower_analytics_processor", "stack_info": null, "lineno": 855, "funcName": "handle_events_table", "created": 1595446462.440438, "msecs": 440.43803215026855, "relativeCreated": 8097718.078136444, "thread": 140240822705480, "threadName": "MainThread", "processName": "SpawnProcess-1", "process": 32, "message": "handle_events_table.mapping"}
fastapi_1           | {"@timestamp": "2020-07-22T19:34:22.441Z", "@version": 1, "source_host": "4c2715477e42", "name": "tower_analytics_report.processor.tower_analytics_processor", "args": [], "levelname": "INFO", "levelno": 20, "pathname": "./tower_analytics_report/processor/tower_analytics_processor.py", "filename": "tower_analytics_processor.py", "module": "tower_analytics_processor", "stack_info": null, "lineno": 864, "funcName": "handle_events_table", "created": 1595446462.4415565, "msecs": 441.556453704834, "relativeCreated": 8097719.196557999, "thread": 140240822705480, "threadName": "MainThread", "processName": "SpawnProcess-1", "process": 32, "message": "handle_events_table.insert"}
fastapi_1           | {"@timestamp": "2020-07-22T19:34:22.443Z", "@version": 1, "source_host": "4c2715477e42", "name": "tower_analytics_report.processor.tower_analytics_processor", "args": [], "levelname": "INFO", "levelno": 20, "pathname": "./tower_analytics_report/processor/tower_analytics_processor.py", "filename": "tower_analytics_processor.py", "module": "tower_analytics_processor", "stack_info": null, "lineno": 916, "funcName": "handle_events_table", "created": 1595446462.4432404, "msecs": 443.2404041290283, "relativeCreated": 8097720.880508423, "thread": 140240822705480, "threadName": "MainThread", "processName": "SpawnProcess-1", "process": 32, "message": "handle_events_table.commit"}
fastapi_1           | {"@timestamp": "2020-07-22T19:34:22.448Z", "@version": 1, "source_host": "4c2715477e42", "name": "uvicorn.access", "args": ["172.19.0.1:56830", "POST", "/api/tower-analytics/upload_bundle/", "1.1", 202], "levelname": "INFO", "levelno": 20, "pathname": "/usr/local/lib/python3.6/site-packages/uvicorn/protocols/http/httptools_impl.py", "filename": "httptools_impl.py", "module": "httptools_impl", "stack_info": null, "lineno": 454, "funcName": "send", "created": 1595446462.4488645, "msecs": 448.8644599914551, "relativeCreated": 8097726.504564285, "thread": 140240822705480, "threadName": "MainThread", "processName": "SpawnProcess-1", "process": 32, "status_code": 202, "scope": {"type": "http", "http_version": "1.1", "server": ["172.19.0.15", 8000], "client": ["172.19.0.1", 56830], "scheme": "http", "method": "POST", "root_path": "/api/tower-analytics", "path": "/upload_bundle/", "raw_path": "b'/api/tower-analytics/upload_bundle/'", "query_string": "b''", "headers": [["b'host'", "b'192.168.122.1:8004'"], ["b'accept-encoding'", "b'identity'"], ["b'user-agent'", "b'Red Hat Ansible Tower 3.7.1 (enterprise)'"], ["b'content-length'", "b'3363'"], ["b'content-type'", "b'multipart/form-data; boundary=94d6ccd9edf6590a6d6495534d138b35'"], ["b'authorization'", "b'Basic U0FNSUFNOmJhcg=='"]], "app": "<fastapi.applications.FastAPI object at 0x7f8c5b80f898>", "state": {"engine": "Engine(postgres://debug:***@postgres:5432/tenant_1)", "decrypt": "<function EngineMiddleware.dispatch.<locals>.decrypt at 0x7f8c5733a598>"}, "router": "<fastapi.routing.APIRouter object at 0x7f8c59d082b0>", "path_params": {}, "app_root_path": "", "endpoint": "<function upload_bundle at 0x7f8c59b609d8>"}, "message": "172.19.0.1:56830 - \"POST /api/tower-analytics/upload_bundle/ HTTP/1.1\" 202"}
refresher_1         | refresh_module_count_by_date_and_cluster_mview
refresher_1         | failed_task_count_by_date_and_template_mview
refresher_1         | job_event_count_by_date_and_org_mview
refresher_1         | job_state_count_by_date_org_cluster_and_template_mview
refresher_1         | hosts_by_date_and_org_mview
refresher_1         | roi_templates_mview

This customer here is me by the way :)

Happy to run any tests if it helps

Since the merge of ansible/awx#7709, the analytics tests are busted, seemingly because it changes a basic presumption about what's in the tarball.

Initial problems w/ #7709 have been solved.

Now bundles are less than 100MB:

[root@ip-10-0-2-31 ~]# du -h tar*
201M    tar1
204K    tar2
201M    tar3
14M     tar4
[root@ip-10-0-2-31 ~]# du -h /tmp/ea48ba97-21c2-4ff5-b42c-4100a6178d38_2020-09-16-214001+0000_*
24K     /tmp/ea48ba97-21c2-4ff5-b42c-4100a6178d38_2020-09-16-214001+0000_0.tar.gz
34M     /tmp/ea48ba97-21c2-4ff5-b42c-4100a6178d38_2020-09-16-214001+0000_1.tar.gz
34M     /tmp/ea48ba97-21c2-4ff5-b42c-4100a6178d38_2020-09-16-214001+0000_2.tar.gz
2.4M    /tmp/ea48ba97-21c2-4ff5-b42c-4100a6178d38_2020-09-16-214001+0000_3.tar.gz
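The same size check could be scripted rather than read off `du` output. The sketch below is hypothetical: `oversized_bundles` is an illustrative helper, the glob pattern is a placeholder, and the limit is assumed to be 100 MiB.

```python
import glob
import os
from typing import List, Tuple

# Assumption: the "100MB" limit from the issue, taken here as 100 MiB.
MAX_BUNDLE_BYTES = 100 * 1024 * 1024


def oversized_bundles(
    pattern: str, max_bytes: int = MAX_BUNDLE_BYTES
) -> List[Tuple[str, int]]:
    """Return (path, size_in_bytes) for each matching file over max_bytes."""
    result = []
    for path in sorted(glob.glob(pattern)):
        size = os.path.getsize(path)
        if size > max_bytes:
            result.append((path, size))
    return result


# Usage (placeholder pattern): an empty list means every bundle fits.
print(oversized_bundles("/tmp/*.tar.gz"))
```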

Going to call this verified and fixed for devel. We can discuss if more backports are needed.
