Awx: installer/kubernetes/templates/deployment.yml.j2 cpu and memory mismatch + limit supported only with apiVersion: extensions/v1beta2

Created on 15 Mar 2018 · 5 Comments · Source: ansible/awx

ISSUE TYPE
  • Bug Report
COMPONENT NAME
  • Installer
SUMMARY

In installer/kubernetes/templates/deployment.yml.j2, the following issues were introduced with commit https://github.com/ansible/awx/commit/b0cf4de07224902992de5ad9bb957cd94855f759#diff-d975205ae64de1b20726919b50ffcc75

  • limits are only supported since apiVersion: extensions/v1beta2
  • cpu and memory mismatch on lines 55-59
  • please cast variables to integer
  • additionally, the requirements in the documentation should be updated: at least 6GB of memory and at least 3 CPU cores
  • additionally, organising everything in a single pod makes it impossible to run components on separate nodes of a Kubernetes cluster
STEPS TO REPRODUCE

Execute the install playbook on a Kubernetes cluster.

EXPECTED RESULTS
TASK [kubernetes : Apply Deployment] ******************************************************************************************************************************************************************************
changed: [vars]
ACTUAL RESULTS
  • limits are only supported since apiVersion: extensions/v1beta2
TASK [kubernetes : Apply Deployment] ******************************************************************************************************************************************************************************
fatal: [vars]: FAILED! => {"changed": true, "cmd": "kubectl apply -f /tmp/awx-config/deployment.yml", "delta": "0:00:00.255991", "end": "2018-03-15 11:46:32.239584", "msg": "non-zero return code", "rc": 1, "start": "2018-03-15 11:46:31.983593", "stderr": "error: error validating \"/tmp/awx-config/deployment.yml\": error validating data: ValidationError(Deployment.spec.template.spec.containers[1].resources): unknown field \"limit\" in io.k8s.api.core.v1.ResourceRequirements; if you choose to ignore these errors, turn validation off with --validate=false", "stderr_lines": ["error: error validating \"/tmp/awx-config/deployment.yml\": error validating data: ValidationError(Deployment.spec.template.spec.containers[1].resources): unknown field \"limit\" in io.k8s.api.core.v1.ResourceRequirements; if you choose to ignore these errors, turn validation off with --validate=false"], "stdout": "", "stdout_lines": []}
  • cpu and memory mismatch on lines 55-59
0/3 nodes are available: 1 Insufficient memory, 1 PodToleratesNodeTaints, 3 Insufficient cpu.
  • Variable not cast to an integer:
TASK [kubernetes : Template Kubernetes AWX Config] ****************************************************************************************************************************************************************
fatal: [vars]: FAILED! => {"changed": false, "msg": "AnsibleError: Unexpected templating type error occurred on (apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: awx-config\n  namespace: {{ awx_kubernetes_namespace }}\ndata:\n  secret_key: {{ awx_secret_key }}\n  awx_settings: |\n    import os\n    import socket\n    ADMINS = ()\n    \n    # Container environments don't like chroots\n    AWX_PROOT_ENABLED = False\n\n    # Automatically deprovision pods that go offline\n    AWX_AUTO_DEPROVISION_INSTANCES = True\n\n    SYSTEM_TASK_ABS_CPU = {{ ((awx_task_cpu_request|default(1500) / 1000) * 4)|int }}\n    SYSTEM_TASK_ABS_MEM = {{ ((awx_task_mem_request|default(2) * 1024) / 100)|int }}\n\n    #Autoprovisioning should replace this\n    CLUSTER_HOST_ID = socket.gethostname()\n    SYSTEM_UUID = '00000000-0000-0000-0000-000000000000'\n\n    SESSION_COOKIE_SECURE = False\n    CSRF_COOKIE_SECURE = False \n\n    REMOTE_HOST_HEADERS = ['HTTP_X_FORWARDED_FOR']\n    \n    STATIC_ROOT = '/var/lib/awx/public/static'\n    PROJECTS_ROOT = '/var/lib/awx/projects'\n    JOBOUTPUT_ROOT = '/var/lib/awx/job_status'\n    SECRET_KEY = file('/etc/tower/SECRET_KEY', 'rb').read().strip()\n    ALLOWED_HOSTS = ['*']\n    INTERNAL_API_URL = 'http://127.0.0.1:8052'\n    SERVER_EMAIL = 'root@localhost'\n    DEFAULT_FROM_EMAIL = 'webmaster@localhost'\n    EMAIL_SUBJECT_PREFIX = '[AWX] '\n    EMAIL_HOST = 'localhost'\n    EMAIL_PORT = 25\n    EMAIL_HOST_USER = ''\n    EMAIL_HOST_PASSWORD = ''\n    EMAIL_USE_TLS = False\n    \n    LOGGING['handlers']['console'] = {\n        '()': 'logging.StreamHandler',\n        'level': 'DEBUG',\n        'formatter': 'simple',\n    }\n    \n    LOGGING['loggers']['django.request']['handlers'] = ['console']\n    LOGGING['loggers']['rest_framework.request']['handlers'] = ['console']\n    LOGGING['loggers']['awx']['handlers'] = ['console']\n    LOGGING['loggers']['awx.main.commands.run_callback_receiver']['handlers'] = ['console']\n    
LOGGING['loggers']['awx.main.commands.inventory_import']['handlers'] = ['console']\n    LOGGING['loggers']['awx.main.tasks']['handlers'] = ['console']\n    LOGGING['loggers']['awx.main.scheduler']['handlers'] = ['console']\n    LOGGING['loggers']['django_auth_ldap']['handlers'] = ['console']\n    LOGGING['loggers']['social']['handlers'] = ['console']\n    LOGGING['loggers']['system_tracking_migrations']['handlers'] = ['console']\n    LOGGING['loggers']['rbac_migrations']['handlers'] = ['console']\n    LOGGING['loggers']['awx.isolated.manager.playbooks']['handlers'] = ['console']\n    LOGGING['handlers']['callback_receiver'] = {'class': 'logging.NullHandler'}\n    LOGGING['handlers']['fact_receiver'] = {'class': 'logging.NullHandler'}\n    LOGGING['handlers']['task_system'] = {'class': 'logging.NullHandler'}\n    LOGGING['handlers']['tower_warnings'] = {'class': 'logging.NullHandler'}\n    LOGGING['handlers']['rbac_migrations'] = {'class': 'logging.NullHandler'}\n    LOGGING['handlers']['system_tracking_migrations'] = {'class': 'logging.NullHandler'}\n    LOGGING['handlers']['management_playbooks'] = {'class': 'logging.NullHandler'}\n    \n    DATABASES = {\n        'default': {\n            'ATOMIC_REQUESTS': True,\n            'ENGINE': 'django.db.backends.postgresql',\n            'NAME': \"{{ pg_database }}\",\n            'USER': \"{{ pg_username }}\",\n            'PASSWORD': \"{{ pg_password }}\",\n            'HOST': \"{{ pg_hostname|default('postgresql') }}\",\n            'PORT': \"{{ pg_port }}\",\n        }\n    }\n    BROKER_URL = 'amqp://{}:{}@{}:{}/{}'.format(\n        \"awx\",\n        \"abcdefg\",\n        \"localhost\",\n        \"5672\",\n        \"awx\")\n    CHANNEL_LAYERS = {\n        'default': {'BACKEND': 'asgi_amqp.AMQPChannelLayer',\n                    'ROUTING': 'awx.main.routing.channel_routing',\n                    'CONFIG': {'url': BROKER_URL}}\n    }\n    CACHES = {\n        'default': {\n            'BACKEND': 
'django.core.cache.backends.memcached.MemcachedCache',\n            'LOCATION': '{}:{}'.format(\"localhost\", \"11211\")\n        },\n        'ephemeral': {\n            'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',\n        },\n    }\n): unsupported operand type(s) for /: 'unicode' and 'int'"}

  • Attempt to run on a cluster with nodes with 4GB of memory and 2 cpu cores
0/3 nodes are available: 1 Insufficient memory, 1 PodToleratesNodeTaints, 3 Insufficient cpu.
ADDITIONAL INFORMATION

````
-apiVersion: extensions/v1beta1
+apiVersion: extensions/v1beta2
 kind: Deployment
 metadata:
   name: awx
@@ -52,11 +52,11 @@ spec:
               value: {{ default_admin_password|default('password') }}
           resources:
             requests:
-              memory: "{{ awx_task_cpu_request|default('2') }}Gi"
-              cpu: "{{ awx_task_mem_request|default('1500') }}m"
+              cpu: "{{ awx_task_cpu_request|default('1500')|int }}m"
+              memory: "{{ awx_task_mem_request|default('2')|int }}Gi"
             limit:
-              memory: "{{ awx_task_cpu_request|default('2') }}Gi"
-              cpu: "{{ awx_task_mem_request|default('1500') }}m"
+              cpu: "{{ awx_task_cpu_request|default('1500')|int }}m"
+              memory: "{{ awx_task_mem_request|default('2')|int }}Gi"
         - name: awx-rabbit
           image: ansible/awx_rabbitmq:{{ rabbitmq_version }}
           imagePullPolicy: Always
````
installer high bug

All 5 comments

The same cast is also needed in installer/kubernetes/templates/configmap.yml.j2:

-    SYSTEM_TASK_ABS_CPU = {{ ((awx_task_cpu_request|default(1500) / 1000) * 4)|int }}
-    SYSTEM_TASK_ABS_MEM = {{ ((awx_task_mem_request|default(2) * 1024) / 100)|int }}
+    SYSTEM_TASK_ABS_CPU = {{ ((awx_task_cpu_request|default(1500)|int / 1000) * 4)|int }}
+    SYSTEM_TASK_ABS_MEM = {{ ((awx_task_mem_request|default(2)|int * 1024) / 100)|int }}
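
As a rough illustration of why the cast in the diff above matters, the arithmetic can be reproduced in plain Python. This is only a stand-in for the Jinja2 expressions, not the template itself. The function names `task_abs_cpu` and `task_abs_mem` are hypothetical, and the sketch assumes that an overridden variable arrives as text, which is what the `'unicode' and 'int'` error in the report suggests.

```python
def task_abs_cpu(cpu_request_millicores):
    """Stand-in for ((awx_task_cpu_request|int / 1000) * 4)|int."""
    return int((int(cpu_request_millicores) / 1000) * 4)


def task_abs_mem(mem_request_gib):
    """Stand-in for ((awx_task_mem_request|int * 1024) / 100)|int."""
    return int((int(mem_request_gib) * 1024) / 100)


# Without a cast, dividing a text value by a number raises a TypeError,
# which is the same class of failure as the templating error in the report.
try:
    "1500" / 1000
except TypeError as exc:
    print("without cast:", exc)

# With the cast, text and integer inputs give the same result.
print(task_abs_cpu("1500"), task_abs_cpu(1500))
print(task_abs_mem("2"), task_abs_mem(2))
```

With the defaults from the template (1500 millicores and 2 Gi), the stand-in expressions evaluate to 6 and 20 respectively.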

We'd appreciate a PR with the changes you've suggested.

Sorry, my intention was not to be condescending. The installer was working yesterday and is broken today. After fixing it, an AWX instance that was running yesterday on a Kubernetes cluster with small nodes (2 CPUs and 4 Gi of memory) cannot start anymore today, which is quite frustrating. I spent the morning searching for solutions to get it working again, and I am trying to help by reporting my findings. I thought that opening a ticket was the right approach. I also thought it would be friendlier to explain things in the ticket before proposing a PR.

For limit, this was the first error that occurred:

unknown field \"limit\" in io.k8s.api.core.v1.ResourceRequirements

I found out that the error disappears when I set:

apiVersion: extensions/v1beta2

By mismatch I mean that the memory line uses the CPU variable and vice versa.

memory: "{{ awx_task_cpu_request|default('2') }}Gi"

So if you set awx_task_cpu_request=1500, you will request 1500Gi of memory, which is not the expected result.
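
The effect of the swapped variables can be sketched in plain Python. This is a simplified stand-in for the template lines, not the template itself. The helpers `buggy_requests` and `fixed_requests` are hypothetical names, and `None` is used here as an assumed substitute for an undefined Ansible variable.

```python
def buggy_requests(cpu_request=None, mem_request=None):
    """Mimics the swapped lines: memory reads the CPU variable and vice versa."""
    return {
        "memory": f"{cpu_request if cpu_request is not None else '2'}Gi",
        "cpu": f"{mem_request if mem_request is not None else '1500'}m",
    }


def fixed_requests(cpu_request=None, mem_request=None):
    """Mimics the corrected lines, including the integer cast."""
    return {
        "cpu": f"{int(cpu_request if cpu_request is not None else '1500')}m",
        "memory": f"{int(mem_request if mem_request is not None else '2')}Gi",
    }


# With no overrides the swapped defaults happen to look correct,
# so the bug only shows up once a variable is set explicitly.
print(buggy_requests())
print(buggy_requests(cpu_request="1500"))  # memory becomes "1500Gi"
print(fixed_requests(cpu_request="1500"))  # memory stays "2Gi"
```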

My analysis was wrong: the issue with limit is not the apiVersion, but that the field should be limits (with an s).
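
A quick way to catch this kind of misspelling before running kubectl is to check the keys of the resources mapping. The sketch below is an illustration only: `ALLOWED_RESOURCE_FIELDS` and `unknown_resource_fields` are hypothetical names, and the allowed set is an assumption limited to the two fields discussed in this thread (other Kubernetes versions may accept more).

```python
# Assumption: only the two fields discussed in this thread are listed here.
ALLOWED_RESOURCE_FIELDS = {"requests", "limits"}


def unknown_resource_fields(resources):
    """Return the keys of a container's resources mapping that are not recognised."""
    return sorted(set(resources) - ALLOWED_RESOURCE_FIELDS)


# The mapping as the broken template rendered it: "limit" without the s.
broken = {
    "requests": {"cpu": "1500m", "memory": "2Gi"},
    "limit": {"cpu": "1500m", "memory": "2Gi"},
}
# The same mapping with the field name corrected.
corrected = {
    "requests": {"cpu": "1500m", "memory": "2Gi"},
    "limits": {"cpu": "1500m", "memory": "2Gi"},
}

print(unknown_resource_fields(broken))
print(unknown_resource_fields(corrected))
```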

alrighty I think I have this fixed up in #1571 ... check it out and let me know what you think.
