Connexion: Crash in app.add_api() and extreme startup slowness

Created on 2 Oct 2016  ·  10 Comments  ·  Source: zalando/connexion

Description

I am quite new to connexion (and also to Python) so, please, be lenient.
I am trying to use connexion to implement a sizable RESTful API for a custom project.

I have two distinct problems:

First and foremost, I get a crash after some apparently harmless changes to my Swagger definition file.
NOTE: editor.swagger.io imports both versions with a green "Processed with no error" flag.
If required I can attach both versions; I have not done so now for the sake of brevity.
In both cases App.py was generated using "Generate Server -> Python Flask" in editor.swagger.io.
The error (see below) gives no hint (to me, at least) about the real nature of the problem.

The second problem is that I am trying to use this in a very low-power environment (400MHz ARM9 / 256MB), and getting to the point where the error appears takes more than one minute (on a modern PC it takes less than one second). A simple "import connexion" takes about 20 seconds, and the "working" old version takes 3'20" to get through "app.add_api(...)".
Is this normal? (If so, I will unfortunately have to ditch connexion.)
I suspect a large part of the setup time is spent on validation which, IMHO, is useful only during development; in production the swagger.yaml file won't change, so it seems useless to revalidate it at each start.
I say this because the speed after setup doesn't look so terrible, but I might be very wrong, of course.

Expected behaviour

root@emotiqfp-00099:/WorkArm/web/api-0.5/python-flask-server0$ python
Python 3.4.3 (default, Oct  2 2016, 02:57:17)
[GCC 4.9.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import connexion
>>> app = connexion.App(__name__, specification_dir='./swagger/')
>>> app.add_api('swagger.yaml', arguments={'title': 'API REST per gestione configurazione stampante fiscale da pagine WEB'})
<connexion.api.Api object at 0xb6a4a7f0>
>>> app.run(server='flask', port=8080)
 * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
192.168.7.78 - - [02/Oct/2016 17:34:32] "GET /api/0.5/ui/ HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:33] "GET /api/0.5/ui/css/typography.css HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:33] "GET /api/0.5/ui/css/reset.css HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:33] "GET /api/0.5/ui/css/screen.css HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:33] "GET /api/0.5/ui/css/print.css HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:33] "GET /api/0.5/ui/lib/jquery-1.8.0.min.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:33] "GET /api/0.5/ui/lib/jquery.slideto.min.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:33] "GET /api/0.5/ui/lib/jquery.wiggle.min.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:33] "GET /api/0.5/ui/lib/jquery.ba-bbq.min.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:33] "GET /api/0.5/ui/lib/handlebars-2.0.0.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:34] "GET /api/0.5/ui/lib/js-yaml.min.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:34] "GET /api/0.5/ui/lib/lodash.min.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:34] "GET /api/0.5/ui/lib/backbone-min.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:34] "GET /api/0.5/ui/swagger-ui.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:35] "GET /api/0.5/ui/lib/highlight.7.3.pack.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:35] "GET /api/0.5/ui/lib/jsoneditor.min.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:35] "GET /api/0.5/ui/lib/marked.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:35] "GET /api/0.5/ui/lib/swagger-oauth.js HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:35] "GET /api/0.5/ui/images/favicon-16x16.png HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:35] "GET /api/0.5/ui/images/logo_small.png HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:35] "GET /api/0.5/ui/fonts/DroidSans-Bold.ttf HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:36] "GET /api/0.5/ui/fonts/DroidSans.ttf HTTP/1.1" 200 -
192.168.7.78 - - [02/Oct/2016 17:34:38] "GET /api/0.5/swagger.json HTTP/1.1" 200 -

Actual behaviour

root@emotiqfp-00099:/WorkArm/web/api-0.5/python-flask-server$ python
Python 3.4.3 (default, Oct  2 2016, 02:57:17)
[GCC 4.9.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import connexion
>>> app = connexion.App(__name__, specification_dir='./swagger/')
>>> app.add_api('swagger.yaml', arguments={'title': 'API REST per gestione configurazione stampante fiscale da pagine WEB'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/app.py", line 149, in add_api
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/api.py", line 107, in __init__
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/swagger_spec_validator/validator20.py", line 88, in validate_spec
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/swagger_spec_validator/validator20.py", line 151, in validate_apis
AttributeError: 'bool' object has no attribute 'get'
>>>

Steps to reproduce

Additional info:

Output of the commands:

  • python --version
    Python 3.4.3
  • pip show connexion | grep "^Version\:"
    Version: 1.0.112

All 10 comments

I think Connexion is not suitable for a 400MHz ARM9 / 256MB device. You could strip out the parts of Connexion that you think are not necessary and try to make it work.

@rafaelcaricio giving up that fast? :smirk:

I just tried out starting a Vagrant VirtualBox with 256MB of RAM and CPU cap (vb.customize ["modifyvm", :id, "--cpuexecutioncap", "10"], i.e. 10% of CPU having "Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz" on host machine) --- result is OOM while trying to compile gevent ("Running setup.py install for gevent") on first try (second try worked).

=> My connexion-example actually works fine, but is very slow of course:

vagrant@vagrant-ubuntu-trusty-64:/vagrant$ cat /proc/cpuinfo  | grep bogomips
bogomips    : 1916.92
vagrant@vagrant-ubuntu-trusty-64:/vagrant$ date && python3 app.py 
Thu Oct  6 17:55:55 UTC 2016
2016-10-06 17:55:59.637271
2016-10-06 17:56:00.558561
WARNING:connexion.operation:... OAuth2 token info URL missing. **IGNORING SECURITY REQUIREMENTS**
WARNING:connexion.operation:... OAuth2 token info URL missing. **IGNORING SECURITY REQUIREMENTS**
WARNING:connexion.operation:... OAuth2 token info URL missing. **IGNORING SECURITY REQUIREMENTS**
WARNING:connexion.operation:... OAuth2 token info URL missing. **IGNORING SECURITY REQUIREMENTS**
WARNING:connexion.operation:... OAuth2 token info URL missing. **IGNORING SECURITY REQUIREMENTS**
WARNING:connexion.operation:... OAuth2 token info URL missing. **IGNORING SECURITY REQUIREMENTS**
WARNING:connexion.operation:... OAuth2 token info URL missing. **IGNORING SECURITY REQUIREMENTS**
WARNING:connexion.operation:... OAuth2 token info URL missing. **IGNORING SECURITY REQUIREMENTS**
2016-10-06 17:56:01.487425
INFO:connexion.app:Listening on None:8080..

Running the app.py leaves us with nearly no memory:

vagrant@vagrant-ubuntu-trusty-64:/vagrant$ free -h
             total       used       free     shared    buffers     cached
Mem:          237M       227M        10M       656K        13M        81M
-/+ buffers/cache:       131M       106M
Swap:           0B         0B         0B

Thanks for the answers.
I'm fully aware my hardware is "a bit" (!) obsolete, but the times involved are really unexpected and, IMHO, would need a deeper analysis.

The worst problem, however, is that I get an error I'm unable to track down: AttributeError: 'bool' object has no attribute 'get'.

How am I supposed to find the offending declaration?

Thanks again.

@mcondarelli maybe you can post your Swagger YAML here? I suppose some code in Connexion expects a dict in the place where you have a bool right now (in the YAML?).
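
One way to locate such an entry is to walk the loaded spec and report every place under "paths" where a mapping is expected but something else is found. The sketch below is only an illustration: `find_non_mapping_entries` and the example spec are hypothetical, and it assumes the YAML has already been loaded into plain Python objects (for instance with PyYAML, which follows YAML 1.1 and loads unquoted words such as yes, no, on and off as booleans, so that is one possible source of a stray bool).

```python
def find_non_mapping_entries(spec):
    """Return (location, value) pairs where a mapping was expected under "paths"."""
    problems = []
    for path, path_item in spec.get("paths", {}).items():
        if not isinstance(path_item, dict):
            problems.append((str(path), path_item))
            continue
        for key, operation in path_item.items():
            # "parameters" is a list and "$ref" a string at path-item level;
            # "x-" vendor extensions may hold any value, so skip all three.
            if key in ("parameters", "$ref") or str(key).startswith("x-"):
                continue
            if not isinstance(operation, dict):
                problems.append(("{} -> {}".format(path, key), operation))
    return problems


# Hypothetical spec fragment, already loaded into plain Python objects.
example_spec = {
    "paths": {
        "/config/status": {"get": {"operationId": "api.status"}},
        "/config/enabled": True,           # a bare boolean instead of a path item
        "/config/reset": {"post": False},  # a boolean instead of an operation
    }
}

for location, value in find_non_mapping_entries(example_spec):
    print("{}: expected a mapping, found {!r}".format(location, value))
```

Run against the real file, each reported location points at a line of the YAML worth inspecting by hand; the check says nothing about whether that entry is the one the validator tripped over.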

Hi, sorry for the delay; it has been a loooong Monday.

My tentative app is quite simple:

from datetime import datetime
print('....[', datetime.now(), '] App.py: ')
import connexion
print('....[', datetime.now(), '] Connexion loaded')
app = connexion.App(__name__, specification_dir='.')
print('....[', datetime.now(), '] Loading swagger specs')
app.add_api('EmotiqP.yaml', arguments={'title': 'API REST per gestione configurazione stampante fiscale da pagine WEB'})
print('....[', datetime.now(), '] starting server')
app.run(host="0.0.0.0", port=8080)
print('...[', datetime.now(), '] .all done')

The (quite large) Swagger definition involved is attached here as EmotiqP.yaml.txt (I had to add ".txt" to make it palatable to GitHub), and the results are:

Running '/var/www/api-0.5/App.py'...
....[ 2016-10-10 23:29:56.145642 ] App.py:
....[ 2016-10-10 23:30:33.007760 ] Connexion loaded
....[ 2016-10-10 23:30:33.089683 ] Loading swagger specs
Failed to add operation for POST /api/0.5/config/ImportCustomerData
Traceback (most recent call last):
  File "/var/www/api-0.5/App.py", line 7, in <module>
    app.add_api('EmotiqP.yaml', arguments={'title': 'API REST per gestione configurazione stampante fiscale da pagine WEB'})
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/app.py", line 149, in add_api
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/api.py", line 148, in __init__
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/api.py", line 220, in add_paths
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/six.py", line 686, in reraise
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/api.py", line 209, in add_paths
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/api.py", line 182, in add_operation
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/operation.py", line 175, in __init__
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/resolver.py", line 51, in resolve
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/resolver.py", line 72, in resolve_function_from_operation_id
  File "home/mcon/emotiq/Buildroot/output/target/usr/lib/python3.4/site-packages/connexion/utils.py", line 100, in get_function_from_name
ValueError: need more than 1 value to unpack
Terminated.
Finalizing... done.

As I said, startup (even in versions without errors) takes more than three minutes. Lack of memory doesn't look like the reason for the slowness: CPU usage is always above 90%.

Any hint on why this is wrong (or on a way to speed up parsing) would be very welcome.

  '/config/ImportCustomerData':
    post:
      operationId: customerImport
      description: ImportCustomerData
      tags:
        - srv_PHP
        - Programming
      responses:
        '200':
          description: 'esito ok'
        default:
          description: 'errore, segue descrizione errore'
          schema:
            type: string

TiA

@mcondarelli what I see immediately: your operationId value needs to contain a dot, e.g. mymodule.my_function.
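
A minimal sketch of why a dotless operationId can produce that ValueError, assuming the resolver splits the name on its last dot into a module part and a function part (which is what the traceback through get_function_from_name suggests, but is not confirmed here). `split_operation_id` and `undotted_operation_ids` are hypothetical helpers, not part of Connexion, and the exact wording of the error message depends on the Python version.

```python
def split_operation_id(operation_id):
    # Assumed convention: "package.module.function", split on the last dot.
    module_name, function_name = operation_id.rsplit(".", 1)
    return module_name, function_name


def undotted_operation_ids(spec):
    """List (path, method, operationId) for operationIds without a module part."""
    found = []
    for path, path_item in spec.get("paths", {}).items():
        if not isinstance(path_item, dict):
            continue
        for method, operation in path_item.items():
            if not isinstance(operation, dict):
                continue
            operation_id = operation.get("operationId")
            if operation_id is not None and "." not in operation_id:
                found.append((path, method, operation_id))
    return found


print(split_operation_id("mymodule.my_function"))

try:
    split_operation_id("customerImport")  # no dot: rsplit yields a single item
except ValueError as exc:
    print("ValueError:", exc)

# Spec fragment taken from the path posted above, as plain Python objects.
example_spec = {
    "paths": {
        "/config/ImportCustomerData": {
            "post": {"operationId": "customerImport"},
        }
    }
}
print(undotted_operation_ids(example_spec))
```

Listing every dotless operationId up front avoids fixing them one failed startup at a time, which matters when each attempt takes minutes.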

Thanks for the fast answer.

This may well be, but it is strange, because _all_ my operationIds are single identifiers: the Swagger definition was developed for another (home-brew) code generator.
Is there any reason why it blows up on that specific entry?
... or is it just a matter of the "randomizing dictionary" [mis]feature of Python?

I'll try fixing that.

For the slowness problem: would breaking the spec into several pieces help?

Thanks.
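
Before splitting the spec, it may be worth measuring where the startup time actually goes. The sketch below uses the standard-library cProfile and pstats modules; `profile_call` is a hypothetical helper and `slow_startup_step` is only a placeholder workload standing in for a real step such as app.add_api(...).

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(15)  # keep only the top 15 entries
    return result, stream.getvalue()


def slow_startup_step():
    # Placeholder workload; replace with the call whose cost is in question.
    return sum(i * i for i in range(200000))


_, report = profile_call(slow_startup_step)
print(report)
```

In the App.py shown earlier, the same helper could presumably wrap the add_api call, for example profile_call(app.add_api, 'EmotiqP.yaml', arguments={...}), to see whether validation, YAML parsing, or something else dominates. Profiling adds overhead of its own, so on slow hardware the absolute times will be inflated; the ranking of functions is the useful part.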

I know this issue is old, but I had a similar issue as OP: super slow application start because the YAML file grew very big. It made the development experience quite painful and made using the app in a debugger almost impossible.

I solved the issue by adding a cache on the "compilation" of the YAML file into a Connexion Specification object. It dumps the pickled Specification into file and uses the md5 of the YAML file to detect if it can reuse the pickled data or if it needs to regenerate it. Works like a charm.

Hi @NicolasLM, can you please give me more insight into your solution? I am also facing the same issue.

Thanks in advance

Here is the code that I use to cache the Specification:

import hashlib
from logging import getLogger
import pathlib
import pickle

from connexion.spec import Specification

logger = getLogger(__name__)


class CachedSpecification(Specification):
    """Cache the built API specification.

    Building and loading our OpenAPI specification is very slow, by caching
    the result we can drastically reduce the reload time of the application.
    The cache is invalidated when the yaml file changes.
    """

    @classmethod
    def from_file(cls, spec, arguments=None):
        md5_hash = cls.md5(spec)
        cache_file = spec + '.cache'
        try:
            with open(cache_file, 'rb') as f:
                cache = pickle.load(f)
                if cache['md5_hash'] == md5_hash:
                    logger.info('Loaded spec from cache')
                    return cache['spec']
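        except (OSError, EOFError, pickle.UnpicklingError):
            # Missing, truncated, or corrupt cache file: fall through and rebuild.
            pass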

        rv = cls._real_from_file(spec, arguments=arguments)
        try:
            with open(cache_file, 'wb') as f:
                cache = {
                    'md5_hash': md5_hash,
                    'spec': rv
                }
                pickle.dump(cache, f)
                logger.info('Stored spec in cache')
        except OSError as e:
            logger.warning('Could not store spec in cache: %s', e)

        return rv

    @classmethod
    def _real_from_file(cls, spec, arguments=None):
        """
        Takes in a path to a YAML file, and returns a Specification
        """
        specification_path = pathlib.Path(spec)
        spec = cls._load_spec_from_file(arguments, specification_path)
        return cls.from_dict(spec)

    @classmethod
    def md5(cls, file) -> str:
        hash_md5 = hashlib.md5()
        with open(file, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b""):
                hash_md5.update(chunk)
        return hash_md5.hexdigest()

It can be used with a monkey patch before initializing connexion and reading the yaml file:

from connexion.spec import Specification
Specification.from_file = CachedSpecification.from_file