Running the same command on an image that hasn't been downloaded is much slower with the SDK than with the CLI. Does the CLI have access to a cache that the SDK doesn't? Are the two tests not really comparable? I'm new to this, so apologies if I'm not using it correctly.
$ pip freeze | grep docker && python --version && docker version
docker==2.1.0
docker-pycreds==0.2.1
Python 2.7.13 :: Continuum Analytics, Inc.
Client:
Version: 17.03.0-ce
API version: 1.26
Go version: go1.7.5
Git commit: 60ccb22
Built: Thu Feb 23 10:40:59 2017
OS/Arch: darwin/amd64
Server:
Version: 17.03.0-ce
API version: 1.26 (minimum version 1.12)
Go version: go1.7.5
Git commit: 3a232c8
Built: Tue Feb 28 07:52:04 2017
OS/Arch: linux/amd64
Experimental: true
$ sw_vers -productVersion
10.11.6 # MacOS
import docker
import unittest
from subprocess import Popen, PIPE


class DockerTests(unittest.TestCase):

    def test_hello_world_sdk(self):
        # Slow if image not already downloaded
        client = docker.from_env()
        output = client.containers.run("ubuntu", "echo hello world")
        self.assertEqual(output, 'hello world\n')

    def test_hello_world_popen(self):
        # Even if download is necessary, runs in a couple seconds
        p = Popen(['docker', 'run', 'ubuntu', 'echo', 'hello', 'world'],
                  stdout=PIPE)
        output = p.stdout.read()
        self.assertEqual(output, 'hello world\n')
Using the SDK is fast if the image has already been downloaded:
$ python -m unittest -v docker_engine_app.tests.DockerTests.test_hello_world_sdk
test_hello_world_sdk (docker_engine_app.tests.DockerTests) ... ok
----------------------------------------------------------------------
Ran 1 test in 1.557s
OK
But much slower starting from scratch:
$ docker rmi ubuntu
Untagged: ubuntu:latest
$ python -m unittest -v docker_engine_app.tests.DockerTests.test_hello_world_sdk
test_hello_world_sdk (docker_engine_app.tests.DockerTests) ... ok
----------------------------------------------------------------------
Ran 1 test in 57.183s
OK
But if the CLI needs to download the image, it isn't anywhere near as slow.
$ docker rmi ubuntu
Untagged: ubuntu:latest
$ python -m unittest -v docker_engine_app.tests.DockerTests.test_hello_world_popen
test_hello_world_popen (docker_engine_app.tests.DockerTests) ... Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
Digest: sha256:dd7808d8792c9841d0b460122f1acf0a2dd1f56404f8d1e56298048885e45535
Status: Downloaded newer image for ubuntu:latest
ok
----------------------------------------------------------------------
Ran 1 test in 2.298s
OK
The difference in behavior can be isolated to the pull:
    # Methods added to DockerTests; they also need `from subprocess import call`.
    def test_hello_world_sdk_with_cli_pull(self):
        client = docker.from_env()
        call(['docker', 'pull', 'ubuntu'])
        output = client.containers.run("ubuntu", "echo hello world")
        self.assertEqual(output, 'hello world\n')

    def test_hello_world_sdk_with_sdk_pull(self):
        client = docker.from_env()
        client.images.pull('ubuntu')
        output = client.containers.run("ubuntu", "echo hello world")
        self.assertEqual(output, 'hello world\n')
$ docker rmi ubuntu
Untagged: ubuntu:latest
$ python -m unittest -v docker_engine_app.tests.DockerTests.test_hello_world_sdk_with_cli_pull
test_hello_world_sdk_with_cli_pull (docker_engine_app.tests.DockerTests) ... Using default tag: latest
latest: Pulling from library/ubuntu
Digest: sha256:dd7808d8792c9841d0b460122f1acf0a2dd1f56404f8d1e56298048885e45535
Status: Downloaded newer image for ubuntu:latest
ok
----------------------------------------------------------------------
Ran 1 test in 2.188s
OK
$ docker rmi ubuntu
Untagged: ubuntu:latest
$ python -m unittest -v docker_engine_app.tests.DockerTests.test_hello_world_sdk_with_sdk_pull
test_hello_world_sdk_with_sdk_pull (docker_engine_app.tests.DockerTests) ... ok
----------------------------------------------------------------------
Ran 1 test in 63.027s
OK
docker pull ubuntu is actually translated into docker pull ubuntu:latest. The same goes for your rmi command, which only untags the ubuntu:latest image (you probably have ubuntu:16.04 tagged as well, which prevents the CLI from actually removing the associated layers). So your CLI command is very fast because it's not actually downloading any new data, just checking that the tag matches the version you already have locally and re-tagging it accordingly.
On the other hand, the API (and the Python API client) when asked to pull ubuntu, actually pulls the entire repository (all images tagged in the official ubuntu repository, of which there are a lot).
If you change your code to use equivalent pull commands, I believe you will see comparable execution times:
    # Methods of DockerTests; they also need `from subprocess import call`.
    def test_hello_world_sdk_with_cli_pull(self):
        client = docker.from_env()
        call(['docker', 'pull', 'ubuntu'])
        output = client.containers.run("ubuntu", "echo hello world")
        self.assertEqual(output, 'hello world\n')

    def test_hello_world_sdk_with_sdk_pull(self):
        client = docker.from_env()
        # Explicit tag, so only one image is pulled, like the CLI does
        client.images.pull('ubuntu:latest')
        output = client.containers.run("ubuntu", "echo hello world")
        self.assertEqual(output, 'hello world\n')
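If you'd rather not have to remember the tag every time, one option is a small wrapper that adds it for you. This is only a rough sketch: `with_default_tag` is a hypothetical helper, not something the SDK provides, and it assumes simple string handling is good enough (it does not validate the image reference).

```python
def with_default_tag(image, default_tag="latest"):
    """Return `image` with an explicit tag, mimicking the CLI's default.

    Hypothetical helper for illustration; not part of the docker SDK.
    """
    # A digest reference (name@sha256:...) already pins a single image.
    if "@" in image:
        return image
    # Only look for a tag after the last "/", so a registry port such as
    # "localhost:5000/ubuntu" is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    if ":" in last_component:
        return image
    return "{}:{}".format(image, default_tag)
```

With that in place, the SDK calls above could be written as client.images.pull(with_default_tag('ubuntu')) or client.containers.run(with_default_tag('ubuntu'), 'echo hello world'), which should behave like the tagged versions.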
Thanks! Yes: changing client.containers.run('ubuntu', 'echo hello world') to client.containers.run('ubuntu:latest', 'echo hello world') fixes this, even without an explicit pull. The difference in behavior surprises me coming from the command line, but it's probably just me.
In your defense, our SDK docs say "similar to docker pull" - it should probably clarify in which ways it's not similar!
Anyway, glad I could help!
I still see docker pull in bash running much faster than the Python code. I guess it's multithreaded?