Pytest: OSError('[Errno 24] Too many open files') with pytest 3.3.0

Created on 28 Nov 2017 · 10Comments · Source: pytest-dev/pytest

After upgrading to pytest 3.3.0 (previously on 3.2.5), my unitests using pytest and GitPython are failing with:

Cmd('git') not found due to: OSError('[Errno 24] Too many open files')

More precisely, all of my tests pass up to a certain number (I don't have the exact number right now), and then all the following tests start failing due to that error.

Unfortunately, I cannot share the code and tests (private projects).

Here's a partial pip list, after removing a set of private packages:

amqp (1.4.9)
anyjson (0.3.3)
APScheduler (3.2.0)
asn1crypto (0.23.0)
attrs (17.3.0)
backports.shutil-get-terminal-size (1.0.0)
beautifulsoup4 (4.6.0)
billiard (3.3.0.23)
celery (3.1.18)
celery-with-redis (3.0)
certifi (2017.11.5)
cffi (1.11.2)
chardet (3.0.4)
check-manifest (0.36)
CherryPy (3.8.2)
click (6.7)
colander (1.0)
colorama (0.3.9)
cornice (1.0.0)
coverage (4.4.2)
cryptography (2.1.3)
decorator (4.1.2)
devpi-client (2.3.0)
devpi-common (3.2.0)
enum34 (1.0.4)
fastavro (0.14.11)
funcsigs (1.0.2)
functools32 (3.2.3.post2)
future (0.14.3)
futures (3.0.5)
gitdb2 (2.0.3)
GitPython (2.1.7)
graphviz (0.8.1)
hiredis (0.2.0)
idna (2.6)
ipaddress (1.0.18)
ipdb (0.10.3)
ipython (5.5.0)
ipython-genutils (0.2.0)
iso8601 (0.1.12)
Jinja2 (2.10)
jsonschema (2.6.0)
kafka-python (0.9.5)
kazoo (2.2.1)
kombu (3.0.37)
lxml (3.4.4)
MarkupSafe (1.0)
MiniMock (1.2.8)
mock (1.0.1)
msgpack-python (0.4.6)
MySQL-python (1.2.5)
nose (1.3.7)
nose-parameterized (0.3.5)
objgraph (3.1.2)
PasteDeploy (1.5.2)
pathlib2 (2.3.0)
pbr (1.10.0)
pickleshare (0.7.4)
pip (9.0.1)
pkginfo (1.4.1)
pluggy (0.6.0)
plumbum (1.6.4)
prompt-toolkit (1.0.15)
psutil (3.1.1)
py (1.5.2)
pycparser (2.18)
pycrypto (2.6.1)
pycurl (7.19.5.1)
PyDispatcher (2.0.5)
Pygments (2.2.0)
pymongo (3.5.1)
pyOpenSSL (17.4.0)
pypiwin32 (219)
pyramid (1.5.8)
pyreadline (2.0)
pytest (3.3.0)
pytest-mock (1.6.3)
python-dateutil (2.4.2)
python-memcached (1.57)
python-memcached-stats (0.1)
pytz (2017.3)
PyYAML (3.11)
pyzmq (14.7.0)
redis (2.10.6)
repoze.lru (0.6)
requests (2.18.4)
scandir (1.6)
setuptools (28.8.0)
setuptools-scm (1.15.6)
simplegeneric (0.8.1)
simplejson (3.8.2)
six (1.11.0)
smmap2 (2.0.3)
SQLAlchemy (1.0.8)
sqlalchemy-migrate (0.11.0)
sqlparse (0.2.4)
statsd (3.2.1)
stevedore (1.20.1)
tabulate (0.7.7)
Tempita (0.5.2)
tox (2.9.1)
traitlets (4.3.2)
translationstring (1.3)
tzlocal (1.4)
urllib3 (1.22)
validators (0.10.1)
venusian (1.1.0)
virtualenv (15.1.0)
waitress (1.1.0)
wcwidth (0.1.7)
web.py (0.37)
WebOb (1.7.4)
WebTest (2.0.29)
wheel (0.29.0)
win-unicode-console (0.5)
wincertstore (0.2)
wrapt (1.10.11)
xmltodict (0.11.0)
zope.deprecation (4.3.0)
zope.interface (4.4.3)

What I can tell is that those tests perform a lot of "git clone" operation into temp directories made using the tmpdir and tmpdir_factory fixtures.
The only peculiar thing I can think of about my use case is that I have some fixtures from the "per test" level using "session" level fixtures.

Here's an almost identical copy of the conftest.py:

import os

import git
import pytest

@pytest.fixture(scope="session")
def remote_repo(tmpdir_factory):
    tmpdir = tmpdir_factory.mktemp('repos')
    tmpdir.chdir()
    os.mkdir('remote')

    # Repo init (on master by default)
    repo = git.Repo.init('remote')
    # I simplified/obfuscated a part of the git operation here.
    # lots of repo operation like making commits, creating branch and making tag...
    repo.index.commit('master 1st commit')
    repo.index.commit('master 2nd commit')
    master = repo.heads.master

    master.checkout()
    releases_1_0 = repo.create_head('releases/1.0')
    releases_1_0.checkout()
    repo.index.commit('releases 1.0 1st commit')
    repo.index.commit('releases 1.0 2nd commit')
    repo.create_tag('v1.0.0', message='v1.0.0')

    master.checkout()
    merge_base = repo.merge_base(master, releases_1_0)
    repo.index.merge_tree(releases_1_0, base=merge_base)
    repo.index.commit('1st Merge releases/1.0 into master',
                      parent_commits=(master.commit, releases_1_0.commit))

    master.checkout()
    repo.index.commit('master 3rd commit')
    repo.index.commit('master 4th commit')

    return repo


@pytest.fixture(scope="session")
def _local_repo(remote_repo):
    return remote_repo.clone(os.path.abspath('./local'))


@pytest.fixture()
def local_repo_per_session(_local_repo):
    os.chdir(_local_repo.working_dir)
    return _local_repo


@pytest.fixture()
def local_repo_per_test(tmpdir, remote_repo):
    clone = remote_repo.clone(str(tmpdir.join('./local')))
    os.chdir(clone.working_dir)
    return clone

And then I have tests using local_repo_per_session or local_repo_per_test depending on if the test is only doing reads, or if it will change the repo given by the fixture.

It feels as if the file handler for per-test level fixtures aren't closed properly anymore, but that's only my guess.

You can ask for more details about the tests, I'll try to give as much information as I can / am allowed to.

Pytest version: 3.3.0
Python: 2.7 (32 bits version)
OS: Windows 7/10

[x] Include a detailed description of the bug or suggestion
[x] pip list of the virtual environment you are using
[x] pytest and operating system versions
[x] Minimal example if possible

needs information

Source

vphilippon

Most helpful comment

Thank you for this well documented issue, I stumbled across the same thing 👍 @nicoddemus

tonybaloney on 14 Dec 2018

👍2

All 10 comments

Hi @vphilippon thanks for the detailed report.

Could you try to see which files are being kept open? ~~For Windows 10 you can use Process Explorer to see which files are being kept open by a process.~~ Actually it seems it doesn't provide that information, but it seems a custom script using psutil open_files() function might do the trick.

AFAIK tmpdir and tmpdir_factory don't keep any files open, they just create directories and the fixtures provide the path to the directory as their return value.

nicoddemus on 28 Nov 2017

Thanks for the quick reply, I'll check this tomorrow morning (EST)

vphilippon on 28 Nov 2017

👍1

Another thing to try: one new addition was the new logging plugin, try running the test suite with -p no:logging to see if that makes any difference.

nicoddemus on 28 Nov 2017

Reporting back:

Running with -p no:logging did not resolve the issue.
Using Process Explorer, I noted that a lot of git.exe process with the command line git cat-file --batch-check were running and accumulating, until the OSError occured (on pytest 3.2.5, those process would stop along the way during the tests, not accumulating).
~Using psutil, I can tell that the open_files() returns an empty list, so it's not really an issue of "opened file", but more of a "too many subprocess" issue.~ ~But I have some more testing to do here, I'll confirm this once I'm done~ ~Edit1: I completed my tests, I can confirm that this does return an empty list~ ~Edit2: Scratch that, even if I explicitly open a file, psutil.open_files() will still return an empty list. I'm doing something wrong here with my investigation.~
Edit3: I was using an old version of psutil that didn't work on Windows (10?), my bad. I'll get back on investigating this part.
The process leak only cause issues when running tests with the per-test fixture. That's not saying that the session-level fixture isn't keeping some unneeded process open, only that it really seems to be bound to the amount of git clone being performed, so running fewer git clone keep the leak at a reasonable level, if that makes sense.

Here's an example of what I get in with Process Explorer:
git-cat-file

vphilippon on 29 Nov 2017

Update on the psutil.open_files list:
On every test of a test suite run, the list of opened file is exactly:

popenfile(path='C:\\Windows\\SysWOW64\\en-US\\winnlsres.dll.mui', fd=-1)
popenfile(path='F:\\tmp\\tmpmyfgop', fd=-1)
popenfile(path='F:\\tmp\\tmprqrnxo', fd=-1)
popenfile(path='C:\\Windows\\SysWOW64\\en-US\\KernelBase.dll.mui', fd=-1)

with the same 2 temp files everytime. So nothing like something that would cause a "to many opened files" issue.
I get the same list with pytest 3.2.5.

vphilippon on 29 Nov 2017

The leaking process is not from the git clone, it's from another operation (see below).

I managed to make a reproducible case:
requirements.txt:

pytest==3.3.0
gitpython

conftest.py

import os

import git
import pytest


@pytest.fixture(scope="session")
def remote_repo(tmpdir_factory):
    tmpdir = tmpdir_factory.mktemp('repos')
    tmpdir.chdir()
    os.mkdir('remote')

    # Repo init (on master by default)
    repo = git.Repo.init('remote')
    repo.index.commit('master 1st commit')
    repo.index.commit('master 2nd commit')

    return repo


@pytest.fixture()
def local_repo_per_test(tmpdir, remote_repo):
    clone = remote_repo.clone(str(tmpdir.join('./local')))
    os.chdir(clone.working_dir)
    return clone

test.py

import pytest

@pytest.mark.parametrize('foo', ['bar',]*200)
def test_foobar(foo, local_repo_per_test):
    # This will cause the process leak.
    local_repo_per_test.head.reference = local_repo_per_test.commit('origin/master')

vphilippon on 29 Nov 2017

Thanks @vphilippon for following up!

I can reproduce the issue, and I see it is related to https://github.com/pytest-dev/pytest/issues/2968#issuecomment-348016955: you are relying on the garbage collector to implicitly close the Repo object.

If you explicitly close it the process leak goes away:

@pytest.fixture(scope="session")
def remote_repo(tmpdir_factory):
    tmpdir = tmpdir_factory.mktemp('repos')
    tmpdir.chdir()
    os.mkdir('remote')

    # Repo init (on master by default)
    repo = git.Repo.init('remote')
    repo.index.commit('master 1st commit')
    repo.index.commit('master 2nd commit')
    yield repo
    repo.close()


@pytest.fixture()
def local_repo_per_test(tmpdir, remote_repo):
    clone = remote_repo.clone(str(tmpdir.join('./local')))
    os.chdir(clone.working_dir)
    yield clone
    clone.close()

This is definitely a bug in pytest that we intend to fix, but either way explicitly closing your resources is considered good-practice. 👍

nicoddemus on 30 Nov 2017

Well color me surprised, I didn't even knew the Repo object was a closeable resource.
As someone once said: "Learn to love your bugs, as you will learn from them."
Thanks a lot @nicoddemus! 👍
I'll go ahead and close this one as the root issue is already tracked.

vphilippon on 30 Nov 2017

Great, thanks @vphilippon for patience and understanding!

nicoddemus on 30 Nov 2017

Thank you for this well documented issue, I stumbled across the same thing 👍 @nicoddemus

tonybaloney on 14 Dec 2018

👍2

Was this page helpful?

0 / 5 - 0 ratings