Phpunit: New Test Runner

Created on 16 Jul 2018 · 18Comments · Source: sebastianbergmann/phpunit

Introduction

This ticket supersedes #10.

Thanks to @epdenouden, the reordering of tests is a solved problem. This is something that I did not think possible without rewriting the test runner. And thanks to @Idrinth, it might be possible to delay the execution of data providers without rewriting the test runner.

However, as valuable as these contributions are, I do not think that fundamentally rewriting the test runner is something that can be avoided. The world of PHP has changed a lot in the 18 years since PHPUnit's inception: a lot of work(arounds) that is (are) done while finding and loading tests alone make(s) no more sense. Assumptions have changed (global variables are no longer important, for instance) and components such as PHP-Parser and BetterReflection are now available and remove the need to really load test classes before they are needed.

At its core, the current test runner has the same architecture and design as the one released as part of PHPUnit 2. Over the years, more and more features were added, layering workaround upon workaround to overcome the test runner's limitations. For almost two decades, this test runner has served us well (at least good enough). It's time to re-think test execution and come up with a new test runner that, hopefully, serves us just as well but without causing so many headaches when it comes to maintenance and the implementation of new features.

How tests are executed today

Before the first test is executed:

Test director{y|ies} {is|are} searched recursively for files with names ending in Test.php (default)
Each such file is loaded
Classes that were not declared before the loading of such a file but are declared the loading are introspected using the Reflection API
If such a newly found class extends PHPUnit\Framework\TestCase then each of its public methods are introspected using the Reflection API
If such a public method has a name that begins with test or if the method's docblock contains a @test annotation then the method is considered a test method
If the method's docblock contains a @dataProvider annotation then the referenced data provider is executed and the next step is performed for each data set returned
An object of the test case class is created for the test method (or multiple objects, one for each data set, when @dataProvider is used)
This test object is added to a PHPUnit\Framework\TestSuite object

The above means that we have one object for each test that is to be executed before the first test is executed. These objects will remain in memory until the end of the PHP process that executed the test runner. Furthermore, data providers cannot leverage the benefits provided by generators when generators are used to implement data providers.

The above also means that data providers are executed and test objects are created even for tests that will not be executed later on because they are filtered, for instance using --filter, --group, or --exclude-group.

test-execution

The actual execution of tests is split across the PHPUnit\Framework\TestCase::run(), PHPUnit\Framework\TestResult::run(), PHPUnit\Framework\TestCase::runBare(), and PHPUnit\Framework\TestCase::runTest() methods (see sequence diagram shown above). This is confusing and should be simplified.

Another implementation aspect that should be simplified (by removing it) is PHPUnit\Framework\TestSuite. This object is a remnant from days long gone when PHPUnit was not able to search the filesystem for tests and required the manual addition of test classes to PHPUnit\Framework\TestSuite objects in code.

How tests could be executed in the future

Test director{y|ies} {is|are} searched recursively for files with names ending in Test.php (default)
Each such file is statically analysed (without actually loading it)
A PHPUnit\Runner\Test value object is created for each test method found
These value objects are collected in a PHPUnit\Runner\TestCollection
A test runner implementation iterates over the value objects of a PHPUnit\Runner\TestCollection to run the tests
Only at this point in time when a test is actually executed the test case object is created
After a test has been executed the respective test case object is destructed
If the test uses a data provider then the test runner will iterate over the data provided, creating a fresh test case object for each data set, etc.

Reordering and filtering tests, for instance, are operations that should be performed on the TestCollection after all tests have been collected and before the first test is executed.

Here is an idea for what PHPUnit\Runner\TestMethod could look like:

<?php declare(strict_types=1);

namespace PHPUnit\Runner;

final class TestMethod implements Test
{
    /**
     * @var string
     */
    private $sourceFile;

    /**
     * @var string
     */
    private $className;

    /**
     * @var string
     */
    private $methodName;

    /**
     * @var AnnotationCollection
     */
    private $classLevelAnnotations;

    /**
     * @var AnnotationCollection
     */
    private $methodLevelAnnotations;

    public function __construct(string $sourceFile, string $className, string $methodName, AnnotationCollection $classLevelAnnotations, AnnotationCollection $methodLevelAnnotations)
    {
        $this->sourceFile             = $sourceFile;
        $this->className              = $className;
        $this->methodName             = $methodName;
        $this->classLevelAnnotations  = $classLevelAnnotations;
        $this->methodLevelAnnotations = $methodLevelAnnotations;
    }

    public function sourceFile(): string
    {
        return $this->sourceFile;
    }

    public function className(): string
    {
        return $this->className;
    }

    public function methodName(): string
    {
        return $this->methodName;
    }

    public function classLevelAnnotations(): AnnotationCollection
    {
        return $this->classLevelAnnotations;
    }

    public function methodLevelAnnotations(): AnnotationCollection
    {
        return $this->methodLevelAnnotations;
    }
}

Here is an idea for how finding tests could be implemented:

<?php declare(strict_types=1);

namespace PHPUnit\Runner;

use PHPUnit\Framework\Assert;
use PHPUnit\Framework\TestCase;
use Roave\BetterReflection\BetterReflection;
use Roave\BetterReflection\Reflection\ReflectionClass;
use Roave\BetterReflection\Reflection\ReflectionMethod;
use Roave\BetterReflection\Reflector\ClassReflector;
use Roave\BetterReflection\SourceLocator\Exception\EmptyPhpSourceCode;
use Roave\BetterReflection\SourceLocator\Type\AggregateSourceLocator;
use Roave\BetterReflection\SourceLocator\Type\AutoloadSourceLocator;
use Roave\BetterReflection\SourceLocator\Type\PhpInternalSourceLocator;
use Roave\BetterReflection\SourceLocator\Type\StringSourceLocator;
use Symfony\Component\Finder\Finder;
use Symfony\Component\Finder\SplFileInfo;

final class TestFinder
{
    /**
     * @var Cache
     */
    private $cache;

    public function __construct(Cache $cache)
    {
        $this->cache = $cache;
    }

    /**
     * @throws EmptyPhpSourceCode
     * @throws \RuntimeException
     * @throws \InvalidArgumentException
     */
    public function find(array $directories): TestCollection
    {
        $tests = new TestCollection;

        foreach ($this->findTestFilesInDirectories($directories) as $file) {
            if ($this->cache->has($file->getRealPath())) {
                $testsInFile = $this->cache->get($file->getRealPath());
            } else {
                $testsInFile = $this->findTestsInFile($file);

                $this->cache->set($file->getRealPath(), $testsInFile);
            }

            $tests->addFrom($testsInFile);
        }

        return $tests;
    }

    /**
     * @throws \InvalidArgumentException
     */
    private function findTestFilesInDirectories(array $directories): Finder
    {
        $finder = new Finder;

        $finder->files()
               ->in($directories)
               ->name('*Test.php')
               ->sortByName();

        return $finder;
    }

    /**
     * @throws \RuntimeException
     * @throws EmptyPhpSourceCode
     */
    private function findTestsInFile(SplFileInfo $file): TestCollection
    {
        $tests = new TestCollection;

        foreach ($this->findClassesInFile($file) as $class) {
            if (!$this->isTestClass($class)) {
                continue;
            }

            $className             = $class->getName();
            $sourceFile            = $file->getRealPath();
            $classLevelAnnotations = $this->annotations($class->getDocComment());

            foreach ($class->getMethods() as $method) {
                if (!$this->isTestMethod($method)) {
                    continue;
                }

                $tests->add(
                    new TestMethod(
                        $sourceFile,
                        $className,
                        $method->getName(),
                        $classLevelAnnotations,
                        $this->annotations($method->getDocComment())
                    )
                );
            }
        }

        return $tests;
    }

    /**
     * @throws \RuntimeException
     * @throws EmptyPhpSourceCode
     *
     * @return ReflectionClass[]
     */
    private function findClassesInFile(SplFileInfo $file): array
    {
        $reflector = new ClassReflector($this->createSourceLocator($file->getContents()));

        return $reflector->getAllClasses();
    }

    /**
     * @throws EmptyPhpSourceCode
     */
    private function createSourceLocator(string $source): AggregateSourceLocator
    {
        $astLocator = (new BetterReflection())->astLocator();

        return new AggregateSourceLocator(
            [
                new StringSourceLocator($source, $astLocator),
                new AutoloadSourceLocator($astLocator),
                new PhpInternalSourceLocator($astLocator)
            ]
        );
    }

    private function isTestClass(ReflectionClass $class): bool
    {
        return !$class->isAbstract() && $class->isSubclassOf(TestCase::class);
    }

    private function isTestMethod(ReflectionMethod $method): bool
    {
        if (\strpos($method->getName(), 'test') !== 0) {
            return false;
        }

        if ($method->isAbstract() || !$method->isPublic()) {
            return false;
        }

        if ($method->getDeclaringClass()->getName() === Assert::class) {
            return false;
        }

        if ($method->getDeclaringClass()->getName() === TestCase::class) {
            return false;
        }

        return true;
    }

    private function annotations(string $docBlock): AnnotationCollection
    {
        $annotations = new AnnotationCollection;
        $docBlock    = (string) \substr($docBlock, 3, -2);

        if (\preg_match_all('/@(?P<name>[A-Za-z_-]+)(?:[ \t]+(?P<value>.*?))?[ \t]*\r?$/m', $docBlock, $matches)) {
            $numMatches = \count($matches[0]);

            for ($i = 0; $i < $numMatches; ++$i) {
                $annotations->add(
                    new Annotation(
                        (string) $matches['name'][$i],
                        (string) $matches['value'][$i]
                    )
                );
            }
        }

        return $annotations;
    }
}

A cache is used for avoiding the expensive static analysis for test sources that have not changed since the last execution of the test suite.

Executing a test could be as simple as this:

<?php declare(strict_types=1);

namespace PHPUnit\Runner;

final class TestMethodExecutor
{
    public function execute(TestMethod $testMethod): void
    {
        require_once $testMethod->sourceFile();

        $className  = $testMethod->className();
        $methodName = $testMethod->methodName();

        $test = new $className;

        $test->$methodName();
    }
}

Of course, this initial prototype has no support for data providers, hook methods (setUp(), @before, ...), etc. I am confident, though, that these can be implement with ease and in such a way that readability of the code does not diminish.

The proof-of-concept code is available here.

typenhancement

Source

sebastianbergmann

👍8 🎉4

Most helpful comment

Supporting lazily loaded data providers is the most wanted feature for us :)

marcospassos on 20 Feb 2019

👍4

All 18 comments

@sebastianbergmann great! I look forward to digesting (and working :) on these topics.

epdenouden on 16 Jul 2018

@sebastianbergmann this looks like a major improvement, both for readability as well as performance.

I'm wondering if the test collection shouldn't also be done as late as possible:
yield from TestFileIterator [any file ending in Test.php]
yield from TestClassIterator [any none-abstract class extending TestCase]
yield from TestMethodIterator [all test methods in class]
yield from TestCaseIterator [1 per dataset, none data provided methods just yield their single instance]

That would prevent PHPUnit from loading all the value objects(those might be pretty big) beforehand. This would lead to a huge amount of yielding of course, but likely wouldn't make the code harder to read, you'd just end up iterating over a slightly different iterator in the end - one way smaller in memory most likely.

Idrinth on 16 Jul 2018

@Idrinth That is definitely something I would like to explore once we have a "real" proof-of-concept.

sebastianbergmann on 16 Jul 2018

Probably off-topic, but still a big change... Would it be an idea to move all this to the phpunit organization on github? Primarily because:

Bus/truck/plane factor when it comes to the repositories
sebastianbergmann is rather long and difficult to type (I always reach it through google instead)
Would be more in line with phpunit/phpunit in composer

linaori on 17 Jul 2018

👍4

took the liberty of implementing a version of the discussed ideas as a tiny POC: https://github.com/Idrinth/phpunit-test-runner

getting tests as lazy as I can make it
support for @dataProvider
support for @testWith
support for @test
group filtering

The minimalistic repo should make it easier to have a quick look I hope.

Idrinth on 17 Jul 2018

I'm wondering if dropping @depends and @testWith would make sense - both should be easy enough to handle via a dataprovider annotation instead and it would lead to less code to maintain as well as better test(case) separation by default.
Not sure if I'm overlooking a benefit here?

Idrinth on 17 Jul 2018

👍2

@depends is valuable when writing integration tests, for instance, where you test a workflow step by step. It's not primarily about passing fixture from test to the next.

sebastianbergmann on 18 Jul 2018

I do not use @testWith myself. Somebody sent a pull request that did not break anything.

sebastianbergmann on 18 Jul 2018

Great work

geekcom on 18 Jul 2018

Any news on this?
Would be really awesome to finally be able to write dataProviders without ugly hacks

emulgeator on 12 Nov 2018

I'm looking forward to seeing this in the future.

Personally my biggest complaint about the current test runner comes from the inability to replace it with a custom one short of subclassing the PHPUnit\TextUI\Command class and rewriting phpunit executable with it.

Another problem, although maybe not directly caused by the runner itself, is the lack of a mechanism to register a PHPUnit Extension or Listener at runtime. For example this is useful when a listener depends on an object which itself needs constructor injection, which in some cases it's unfeasible using XML configuration alone.

metaturso on 9 Jan 2019

Supporting lazily loaded data providers is the most wanted feature for us :)

marcospassos on 20 Feb 2019

👍4

Are there plans to support parallel test execution in the new runner implementation?

rask on 26 Jun 2019

Supporting lazily loaded data providers is the most wanted feature for us :)

@marcospassos working on that for v8, see #3736

Are there plans to support parallel test execution in the new runner implementation?

@rask yes, parallel execution is something that is on my active to-do list. In v8 this could be achieved by extending the @run[..]InSeperateProcess and folding the results back into the main runner.

epdenouden on 26 Jun 2019

👍1

Quoting myself:

Every developer knows the wish "Let's throw it all away and start over on the green field!". This wish is also not foreign to me and so I have thought several times over the years to develop PHPUnit as a whole or in parts (for example "only" the test runner) from scratch. [...] So far, I have always decided against such a revolutionary approach. Firstly, because it would require a lot of work that would have to be done in one piece. Above all, however, because with such a fresh start it would be almost inevitable that all tests using PHPUnit would have to be adjusted at least minimally -- but probably significantly. For me, therefore, only an evolutionary approach makes sense, in which parts of PHPUnit are exchanged or improved step by step.

sebastianbergmann on 12 Feb 2020

@sebastianbergmann a wholly good decision IMHO. Are there any "definite" plans or ideas in place to approach this in a more evolutionary manner?

rask on 12 Feb 2020

@rask I see @localheinz has already mentioned the new event subsystem. The current TestListener implementation works just fine, however has also limited development of new functionality like the very underused extensions-API. The new event system will open the door for lots of improvements.

Myself I have quietly been cooking up a rebuild of the 'test execution reordering' functionality. It will move the business logic into the proper iterators inside the runner and remove the blackbox-like implementation. Benefits: the upcoming improvements for @dataProvider and @depends will use all that same new plumbing and wiring. Faster, cleaner and a serious drop in memory usage.

These large projects take a lot of time as the current implementation is a bit of a beast and the one-ping-only limit isn't helping either ;-)