Amp-wp: Adding continuous performance testing with Blackfire

Created on 27 Sep 2019 · 5 Comments · Source: ampproject/amp-wp

The proposed solution for adding robust manual and continuous profiling to monitor and improve the performance of the plugin is to go with the Blackfire Profiler. This offers very robust and easy to analyse profiling as shown during a brief screenshare in our recent Plugin Sync meeting.

Any developer can run manual Blackfire tests at any time as needed. This can even be done with the free "Hack" plan, albeit with fewer features.

Integrating Blackfire directly with GitHub so that it runs profiling for every PR requires the Enterprise license. This comes at $289 / month billed yearly, so a total of $3468 per year.

How to run Blackfire continuously

There are three main ways of running Blackfire continuously:
a.) Using HTTP access and the Blackfire Player to run Blackfire on a website we have deployed in some way.
b.) Using the Blackfire PHP SDK to run performance tests against the source code.
c.) Integrating Blackfire directly into the PHPUnit tests to use asserts based on Blackfire metrics (like asserting that fewer than 10 SQL queries are run).

For a first iteration, I suggest concentrating on b.) and c.) only, as these are much more straightforward to implement and maintain, and will provide us with a large chunk of the benefits.

Once we're in a good place with our performance tests using the PHP SDK, we can discuss what site(s) to deploy, and where to deploy them, so we can run Blackfire against entire sites. This is then similar to the e2e tests, with the difference that they will profile the backend performance while controlling the frontend.

Steps needed to integrate Blackfire continuous profiling into this plugin using the PHP SDK:

  • [ ] Add a separate env to Travis for performance testing.
  • [ ] Commit an encrypted file .blackfire.travis.ini.enc to the repository that contains the encrypted Blackfire credentials (see Travis integration docs)
  • [ ] Adapt the Travis file to download and configure Blackfire in before_install, and to disable Xdebug and launch the Blackfire agent in before_script (see Travis integration docs)
  • [ ] Write one or more scenario(s) that regroup multiple profile tests and assemble them into a build (see Scenarios & Builds docs).
  • [ ] When creating the build, the 'external_id' should be the SHA1 of the pull request, and the 'external_parent_id' should be the SHA1 of the base branch of the pull request. This is needed so that we can send a notification about the build status back to GitHub (see Enabling the Update of Git Commit Statuses docs).
  • [ ] Hook up the Blackfire Build configuration to the GitHub notification channel (see Setting up the GitHub Notification Channel docs).
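The build identifiers described in the steps above could be derived from Travis environment variables roughly as follows. This is only a minimal sketch: the helper name amp_blackfire_build_options() is hypothetical, and it assumes that TRAVIS_PULL_REQUEST_SHA holds the head commit of the pull request and that the first half of TRAVIS_COMMIT_RANGE is a usable commit of the base branch. Both assumptions should be checked against a real Travis build before relying on them.

```php
<?php
/**
 * Hypothetical helper: derive Blackfire build options from Travis variables.
 *
 * @param array $env Environment variables, e.g. the result of getenv().
 * @return array Options to pass as the second argument when starting a build.
 */
function amp_blackfire_build_options( array $env ) {
    $options = [
        'title'        => 'Build from Travis',
        'trigger_name' => 'pull-request',
    ];

    // Assumption: for pull request builds this is the head commit of the PR.
    if ( ! empty( $env['TRAVIS_PULL_REQUEST_SHA'] ) ) {
        $options['external_id'] = $env['TRAVIS_PULL_REQUEST_SHA'];
    }

    // Assumption: the range looks like "<base>...<head>"; keep the base part.
    if ( ! empty( $env['TRAVIS_COMMIT_RANGE'] ) ) {
        $parts = explode( '...', $env['TRAVIS_COMMIT_RANGE'] );
        if ( '' !== $parts[0] ) {
            $options['external_parent_id'] = $parts[0];
        }
    }

    return $options;
}

// Example with made-up commit hashes.
print_r( amp_blackfire_build_options( [
    'TRAVIS_PULL_REQUEST_SHA' => 'abc123',
    'TRAVIS_COMMIT_RANGE'     => 'def456...abc123',
] ) );
```

Keeping this in a pure function means the mapping can be unit-tested without a Travis build or a Blackfire account.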

What will the scenarios look like?

Here's an example of what scenarios and builds look like. Note that this is PHP code, and can therefore be made as DRY as we want.

We'd have at least 1 build that gets triggered by pull requests, and that build should test multiple scenarios.

Note: the following code is untested.

$blackfire = new Blackfire\Client();

$build = $blackfire->startBuild( 'AMP WP Plugin', [
    'title'              => 'Build from Travis',
    'trigger_name'       => 'pull-request',
    'external_id'        => getenv( 'TRAVIS_COMMIT' ),
    'external_parent_id' => getenv( 'TRAVIS_PULL_REQUEST_SHA' ) . ':' . getenv( 'TRAVIS_BRANCH' ),
] );

// Number of samples per profile; 10 is only an illustrative value.
$samples = 10;

$config = ( new Blackfire\Profile\Configuration() )
    // We can define how many samples to profile to average out fluctuations.
    ->setSamples( $samples )
    // We can have multiple environments to store the results in.
    ->setEnv('amp-wp');

// For each scenario, we adapt the configuration object.
$scenario = $blackfire->startScenario( $build, [
    'title'    => 'Tag & Attribute Sanitizer',
    'metadata' => [
        'pull-request' => getenv( 'TRAVIS_PULL_REQUEST' ),
        'category'     => 'sanitizer',
    ],
] );
$config->setScenario( $scenario );

// In PHP, we can manually control the probe and only enable it
// for the parts of the code we want to profile.
$probe = $blackfire->createProbe( $config, false );

for ( $sample = 1; $sample <= $samples; $sample++ ) {
    // Start the actual profile run.
    $probe->enable();

    foo(); // The code we want to profile.

    // Finish the profile run.
    $probe->close();
}

// Send the results back to Blackfire.
$profile = $blackfire->endProbe( $probe );

// We need to close the scenario now to start the next.
// This returns the report, in case we want to act on it here.
$report = $blackfire->closeScenario( $scenario );

// After we went through all scenarios, we can close the build.
$blackfire->closeBuild( $build );

What about the PHPUnit integration?

Within PHPUnit, we can use Blackfire for assertions. We can assert against the dimensions of any metric. The available dimensions for metrics are the following ones:

  • count
  • wall_time
  • cpu_time
  • memory
  • peak_memory
  • network_in
  • network_out
  • io

For comparisons, the following two functions can be used as well:

  • percent() - e.g. percent(main.wall_time) < 10%
  • diff() - e.g. diff(metrics.sql.queries.count) < 2
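As far as I understand the docs, both functions compare the current profile against a baseline profile: percent() looks at the relative change and diff() at the absolute change. The plain-PHP sketch below only illustrates that arithmetic. The function names are hypothetical, and this is not how Blackfire implements the comparison.

```php
<?php
// Illustration only: relative change of a metric against a baseline, in percent.
function sketch_percent_change( $baseline, $current ) {
    if ( 0.0 === (float) $baseline ) {
        return null; // The relative change is undefined for a zero baseline.
    }
    return ( $current - $baseline ) / $baseline * 100;
}

// Illustration only: absolute change of a metric against a baseline.
function sketch_diff( $baseline, $current ) {
    return $current - $baseline;
}

// Wall time went from 200 ms to 212 ms, an increase of about 6%,
// so an assertion like "percent(main.wall_time) < 10%" would hold.
var_dump( sketch_percent_change( 200, 212 ) < 10 );

// SQL queries went from 8 to 11, a difference of 3,
// so an assertion like "diff(metrics.sql.queries.count) < 2" would fail.
var_dump( sketch_diff( 8, 11 ) < 2 );
```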

Apart from the built-in metrics, we can define our own custom metrics that we assert the dimensions against. Here's an example of how that could work:

use Blackfire\Profile\Metric;

$metric = new Metric( 'content_sanitizer.sanitize', '=AMP_Content_Sanitizer::sanitize' );

// Then we can add this custom metric to our profile's config object.
$config->defineMetric( $metric );

Now, let's see how we could use this metric in an assertion when running PHPUnit tests.

Note: the following code is untested.

use Blackfire\Bridge\PhpUnit\TestCaseTrait;
use Blackfire\Profile\Metric;

class AMP_Img_Sanitizer_Test extends WP_UnitTestCase
{
    use TestCaseTrait;

    /** @var Blackfire\Profile\Configuration */
    private $config;

    public function setUp() {
        parent::setUp();

        $this->config = new Blackfire\Profile\Configuration();

        $metric = new Metric(
            'content_sanitizer.sanitize',
            '=AMP_Content_Sanitizer::sanitize'
        );
        $this->config->defineMetric( $metric );
    }

    /**
     * @group blackfire
     * @requires extension blackfire
     */
    public function testSomething()
    {
        // First we need to define our assertions.
        $this->config
            ->assert( 'content_sanitizer.sanitize.wall_time < 200ms', 'Content sanitization time' )
            ->assert( 'content_sanitizer.sanitize.memory < 2MB', 'Content sanitization memory' )
            ->assert( 'content_sanitizer.sanitize.io < 5ms', 'Content sanitization I/O' );

        // Then we can do a profile run to see whether they hold true.
        $profile = $this->assertBlackfire( $this->config, function () {
            // Here we run the code that needs to be profiled.
        } );
    }
}

One way of using these asserts is to define performance budgets for the different subsystems and then make sure we can actually hit these budgets and enforce them.
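Here is a minimal sketch of how such budgets could be kept in one place and turned into assertion strings. The budget values and the helper name sketch_budget_assertions() are hypothetical. The resulting strings follow the same format as the assertions in the test above, so each one could be passed to the configuration's assert() method.

```php
<?php
/**
 * Hypothetical budget table: metric name => dimension => upper limit.
 * The numbers are placeholders, not measured values.
 */
$budgets = [
    'content_sanitizer.sanitize' => [
        'wall_time' => '200ms',
        'memory'    => '2MB',
    ],
];

/**
 * Turn the budget table into assertion strings such as
 * "content_sanitizer.sanitize.wall_time < 200ms".
 *
 * @param array $budgets Budget table as defined above.
 * @return string[] One assertion string per metric dimension.
 */
function sketch_budget_assertions( array $budgets ) {
    $assertions = [];
    foreach ( $budgets as $metric => $dimensions ) {
        foreach ( $dimensions as $dimension => $limit ) {
            $assertions[] = sprintf( '%s.%s < %s', $metric, $dimension, $limit );
        }
    }
    return $assertions;
}

// Print the generated assertion strings, one per line.
foreach ( sketch_budget_assertions( $budgets ) as $assertion ) {
    echo $assertion, PHP_EOL;
}
```

With a single table like this, tightening or relaxing a budget becomes a one-line change that is easy to review.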

Nice tip I gathered from the docs:
When defining custom metrics, you can also reason about the arguments that are being passed in. This is most useful if we have a place in the code where multiple code paths flow through based on differing arguments. You can define the metric to create separate nodes for the differing arguments that were passed in. This lets us verify whether we run a method for a given argument multiple times (which could then be cached) and whether there are very slow instances of doing so. Additionally, it lets us filter to only take said method into account for specific arguments, like getting a detailed profile for all actions/filters where the first argument starts with 'amp_'.

What about HTTP access using the Blackfire Player (option a.) above)?

For this to work, we'd need to deploy a site in such a way that it is accessible via HTTP to Blackfire. This could mean a Docker container we prepare within Travis (not 100% sure on timing here), or an external host we deploy to.

Blackfire comes with a built-in integration for platform.sh, and that has a "development-only" plan at $10 / month. However, I would prefer to concentrate on options b.) and c.) above, and in parallel discuss what the "reference site(s)" for AMP should be that we want to run the tests against. Without reference sites to run the Blackfire Player tests against, it makes no sense to invest time and money into this side of the infrastructure.

Fixes #1017

Groomed Infrastructure P1 Performance Task Perf

All 5 comments

WP is known to run remote queries to, say, Jetpack or WP's own servers from time to time as part of normal requests. CPU load and network congestion may also vary and affect timings, which is why page load times can vary dramatically across WP reloads. How does Blackfire deal with these factors?

If you do a profile run that includes these (as opposed to a profile run for a pure logic function or subsystem), then Blackfire will take these into account. They will be seen under the "Network" metric, and will be part of the "Wall time". However, you can still use the "CPU" metric, for example, to see the pure processing time.

The tests that actually run assertions should only be done against deterministic subsets of the code, like testing the sanitizer on a fixed piece of content.
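To illustrate the difference between wall time and CPU time independently of Blackfire, here is a small plain-PHP sketch. The helper sketch_measure() is hypothetical, and usleep() merely stands in for waiting on a remote request: the wall clock keeps running while almost no CPU time is consumed.

```php
<?php
// Measure wall-clock time and user CPU time (both in seconds) around a callback.
function sketch_measure( callable $callback ) {
    $usage_before = getrusage();
    $wall_before  = microtime( true );

    $callback();

    $wall_after  = microtime( true );
    $usage_after = getrusage();

    // User CPU time, combined from the seconds and microseconds fields.
    $cpu = ( $usage_after['ru_utime.tv_sec'] - $usage_before['ru_utime.tv_sec'] )
        + ( $usage_after['ru_utime.tv_usec'] - $usage_before['ru_utime.tv_usec'] ) / 1e6;

    return [
        'wall' => $wall_after - $wall_before,
        'cpu'  => $cpu,
    ];
}

// Simulate a request that mostly waits on the network.
$timings = sketch_measure( function () {
    usleep( 100000 ); // 100 ms of waiting.
} );

printf( "wall: %.3fs, cpu: %.3fs\n", $timings['wall'], $timings['cpu'] );
```

This is why assertions on CPU time or on a deterministic subsystem are less sensitive to remote calls than assertions on the wall time of a full page load.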

Integrating Blackfire directly with GitHub so that it runs profiling for every PR requires the Enterprise license. This comes at $289 / month billed yearly, so a total of $3468 per year.

I wonder if we want this to run with every PR? We turned off code coverage unless a build is running on develop since Xdebug greatly slowed down build times.

As part of an analysis I've been doing on CSS usage across the most popular themes on WordPress.org, I've come up with a “monster” post that incorporates all content from posts in the unit test data, fully populates all widget areas in the active theme, and assigns nav menus to all locations. I think it would be a good candidate to use for keeping tabs on performance of the sanitizers and the post-processing phase.

Monster post in Twenty Twenty: https://github.com/westonruter/amp-wp-theme-compat-analysis/blob/master/twentytwenty-monster.html

GitHub Actions integration is available now: https://blog.blackfire.io/github-actions-support-for-blackfire.html
