Azure-sdk-for-js: JS Test Guidelines Proposal

Created on 26 Jun 2019 · 8 comments · Source: Azure/azure-sdk-for-js

The goal of this issue is to serve as a cornerstone for developing testing guidelines for the azure-sdk-for-js repository.

_DISCLAIMER:
Some of the information here might not be 100% up to date. I am not an expert in our project by any means. Please let me know if I'm making a mistake! Nevertheless, I hope this issue can provide value and help us reach the intended goal._

The problem

This repository holds a considerable number of JavaScript packages that all share common needs. One of the most critical needs across the board is to develop clean and reliable tests. So far, the way we do tests has specific requirements, some of them being:

  • Tests should work on several versions of NodeJS, all the major operating systems, and all the major browsers.
  • Tests should be able to be executed in parallel.
  • Integration tests need a reliable way to work despite network latency or temporary unavailability of the targeted resource.
  • Unit tests need to be as fast as possible.

Although these requirements are reasonable, to achieve them we've needed to build resources from the ground up, with no prior guidelines. Currently, these tools only exist in a handful of packages, yet they've already started to grow apart in how they are written and used.

We could more or less separate these resources into the following categories:

  • Test framework:
    Regarding the main framework used for tests, how it's configured, and how the tests themselves are written with these tools.
  • Package tooling:
    Which is about our package.json, rollup configuration and environment variables.
  • Parallel execution tooling:
    Which deals with avoiding collisions on resource naming between test files and individual tests.
  • Failure prevention:
    To make sure tests can pass regardless of temporary availability issues or network latency. Currently this refers specifically to Record & Playback and the retry mechanism.

Below we elaborate.

Test framework

Regarding the main framework used for tests, how it's configured, and how the tests themselves are written with these tools.

At the moment, we're using the libraries mocha and chai. Regarding the use of these tools, we're currently dealing with at least the following issues:

  • Inconsistent use of versions of mocha.
  • Confusing use of after.
  • Inconsistent use of comparison functions.
  • Inconsistent separation of tests by target environment.

Let's go through them:

Inconsistent use of versions of mocha

We're currently mostly using mocha, but the KeyVault packages are using ts-mocha (I am personally responsible for this deviation; we're planning to switch back to mocha).

Solution

I will make a PR switching to mocha, here's the relevant issue: https://github.com/Azure/azure-sdk-for-js/issues/4193.

Inconsistent use of after

At the moment there's a slight misdirection in some packages regarding the use of Mocha's after. This hook registers a function to run once after all tests in the enclosing suite have finished, yet we're calling after within the definition of an individual test, for example:

it("is an individual test", async () => {
  // Even though randomName will not be the same in any other test,
  // invoking `after` here gives the impression that `after` is going to happen at the end of the current test.
  const randomName = randomNameGenerator()
  after(() => delete(randomName));
});

It can be easy to overlook why this works at all, so we have to keep in mind that, in each case, a unique name is used to generate the target resource, which avoids any possible collision once the tests are done. Even if this is intentional, it can be confusing.
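For contrast, here's a minimal sketch of the conventional placement, registering the hook at the suite level (createClient is a hypothetical setup helper):

describe("a suite that needs cleanup", function () {
  let client;
  before(() => {
    client = createClient(); // Hypothetical setup helper.
  });
  after(() => {
    client.close(); // Runs once, after every test in this suite.
  });
  it("is an individual test", async () => {
    // ... use `client` here.
  });
});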

2019-06-27:
Thanks to @HarshaNalluru, we have a follow-up on this matter: we're not using after() as badly as I presented it. I believe I might have used an outdated reference. See the comments below.

Solution

Since the latest update, there's not much to do here.

Inconsistent use of comparison functions

Although we're mostly using the assert library to compare values and determine whether a test passes, sometimes we're using chai. Both are perfectly reasonable tools for this goal, but it would be better going forward to limit ourselves to a single tool, to maintain similarity across all the packages and perhaps to reduce the surface of complexity for automation tools.
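For illustration, here's the same check written in both styles (response and expectedName are placeholders):

import * as assert from "assert";
import { expect } from "chai";

const expectedName = "example";
const response = { name: "example" };

// Node's built-in assert:
assert.strictEqual(response.name, expectedName);

// chai's expect interface:
expect(response.name).to.equal(expectedName);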

Solution

Should we make an issue to use either of the alternatives?

Inconsistent separation of tests by target environment

Since our tests can run in different environments, including different versions of NodeJS, operating systems and browsers, we need a clear separation of how the tests do IO operations, as well as conditional polyfills. We can do this either by separating the tests into folders specific to the target environment (say, test/node and test/browser), or by making reusable common tools that work in any environment. This could be more straightforward and more reliable everywhere if these indications were clearly stated in a centralized set of guidelines and/or in a common shared package.

Solution

I personally believe that we would benefit from having a common place for configurations for all the SDKs, especially for karma. If we could use a single configuration for every project, this configuration would force us to maintain consistency. To deploy this configuration to each package we could write a CLI tool that creates the symlink, or even creates the relevant directories and then the symlink, or even checks the current folder for possible problems! It depends on how critical this is, or not.
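For instance, here's a minimal sketch of a per-package karma.conf.js, assuming a hypothetical shared package @azure/dev-karma-config that exports a base configuration object:

// karma.conf.js in a given package.
const baseConfig = require("@azure/dev-karma-config"); // Hypothetical shared package.

module.exports = function (config) {
  config.set({
    ...baseConfig,
    // Only package-specific overrides live here:
    files: ["dist-test/index.browser.js"]
  });
};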

Package tooling

Which is about our package.json, rollup configuration and environment variables.

At the moment we're using packaging tools such as package.json scripts and rollup to generate the appropriate test files by environment. This includes:

  • Custom package.json scripts.
  • Rollup configuration for NodeJS and the browser.
  • Environment variables for authentication and others which are
    specific to each package.

In this regard, our currently tested packages don't need to vary much in how these configurations are defined. However, we have already deviated from one another, since some of the tested packages don't yet support some of the target environments (KeyVault doesn't support the browser yet, so we removed most of the browser-related configurations to avoid carrying that much code complexity preemptively).

Regarding environment variables, the tests currently use some agreed-upon environment variable names for authentication, another one (TEST_MODE) to determine whether we're in record or playback mode, and others that are specific to each package.
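For example, a test helper could consult TEST_MODE as in the following sketch (the variable names are illustrative):

const isRecordMode = process.env.TEST_MODE === "record";
const isPlaybackMode = process.env.TEST_MODE === "playback";
// Live mode (neither record nor playback) hits the real services:
const isLiveMode = !isRecordMode && !isPlaybackMode;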

Solution

This will sound repetitive: what I'm seeing here is that the configurations in this case could be placed in a shared package. This package could work as a CLI tool like create-react-app, grouping common functionality under meaningful names, to be executed either from package.json scripts or potentially for generating the needed configuration files (perhaps related to our autorest tools for the managed packages; actually, if we could use autorest for that, it would be amazing).

The idea is that, either with a pre-built package or a CLI tool available at the root level, we could execute it to initialize the repository's configuration files and/or check whether the folders have the correct names. Even fix them! Here's an example of how the usage of this tool could work:

$ rush test-environment -t @azure/keyvault-secrets
* Checking the test environment in @azure/keyvault-secrets
! Problems found:
1. Configuration files are inconsistent with the standard ones. [AUTOFIXABLE]
2. The "test" folder should exist. Found "tests" instead. [AUTOFIXABLE]
Should I fix these issues?
Enter a number for the issue you want to fix. Enter 0 to fix all of the fixable issues.
Press ENTER to continue without fixing: 0
* Fixing inconsistent configuration files.
* Fixing the test folder.
DONE!
$ rush test-environment -t @azure/keyvault-secrets
* Checking the test environment in @azure/keyvault-secrets
0 Problems found!
DONE!
$ rushx test
# ...

Parallel execution tooling

Which deals with avoiding collisions on resource naming between test files and individual tests.

At the moment we're avoiding collisions during the parallelization of tests mostly by generating random suffixes for the names of the resources we create during the tests, yet we've also been deviating from one another. The KeyVault packages take advantage of mocha's self-reference within the tests to define resource names: Mocha provides useful values in this.title and this.test.title that can be normalized and used towards this goal.

Note on postmortem analysis and mocha:
If something crashes NodeJS, Mocha can't take corrective measures, since the crash brings Mocha's process down too. afterEach and after can only go so far, and even if the exception doesn't fully kill the process, these late-execution functions need persistent references to the names of the resources they have to delete, as well as some retrying mechanism, or knowledge of whether to continue attempting to delete or not. The solution here is to make sure that the remnants are traceable, either through logs or through the names of the residual resources, and to have a meaningful retrying strategy afterwards; a sketch follows, and more on retrying in the next point.
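A minimal sketch of keeping such persistent references for cleanup; client, getResourceName and the resource methods are illustrative assumptions:

const createdResources = [];

it("creates a resource", async function () {
  const name = getResourceName(this);
  createdResources.push(name); // Recorded before creation, so it survives a failing test.
  await client.createResource(name);
});

after(async () => {
  for (const name of createdResources) {
    try {
      await client.deleteResource(name);
    } catch (e) {
      // Leave traceable remnants for postmortem analysis instead of failing the run.
      console.log(`Could not delete ${name}: ${e}`);
    }
  }
});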

Update after 2019-07-09:
Some progress has been made with issues like: https://github.com/Azure/azure-sdk-for-js/issues/4240

Solution

I personally think there's no right or wrong with these non-colliding names, but standardization here can go a long way. For example, these names could contain environment variables about the build that's running them: say, a commit hash, the ID of the build itself, or the NodeJS version or OS name. This information could be used to diagnose build problems in the future by looking at the non-deleted resources.

That said, here are some specific recommendations:

  1. Use Mocha's title properties (this.title within each describe, and this.test.title within each it function call) to specify unique names for resources, so we can trace any problem back to an individual test.
  2. Provide a utility to generate the name, which could be written and used like the snippet below.
  3. Use the current date in numeric form as part of the resource name.
  4. Allow passing some unique string as a standard environment variable to relate the resource to the build being executed. Let's say this environment variable is called BUILD_ID; then we could use it as in the example below.
// A sketch of the utility function. Note that within an `it` callback,
// `this.title` is undefined; the suite title lives in `this.test.parent.title`:
const getResourceName = (that) =>
  `${that.test.parent.title}-${that.test.title}-${Date.now()}`;
// We could use it like:
describe("SDK's fancy feature", function () {
  const BUILD_ID = process.env.BUILD_ID; // Assuming dotenv has loaded the .env file.
  it("should work", function () {
    const resourceName = `${BUILD_ID}-${getResourceName(this)}`;
    // ... the test contents.
  });
});

Failure prevention

To make sure tests can pass regardless of temporary availability issues or network latency. Currently this refers specifically to Record & Playback and the retry mechanism.

To prevent failures in unit tests and integration tests, we're currently hammering out a set of tools. Some of them aim to avoid calling real services, and thus provide predictable behavior for unit tests; the other tools target the possible unavailability of external resources, as well as the possibility of extended latency. These tools are namely:

  • Record and Playback.
  • The Retry Mechanism.

Record and Playback

Our current work towards Record and Playback goes well beyond failure prevention. It can shape our tests so that we keep a single set of tests for both unit tests and integration tests, as well as one set of tests regardless of the targeted environment. The good news is that we're currently capable of doing this thanks to the amazing work by Harsha and team; it's just not reusable across packages.

Record and Playback is deeply tied to our configuration files. To make it work properly we need:

  • Consistent environment variables.
  • Awareness of the environment variable names and values, so that we can hide them in the contents of the recorded network calls, to avoid leaking sensitive content.
  • Awareness of the network responses individual to each package, so that we can automatically hide sensitive content (such as access tokens).
  • Consistent package.json scripts.
  • Consistent folders, so that the configuration files can be reused.
  • Repeated configuration files for each project.
  • Cross-environment support already baked into the library that we're trying to use in each recording, which includes cross-environment authentication.
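As a rough sketch of how these pieces come together in a test, the record helper and its API below are assumptions for illustration, loosely inspired by the current recording tools:

// `record` stands for a hypothetical shared recording helper.
describe("secrets client", function () {
  let recorder;

  beforeEach(async function () {
    recorder = record(this); // Starts recording or playback depending on TEST_MODE.
  });

  afterEach(async function () {
    await recorder.stop(); // In record mode, this writes the recording file.
  });

  it("can create a secret", async function () {
    // Network calls made here are recorded or replayed transparently.
  });
});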

Due to the individuality of each package, we're starting to deviate considerably. For example:

  • Some packages don't yet support the browsers.
  • Some packages are using delays to compensate for network latency; others are using a retrying mechanism.
  • Some packages have inconsistent folder names, which require small changes in their configurations.
  • Since network responses are not consistent, some packages require custom replacements of their recorded contents.

Solution

These deviations will only get worse unless we rethink our approach, so that the desired behavior is provided across the board. Some ideas are:

  • A tool like create-react-app that would set the package folders and configuration files.
  • Guidelines regarding the environment variable names, as well as recommendations on when to start and stop recording.
  • Some centralized functionality that could provide an API to define content to be replaced in the generated recordings (see the sketch after this list).
  • Perhaps even an automated way to determine whether an API response has sensitive content by analyzing the API definition.
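As a minimal sketch of such a replacement API, setRecordingReplacements below is a hypothetical name:

// `setRecordingReplacements` is a hypothetical API for illustration.
const accountName = process.env.ACCOUNT_NAME || "unknown-account";
setRecordingReplacements([
  // Hide access tokens in the recorded contents:
  { pattern: /SharedAccessSignature [^"]+/g, replacement: "<sas-token>" },
  // Hide the account name taken from the environment:
  { pattern: new RegExp(accountName, "g"), replacement: "fakeaccount" }
]);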

Please remember that the intention behind this is to propose that we write "Test Guidelines".

The Retry Mechanism

Right now, only KeyVault is proposing a retry mechanism, which consists of a single function called retry and some related types. This function, however, is inspired by existing packages, such as async-retry.

The proposed function follows the recipe described by @mikeharder in #4015, and it allows some optional parameters that are based on other retry tools available on NPM. Here's an example of how to use it:

const delay = 100;
const timeout = 3000;
const increaseFactor = 2; // Shown for illustration; we're not using it at the moment.
const result = await retry(async () => true, delay, timeout, increaseFactor);
assert.strictEqual(result, true);

~You can see the file here and the tests here.~

UPDATE 2019-06-28:
@mikeharder gave me a review and provided new code to use for the retry mechanism. The discussion can be seen here and the new code is available here: code, tests.

Either this specific tool or some other tool that achieves the same should be available for each testing suite in each package, and should probably be indicated as mandatory, especially against the current alternative: delays.

At the moment it plays well with Record and Playback thanks to a simple abstraction in the recording tool which conditionally skips any delay within the retry mechanism if we're in playback mode, as follows:

// Full context: https://github.com/Azure/azure-sdk-for-js/pull/4054/files#diff-b749abf6206d0f0be1109701f725ab6bR47
// `realRetry` is the underlying retry implementation and `isPlayingBack` is a
// flag set by the recording tool; both come from the surrounding module.
export async function retry<T>(
  target: () => Promise<T>,
  delay: number = 10000,
  timeout: number = Infinity,
  increaseFactor: number
): Promise<T> {
  return realRetry(target, isPlayingBack ? 0 : delay, timeout, increaseFactor);
}
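For reference, here's a minimal sketch of what a generic promise retry loop like realRetry could look like; this is an assumption for illustration, not the actual implementation linked above:

async function realRetry<T>(
  target: () => Promise<T>,
  delay: number,
  timeout: number,
  increaseFactor: number = 1
): Promise<T> {
  const deadline = Date.now() + timeout; // With timeout = Infinity, we retry until success.
  let currentDelay = delay;
  let lastError: unknown;
  while (Date.now() <= deadline) {
    try {
      return await target();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, growing the delay if a factor was given.
      await new Promise((resolve) => setTimeout(resolve, currentDelay));
      currentDelay *= increaseFactor;
    }
  }
  throw lastError;
}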

2019-06-27:
@ShivangiReja mentioned that we're using some retry mechanism on core-amqp:

Currently we don't use exponential retry; we use the default retry attempt and default delay between retries. We use this method for Event Hubs send/receive (src). We will improve our retry mechanism for preview-2 (it will also use exponential retry).
https://github.com/Azure/azure-sdk-for-js/blob/master/sdk/core/core-amqp/src/retry.ts#L121

With this information, I think it's clear we need to move some shape of a retry mechanism for promises somewhere useful for our projects in general. It will solve this issue once and for all.

The comments on this issue provide more updates. We also have some useful retry policies on the core-http package.

Solution

We could abstract most of this code from each package into a single tool and then use this tool as part of each use case accordingly.


That's all that I have so far! Let's work together and make this the best it can be!

Client EngSys

All 8 comments

This is a really huge document. Thanks for your time, @sadasant!

Inconsistent use of after

This is new to me. At least the packages that I've worked on (servicebus and the three storage packages) do not have this problem.

Inconsistent use of comparison functions

Though not so important, I agree with this. For instance, service-bus uses chai, whereas the storage packages leverage assert for comparisons.

Record and Playback

I'll probably refactor the code next week so that the common package can be re-used in various SDKs. I'll discuss with you to get the diverse KeyVault perspective/feedback.

The Retry Mechanism

Long ago, we had created a separate checkWithTimeout() function for the servicebus tests; I'm wondering if that could also be handled as part of the refactoring towards a common retry logic.

@HarshaNalluru Thank you so much for your response!!! 🌞

Regarding after, I don't see it anymore! I'm wondering where I took that from. If I find it, I'll report back!

This after issue isn't the same as the one I mentioned, but it's the only one I've been able to find so far. Here, after is intended to close a client after each set of tests, but since the client is declared at the root, it should just be closing the client that was last assigned to it. We might need to double check to be sure 🌞

https://github.com/Azure/azure-sdk-for-js/blob/e7124416ee6be2ed721139106504928dd63310ff/sdk/servicebus/service-bus/test/invalidParameters.spec.ts#L42-L44

@ShivangiReja mentioned that we're using some retry mechanism on core-amqp:

Currently we don't use exponential retry; we use the default retry attempt and default delay between retries. We use this method for Event Hubs send/receive (src). We will improve our retry mechanism for preview-2 (it will also use exponential retry).
https://github.com/Azure/azure-sdk-for-js/blob/master/sdk/core/core-amqp/src/retry.ts#L121

With this information, I think it's clear we need to move some shape of a retry mechanism for promises somewhere useful for our projects in general. It will solve this issue once and for all.

Regarding the retry mechanism, I found out that we already have some retry policies in the core-http code.

So, instead of using extra code for our tests, we could _potentially_ use these policies when initializing the respective clients. However, these policies don't allow custom detection of what an error is. They're in fact tied to certain status codes or HTTP headers, which are specific to what's expected to fail a request in the core-http pipelines, but which don't effectively translate to a more generalized retry mechanism.

I've already brought up these ideas to David Wilson (not mentioning him to avoid the extra spam). He said he'll think about them. I might have an update soon.

Regarding the exponential retry mechanism, we also have custom retry code for core-amqp (perhaps for other libraries as well); see the retry.ts file linked above.

@danieljurek I'm adding this to the standardize tests epic. Please work with Daniel to find the right place to add this to the team wiki under the test guidelines you've been collating.

I'll close this for now. I believe further progress should be made in the READMEs of our libraries and in the recording tool we're using. Please let me know if anyone thinks otherwise.
