Our Dockerized ASP.NET Core application validates client certificates using the API in the System.Security.Cryptography.X509Certificates namespace. One of the validation checks is whether the client certificate appears in a certificate revocation list (CRL). Everything works as expected and client certificates validate successfully. The CRLs are also cached in the container's file system during the first request to our API.
Per our requirements, a CRL might contain around 15K revoked certificates. To check this, we ran a set of performance tests with CRLs of different sizes (CRL size is the only variable in our tests): 2, 5000, 10000 and 15000 certificates. The results are below.
| # | Number of certificates | Time relative to the first run |
|---|---|---|
| 1 | 2 | 1 |
| 2 | 5000 | 1.1 |
| 3 | 10000 | 1.5 |
| 4 | 15000 | 4.2 |
Based on this small set of results, the time to validate against the CRL looks exponential in its size.
Are there any performance recommendations for CRL validation? Was the current CRL validation implementation in the System.Security.Cryptography.X509Certificates namespace designed to handle a large number of certificates in a CRL?
The implementation of our client certificate validation is below.
``` C#
public bool IsValid(X509Certificate2 certificate)
{
    var chain = new X509Chain
    {
        ChainPolicy = new X509ChainPolicy
        {
            RevocationMode = X509RevocationMode.Online,
            RevocationFlag = X509RevocationFlag.EntireChain,
            UrlRetrievalTimeout = TimeSpan.FromSeconds(5)
        }
    };
    if (!chain.Build(certificate))
    {
        return false;
    }
    return true;
}
```
What is the unit of your time measurement? ms?
And have you tried v3.1 rather than v2.1?
One unit of time is around 50 ms in our case: the test case for the CRL with 2 certificates executed in 50 ms (average of 3 runs).
We did not run performance tests on v3.1. Were there any performance improvements introduced in v3.0 or v3.1 for revocation lists?
> Were there any performance improvements introduced in v3.0 or v3.1?

Yes, there were extensive perf improvements across the stack in .NET Core 3.0, e.g. https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-core-3-0/. In this area specifically, https://github.com/dotnet/corefx/pull/35367 for example overhauled X509Chain processing.
> Based on this small set of results, the time to validate against the CRL looks exponential in its size.

Did 15,000 validate in 4.2ms altogether, or each cert? What is the goal for your app?
The CRL processing (both parsing the files and doing the revocation check) is performed by OpenSSL on Linux, which is what your Docker container is using, so most of the work that depends on the size of the CRL happens there.
Without peeking too much under the covers, the flow should be akin to:
```C#
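foreach (Crl crl in candidateCrls)
{
    // Only a CRL from the certificate's issuer can answer the question.
    if (!crl.Issuer.Equals(cert.Issuer))
    {
        continue;
    }
    // Linear scan over the revoked serial numbers.
    foreach (CrlEntry entry in crl.Entries)
    {
        if (entry.SerialNumber.Equals(cert.SerialNumber))
        {
            return Revoked;
        }
    }
    return OK;
}
return Unknown;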
```
The only part that depends on the size of the CRL is serial number matching, which is an O(N) operation (of an O(m) byte comparison, but m is small and unrelated to N). So, hand-wavingly, it should be linear with size. (Peeking a little bit, it looks like they might qsort it first; if so, it'd be O(N log N)).
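To make the two cost models concrete, here is a minimal sketch of both lookup strategies. It is not OpenSSL's code: the class name `SerialLookupSketch`, its methods, and the use of `BigInteger` for serial numbers are assumptions made purely for illustration.

```C#
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical illustration only; this is not how OpenSSL stores CRL entries.
public static class SerialLookupSketch
{
    // O(N) per lookup: compare the target against every revoked serial number.
    public static bool LinearContains(IReadOnlyList<BigInteger> revoked, BigInteger serial)
    {
        foreach (BigInteger entry in revoked)
        {
            if (entry == serial)
            {
                return true;
            }
        }
        return false;
    }

    // One-time O(N log N) sort of the revoked serial numbers.
    public static BigInteger[] BuildSortedIndex(IEnumerable<BigInteger> revoked)
    {
        BigInteger[] sorted = new List<BigInteger>(revoked).ToArray();
        Array.Sort(sorted);
        return sorted;
    }

    // O(log N) per lookup once the index has been sorted.
    public static bool SortedContains(BigInteger[] sortedRevoked, BigInteger serial)
    {
        return Array.BinarySearch(sortedRevoked, serial) >= 0;
    }
}
```

With either strategy, going from 10k to 15k entries should cost roughly 1.5x on the size-dependent part, not the jump seen in the table.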
If you are just doing a casual run-once test (with Stopwatch or the UNIX time command), then it seems quite plausible that you happened to hit a point where the process has to ask the OS for memory between 10k and 15k elements. If you're seeing that at steady state, then maybe something weird is happening.
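One way to separate a one-off cost from steady-state behavior is to discard a few warm-up runs and report the median of many timed runs. The helper below is a rough sketch under that assumption; `SteadyStateTimer` and `MedianMilliseconds` are hypothetical names, not part of any library.

```C#
using System;
using System.Diagnostics;

// Hypothetical helper for illustration; not a replacement for a real benchmark harness.
public static class SteadyStateTimer
{
    // Runs `action` `warmup` times untimed, then `iterations` times timed,
    // and returns the median elapsed time in milliseconds.
    public static double MedianMilliseconds(Action action, int warmup, int iterations)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        if (iterations <= 0) throw new ArgumentOutOfRangeException(nameof(iterations));

        // Warm-up runs absorb one-off costs (JIT, first CRL download, allocations).
        for (int i = 0; i < warmup; i++)
        {
            action();
        }

        var samples = new double[iterations];
        for (int i = 0; i < iterations; i++)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            samples[i] = stopwatch.Elapsed.TotalMilliseconds;
        }

        // The median is less sensitive to a single slow outlier than the mean.
        Array.Sort(samples);
        int mid = iterations / 2;
        return iterations % 2 == 1
            ? samples[mid]
            : (samples[mid - 1] + samples[mid]) / 2.0;
    }
}
```

It could be called as `SteadyStateTimer.MedianMilliseconds(() => validator.IsValid(clientCert), warmup: 10, iterations: 200)`, where `validator` and `clientCert` are placeholders for your own objects.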
I don't see any obvious CRL best practice documentation, since it's in the fine art of CA management and proprietary heuristics. As the CA you can certainly let perf influence your practices, such as either rolling to a new issuing CA periodically or just using multiple CRL distribution points (named for the year/month of issue, or rolled over every 10k certs, etc).
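As a rough sketch of the two partitioning ideas, a CA could pick the CRL distribution point URL at issuance time from either the issue month or a running issuance counter. Everything here is hypothetical: the base URL, the file naming, and the `CrlPartitionSketch` class are invented for illustration and do not reflect any particular CA product.

```C#
using System;
using System.Globalization;

// Hypothetical illustration of choosing a CRL distribution point per certificate.
public static class CrlPartitionSketch
{
    // Placeholder host; a real CA would use its own published CRL location.
    private const string BaseUrl = "http://crl.example.test/";

    // One CRL per month of issue, e.g. all certificates issued in 2020-01 share a CRL.
    public static string ByIssueMonth(DateTime issuedUtc)
    {
        string month = issuedUtc.ToString("yyyy-MM", CultureInfo.InvariantCulture);
        return BaseUrl + "issuing-" + month + ".crl";
    }

    // One CRL per block of issued certificates, rolled over every `certsPerCrl` certificates.
    public static string ByIssueCount(long issuedSequence, int certsPerCrl = 10000)
    {
        if (issuedSequence < 0) throw new ArgumentOutOfRangeException(nameof(issuedSequence));
        if (certsPerCrl <= 0) throw new ArgumentOutOfRangeException(nameof(certsPerCrl));

        long bucket = issuedSequence / certsPerCrl;
        return BaseUrl + "issuing-" + bucket.ToString(CultureInfo.InvariantCulture) + ".crl";
    }
}
```

Either scheme bounds how many entries a single CRL can accumulate, because each certificate is only ever checked against the CRL named in its own distribution point.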
> Did 15,000 validate in 4.2ms altogether, or each cert? What is the goal for your app?

> If you are just doing a casual run-once test (with Stopwatch or the UNIX time command), then it seems quite plausible that you happened to hit a point where the process has to ask the OS for memory between 10k and 15k elements. If you're seeing that at steady state, then maybe something weird is happening.

We are using Gatling (https://gatling.io/) to run performance tests of our application. For now, we chose the simplest API method, which makes one database request per API request, and ran a set of performance tests with CRLs of different sizes. Each request to this API method validates the client certificate, including a check of whether the certificate is revoked. Also, as mentioned above, the CRL is cached in the container's file system (https://stackoverflow.com/questions/55653143/is-there-a-way-to-check-and-clean-certificate-revocation-list-cache-for-asp-net).
Times in the results table are relative to the first run, which used a very small CRL (our baseline). Each performance run was 20 requests per second for 10 minutes. The time in the results table is the response time of our API as measured by Gatling.
> Yes, there were extensive perf improvements across the stack in .NET Core 3.0, e.g. https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-core-3-0/. In this area specifically, dotnet/corefx#35367 for example overhauled X509Chain processing.

This is quite important information (especially https://github.com/dotnet/corefx/pull/35367). We will migrate to v3.1 and run the tests again.
Thank you for providing this information.
We migrated to ASP.NET Core v3.1 and reran the same set of performance tests. The results are below.
| # | Number of certificates | Time relative to the first run |
|---|---|---|
| 1 | 2 | 1 |
| 2 | 5000 | 1.1 |
| 3 | 10000 | 1.4 |
| 4 | 15000 | 2.1 |
Since we see improvements, this issue can be closed.
Thank you very much for your help!