Aspnetcore: Performance improvements for Client Certificate auth

Created on 18 Jul 2019  ·  14 Comments  ·  Source: dotnet/aspnetcore

  • Evaluate revocation list caching (Windows does it by default but other OSes?)
  • Caching validation results (could use the per-connection caching stuff in Kestrel)
Fixed Done area-security

All 14 comments

@Tratcher has ideas around using the connection feature.

You'll still need to validate the expiry on each request however.

Kestrel has a connection.Items feature that we use for caching in NegotiateAuth.
https://github.com/dotnet/aspnetcore/blob/18cb57aa3271e3e229ef9bdfe1e2b85500c53cb6/src/Security/Authentication/Negotiate/src/NegotiateHandler.cs#L374-L375
CertAuth wouldn't want to rely on this directly since we need cert auth to work with all servers, but we should be able to put in a basic cert auth caching abstraction service and do a simple implementation for Kestrel. Other implementations could use MemoryCache?

A bounded memory cache obviously :)
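
For readers following along, the abstraction described above might look roughly like the sketch below. It is not the code that was merged: `ICertValidationResultCache` and `BoundedCertValidationResultCache` are hypothetical names, the validation result is reduced to a `bool`, and the sketch assumes the `Microsoft.Extensions.Caching.Memory` package. It keys entries by thumbprint, bounds the cache with `SizeLimit`, and still checks the certificate's validity dates on every lookup.

```csharp
using System;
using System.Security.Cryptography.X509Certificates;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical abstraction for illustration only; not the shipped API.
public interface ICertValidationResultCache
{
    bool TryGet(X509Certificate2 certificate, out bool isValid);
    void Put(X509Certificate2 certificate, bool isValid);
}

public sealed class BoundedCertValidationResultCache : ICertValidationResultCache, IDisposable
{
    private readonly IMemoryCache _cache;
    private readonly TimeSpan _entryLifetime;

    public BoundedCertValidationResultCache(int maxEntries, TimeSpan entryLifetime)
    {
        // SizeLimit bounds the cache; each entry below declares a size of 1,
        // so at most maxEntries results are held at once.
        _cache = new MemoryCache(new MemoryCacheOptions { SizeLimit = maxEntries });
        _entryLifetime = entryLifetime;
    }

    public bool TryGet(X509Certificate2 certificate, out bool isValid)
    {
        isValid = false;

        // The validity window is re-checked on every lookup, so a cached
        // result is never served for a certificate that has since expired.
        // NotBefore and NotAfter are expressed in local time.
        var now = DateTime.Now;
        if (now < certificate.NotBefore || now > certificate.NotAfter)
        {
            _cache.Remove(certificate.Thumbprint);
            return false;
        }

        return _cache.TryGetValue(certificate.Thumbprint, out isValid);
    }

    public void Put(X509Certificate2 certificate, bool isValid)
    {
        var entryOptions = new MemoryCacheEntryOptions()
            .SetSize(1)                              // counts against SizeLimit
            .SetAbsoluteExpiration(_entryLifetime);  // relative to now

        _cache.Set(certificate.Thumbprint, isValid, entryOptions);
    }

    public void Dispose() => _cache.Dispose();
}
```

A Kestrel-specific variant could keep the same interface and store the result in the connection's `Items` dictionary instead, the way NegotiateAuth does in the link above.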

First draft is in, leaving issue open until we have perf numbers

@sebastienros sounds like the feature should be in daily builds, @blowdart mentioned in triage that he just wants a rough idea of when we might have numbers for a slide

I'm going through all my customer gRPC emails and there is one from a team who cares a lot about cert perf. Now that we're checked in I want some end-to-end perf numbers to send them so they will be happy with our cert auth in 5.0.

Not urgent but I don't see a reason to wait 😄

@davidfowl

@sebastienros Where are we on perf testing?

Good news from @sebastienros on the perf front.

In the H2 scenario, 256 connections and 16 streams, looks like roughly 5x RPS gain

29K RPS without AddCertificateCache,
156K RPS with the new caching

WOW, low hanging fruit 😄. Can you run the test for a longer time than the cache period?

Looking at the code it's a sliding expiration, so it won't be refreshed in this case.

Here are some results for 60 seconds though:

| application         | baseline | cached   | change  |
| ------------------- | -------- | -------- | ------- |
| CPU Usage (%)       |       98 |       91 |  -7.14% |
| Raw CPU Usage (%)   | 1,181.79 | 1,090.50 |  -7.73% |
| Working Set (MB)    |      975 |      706 | -27.59% |
| Build Time (ms)     |    4,501 |    4,504 |  +0.07% |
| Start Time (ms)     |      417 |      403 |  -3.36% |
| Published Size (KB) |  101,909 |  101,909 |   0.00% |
| Swap (MB)           |        0 |        0 |         |


| load                | baseline  | cached    | change   |
| ------------------- | --------- | --------- | -------- |
| CPU Usage (%)       |        24 |        71 | +195.83% |
| Raw CPU Usage (%)   |    282.71 |    847.43 | +199.76% |
| Working Set (MB)    |       880 |       898 |   +2.05% |
| Build Time (ms)     |     4,501 |     4,501 |    0.00% |
| Start Time (ms)     |       210 |       210 |    0.00% |
| Published Size (KB) |   199,793 |   199,793 |    0.00% |
| Swap (MB)           |         0 |         0 |          |
| Max RPS             |    17,149 |    89,900 | +424.24% |
| Requests            | 1,031,656 | 5,397,319 | +423.17% |
| Bad responses       |         0 |         0 |          |
| Mean latency (ms)   |       258 |        49 |  -80.91% |
| Max latency (ms)    |     1,334 |     1,114 |  -16.45% |

NB: You can appreciate the shiny new version of the comparison table that crank generates

I think there's a 2 minute absolute expiry set on the entries, so if you try running things for 3 minutes we might see some cache expiration

3 minutes, same result:

| application         | baseline | cached   | change   |
| ------------------- | -------- | -------- | -------- |
| CPU Usage (%)       |       99 |       91 |   -8.08% |
| Raw CPU Usage (%)   | 1,185.69 | 1,097.81 |   -7.41% |
| Working Set (MB)    |    1,691 |      679 |  -59.85% |
| Build Time (ms)     |    4,501 |   10,002 | +122.22% |
| Start Time (ms)     |      421 |      398 |   -5.46% |
| Published Size (KB) |  101,910 |  101,910 |    0.00% |
| Swap (MB)           |        0 |        0 |          |


| load                | baseline  | cached     | change   |
| ------------------- | --------- | ---------- | -------- |
| CPU Usage (%)       |        23 |         70 | +204.35% |
| Raw CPU Usage (%)   |    281.90 |     836.00 | +196.56% |
| Working Set (MB)    |       939 |        927 |   -1.28% |
| Build Time (ms)     |     4,501 |      7,001 |  +55.54% |
| Start Time (ms)     |       209 |        205 |   -1.91% |
| Published Size (KB) |   199,793 |    199,793 |    0.00% |
| Swap (MB)           |         0 |          0 |          |
| Max RPS             |    16,625 |     86,333 | +419.29% |
| Requests            | 2,996,105 | 15,543,249 | +418.78% |
| Bad responses       |         0 |          0 |          |
| Mean latency (ms)   |       253 |         49 |  -80.74% |
| Max latency (ms)    |     1,319 |      1,020 |  -22.66% |

New options need to be documented at https://docs.microsoft.com/en-us/aspnet/core/security/authentication/certauth?view=aspnetcore-3.1 before this can be closed.
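
For anyone landing here before the docs are updated, enabling the cache that was benchmarked above should look roughly like this. Treat it as a sketch: it assumes the `Microsoft.AspNetCore.Authentication.Certificate` package from a build that includes this feature, and the values shown are illustrative (the 2 minute expiration mirrors the absolute expiry mentioned earlier in the thread).

```csharp
using System;
using Microsoft.AspNetCore.Authentication.Certificate;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddAuthentication(CertificateAuthenticationDefaults.AuthenticationScheme)
            .AddCertificate()
            // Opt in to caching of certificate validation results.
            .AddCertificateCache(options =>
            {
                // Illustrative values: cap on cached entries and how long
                // a cached validation result is kept.
                options.CacheSize = 1024;
                options.CacheEntryExpiration = TimeSpan.FromMinutes(2);
            });
    }
}
```

Without the `AddCertificateCache` call, certificate validation runs as before with no caching, which corresponds to the baseline column in the tables above.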

Doc PR opened, so closing this now https://github.com/dotnet/AspNetCore.Docs/pull/19195
