Describe the bug:
During the 'Calling GetChallenge' step, cert-manager uses the 'Challenge Accept/Ready' URL rather than the authorization URL.
Cert-manager works with Let's Encrypt but fails during interop tests with Nexus-CM. One theory is that the challenge URL somehow replaces the authorization URL after the available challenge types have been retrieved.
Expected behaviour:
Continuous polling using GetChallenge with the authorization URL while the challenge is prepared, followed by a 'Calling Accept' using the challenge URL, and then waiting for the server to validate the challenge.
From RFC 8555:
7.5.1. Responding to Challenges
To prove control of the identifier and receive authorization, the
client needs to provision the required challenge response based on
the challenge type and indicate to the server that it is ready for
the challenge validation to be attempted.

The client indicates to the server that it is ready for the challenge
validation by sending an empty JSON body ("{}") carried in a POST
request to the challenge URL (not the authorization URL).
[...]
Usually, the validation process will take some time, so the client
will need to poll the authorization resource to see when it is
finalized. For challenges where the client can tell when the server
has validated the challenge (e.g., by seeing an HTTP or DNS request
from the server), the client SHOULD NOT begin polling until it has
seen the validation request from the server.

To check on the status of an authorization, the client sends a POST-
as-GET request to the authorization URL, and the server responds with
the current authorization object. In responding to poll requests
while the validation is still in progress, the server MUST return a
200 (OK) response and MAY include a Retry-After header field to
suggest a polling interval to the client.
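The polling behaviour in the quoted section can be sketched roughly as follows. This is a minimal Python illustration, not cert-manager code: `fetch` is a hypothetical callable that performs a signed POST-as-GET and returns the response headers and the parsed authorization object, and only the delta-seconds form of Retry-After is handled.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical transport: performs a signed POST-as-GET to the given URL
# and returns (response headers, parsed JSON body).
Fetch = Callable[[str], Tuple[Mapping[str, str], dict]]


def poll_authorization(
    fetch: Fetch,
    authz_url: str,
    default_delay: float = 2.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the authorization URL until its status is no longer "pending"."""
    for _ in range(max_attempts):
        headers, authz = fetch(authz_url)
        status = authz.get("status")
        if status != "pending":
            return status
        # Retry-After is optional; fall back to a default interval.
        # Assumption: the header key is spelled exactly "Retry-After".
        retry_after = headers.get("Retry-After", "")
        delay = float(retry_after) if retry_after.isdigit() else default_delay
        sleep(delay)
    raise TimeoutError("authorization still pending after %d polls" % max_attempts)
```

The `sleep` parameter is injectable only so that the loop can be exercised without real delays.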
Steps to reproduce the bug:
The bug occurs in communication with the Nexus-CM ACME interface. A publicly available instance could be set up on request.
Anything else we need to know?:
Nexus-CM ACME interface works with certbot.
HTTP Request-Response logging by the Nexus-CM ACME interface:
Request-Respone.txt
Cert-manager logs with log level 6:
cert-manager-cm.log
Any additional logs required? Then please let me know.
Is it possible to get cert-manager to log HTTP requests and target URLs in some way?
Environment details:
/kind bug
I think the error lies in the method:
pkg/controller/acmechallenges/sync.go::syncChallengeStatus()
There you correctly check whether a Challenge URL has been given, since there is no point in checking the challenge status before then.
But then you continue using the Challenge URL to verify the challenge status when, at least according to our interpretation of the ACME RFC, the authorization URL should be used to query the challenge status.
If you believe that the Challenge URL is correctly used here, then how do 'Calling GetChallenge' and 'Calling Accept' differ? To us, it looks like these two calls send the same type of request. How is the server supposed to tell whether a request is a status check or a signal that the client is ready for challenge validation to begin?
-- Challenge Accept/Ready Request (This request is sent before the resolver is set up)
{
"protected": {
"alg": "RS256",
"kid": "https://<host>:<port>/pgwy/acme/directory/account/ge-iZ53JXCToULZzfW_33g",
"nonce": "DZo6qEZQzx8-uVqfQ2u64Q",
"url": "https://<host>:<port>/pgwy/acme/directory/orders/vMBi3K53i7BDzbPTaEyxmg/authz/1fSgdxj8RZk9bATMpmH-5w/http-01"
},
"payload": "",
"signature": "N7qh-DrvDUujFIJ64sE7VDvpsuN5kOU-4a5UPp0NEmFEmK3_uL75PubeufpgPfX6PK_oYWWrA34aFbI3yp1lpd_GWBMSIaiLfPTVhRc-Xk765tm3fwwimDb5FeDGXaB5exWu2yZU-WGz2aMySdLxgMAB1qrMEIJZBkDk1840VJP6MwURnVO8RNAeCJMEzmpYWxgVmmJPxm-Phwj9IMYPY3Kls2H-7Jq1DGh_b1WDdRBWBkxBdVu5Z1zZhX3R0_UC8Sjb64A1a2nt8rAbWEYh8vrGmP47LlnRIGnH-eqWzua7s54RjVCmB6jWoU6HI6jlva5s8-ITriGGaisU2phlBg"
}
This isn't a challenge accept request, because the payload is empty. It should be interpreted as a request to get the status of the challenge (https://tools.ietf.org/html/rfc8555#section-6.3). CM will query this several times as it prepares to solve the challenge, and then, when it is ready, it will call again with a payload of "{}" (https://tools.ietf.org/html/rfc8555#section-7.5.1).
I am not sure if this is the cause of the problems, but it seems like it could be contributing
to it.
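To make the distinction concrete, here is a minimal Python sketch of how a server might tell the two requests apart by looking only at the JWS `payload` field. The function names are hypothetical, signature and nonce verification are omitted, and treating anything other than `{}` as "unknown" is an assumption of this sketch rather than something the RFC quotes above spell out.

```python
import base64
import json


def b64url_decode(data: str) -> bytes:
    """Decode base64url text that may lack its trailing '=' padding."""
    padding = "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(data + padding)


def classify_challenge_request(jws: dict) -> str:
    """Classify a POST to the challenge URL by its JWS payload.

    An empty payload string is a POST-as-GET (section 6.3): return the
    challenge object without starting validation.
    A payload that decodes to the empty JSON object {} is the client
    saying it is ready for validation (section 7.5.1).
    """
    payload = jws["payload"]
    if payload == "":
        return "post-as-get"
    if json.loads(b64url_decode(payload)) == {}:
        return "accept"
    return "unknown"
```

For the request logged above, `"payload": ""` classifies as `post-as-get`. The later accept call would carry `"payload": "e30"`, which is the base64url encoding of `{}`.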
Thanks for the reply @james-w
That is the cause of the problem. I missed https://tools.ietf.org/html/rfc8555#section-6.3 when looking through the RFC.
However, I find section 6.3 quite vague about whether it applies to all endpoints. Reading through section 7.5.1, it is quite clear to me that the authorization URL should be used to poll the status of the challenge objects, while there is no reference to polling the challenge URL. Polling the challenge URL might be allowed if section 6.3 means that POST-as-GET requests SHOULD/MUST be available for all endpoints. But it does not look to be the intended way to poll the challenge status.
Thoughts? @munnerz
My reading of 6.3 is that every endpoint should accept POST-as-GET requests, and
reject plain GET requests, except for directory and newNonce.
Section 7.5.1 is clear that the authorization URL should be used to poll the status. However
at that point we are polling for the status of the authorization, not the specific challenge.
Once the client has reported that it is ready to solve a challenge all it really cares about
is whether the authorization is finalized, and that is slightly different to the status of a
specific challenge.
I believe that CM does correctly poll the authorization endpoint when it is waiting for
the validation to be finalized (after setting up the solver and sending the request to
tell the server that it is ready).
However, the requests you are seeing are before that point, while CM is still gathering
the info about the challenge and setting up the solver, where it is using the challenge
endpoint. The spec doesn't seem to have anything to say on whether that is valid
in that section of the spec, and 7.1.5 doesn't talk about the endpoint.
But it does not look to be the intended way to poll the challenge status.
The spec doesn't really deal with polling the challenge status, as once you
are ready to solve the challenge you only poll the authorization status, which
implicitly includes the challenge status.
These requests aren't polling the status though, merely collecting the challenge
data so that the solver can be created, and that seems like a valid use of the
challenge endpoint.
Thanks for the help @james-w.
We have gotten it to work with jetstack cert-manager by enabling support for POST-as-GET on the Challenge URL and allowing redirects when validating the challenge token in instances where it redirects from HTTP to HTTPS as allowed in section 8.3 of the RFC.
However, the requests you are seeing are before that point, while CM is still gathering
the info about the challenge and setting up the solver, where it is using the challenge
endpoint.
I still suggest that you investigate whether this 'additional' information gathering is necessary. All information required to set up the solver should already have been provided when querying the authorization URL for available challenges, since the authorization object (section 7.1.4) already contains the challenge objects. To me it appears that the challenge URL POST-as-GET queries are unnecessary overhead.
If you disagree, it would be fun to know what critical information you are gaining there.
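As a rough illustration of that point, the sketch below picks a challenge out of an authorization object shaped like the one in section 7.1.4. The URLs and token in `example_authz` are made up for the example, and `find_challenge` is a hypothetical helper, not something taken from cert-manager.

```python
from typing import Optional

# Illustrative authorization object; all values are hypothetical.
example_authz = {
    "status": "pending",
    "identifier": {"type": "dns", "value": "www.example.org"},
    "challenges": [
        {
            "type": "http-01",
            "url": "https://acme.example/authz/1/http-01",
            "status": "pending",
            "token": "example-token",
        },
        {
            "type": "dns-01",
            "url": "https://acme.example/authz/1/dns-01",
            "status": "pending",
            "token": "example-token",
        },
    ],
}


def find_challenge(authorization: dict, challenge_type: str) -> Optional[dict]:
    """Return the first challenge of the given type, or None if absent."""
    for challenge in authorization.get("challenges", []):
        if challenge.get("type") == challenge_type:
            return challenge
    return None
```

With this, `find_challenge(example_authz, "http-01")` yields the type, URL and token without a separate request to the challenge URL.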
Anyway, it works now, you can close the ticket.
I have been looking into the RFC more deeply and found this in https://tools.ietf.org/html/rfc8555#section-8.2
The server MUST provide information about its retry state to the
client via the "error" field in the challenge and the Retry-After
HTTP header field in response to requests to the challenge resource.
The Retry-After HTTP header could only be set on a POST-as-GET call on the challenge resource.
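A client-side sketch of reading that retry state from a challenge response might look like the following. It assumes the response headers and the parsed challenge object are already available. The helper names are hypothetical, and handling both the delta-seconds and HTTP-date forms of Retry-After is a choice made for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> float:
    """Convert a Retry-After value (delta-seconds or HTTP-date) to seconds."""
    value = value.strip()
    if value.isdigit():
        return float(value)
    when = parsedate_to_datetime(value)
    if when.tzinfo is None:
        # Treat dates without a usable zone as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def challenge_retry_state(
    headers: Mapping[str, str], challenge: dict, now: Optional[datetime] = None
) -> dict:
    """Collect the "error" field and the Retry-After delay, if present."""
    raw = headers.get("Retry-After")
    return {
        "error": challenge.get("error"),
        "retry_after": retry_after_seconds(raw, now) if raw else None,
    }
```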
Anyway as you asked I am going to close this
/close
@meyskens: Closing this issue.
I still suggest that you investigate whether this 'additional' information gathering is necessary. All information required to set up the solver should already have been provided when querying the authorization URL for available challenges, since the authorization object (section 7.1.4) already contains the challenge objects. To me it appears that the challenge URL POST-as-GET queries are unnecessary overhead.
I am not sure on this point, but I think this is because we have multiple co-operating controllers that handle the challenge, and the requests are each of them fetching the info, as we only store the URL for the challenge and not the whole challenge object.
There is a bit of a refactor planned in this area, so it all may well be changing soon.