Caddy: error 403 for SLD with dash

Created on 24 Apr 2018 · 11 Comments · Source: caddyserver/caddy

1. What version of Caddy are you using (`caddy -version`)?

0.10.14

2. What are you trying to do?

Curl a domain containing a dash over HTTPS, which should trigger certificate generation.

3. What is your entire Caddyfile?

https://exam-ple.com {
  tls (email) {
    max_certs 10
    key_type rsa4096
  }
  redir / {scheme}://www.{host}{uri} 301
}

4. How did you run Caddy (give the full command and describe the execution environment)?

On Ubuntu using the init.d script:
/usr/local/bin/caddy -agree=true -log=/var/log/caddy.log -conf=/etc/caddy/Caddyfile -disable-tls-sni-challenge

5. Please paste any relevant HTTP request(s) here.

curl -I https://exam-ple.com

6. What did you expect to see?

The certificate is generated and served, and the requester receives a redirect (301).

7. What did you see instead (give full error messages and/or log)?

Log entry for request that fails to generate the certificate (domain was replaced):

2018/04/24 09:34:58 [INFO] Obtaining new certificate for www.best-price.com
2018/04/24 09:34:59 [INFO][exam-ple.com] acme: Obtaining bundled SAN certificate
2018/04/24 09:34:59 [INFO][exam-ple.com] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz/tpkQwBaci8xACW50K8WQp5PARMt9vnK46mMLdGe-xqM
2018/04/24 09:34:59 [INFO][exam-ple.com] acme: Trying to solve HTTP-01
2018/04/24 09:35:05 http: TLS handshake error from 123.30.175.181:40779: [exam-ple.com] failed to get certificate: acme: Error 403 - urn:ietf:params:acme:error:unauthorized - Invalid response from http://exam-ple.com/.well-known/acme-challenge/o96GjZP6dXNGnzcV8kEs30pjaDEhBzb2aSIVRO2vSz4: "<!DOCTYPE html><html>
    <head>
        <base href="/"/>
        <title>exam-ple.com</title>                        <meta htt"

Log of response from curl:
curl: (35) error:14094438:SSL routines:ssl3_read_bytes:tlsv1 alert internal error

Domains without a dash, such as 'example.com', work as expected, unlike domains that follow the pattern 'exam-ple.com'.
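
The 403 in the log above means the validation server fetched the challenge URL and received an HTML page rather than a key authorization. As a rough sketch of how to tell the two apart, assuming the `<token>.<thumbprint>` shape described in RFC 8555 (the function name is hypothetical and the thumbprint in the usage example is a dummy value, not a real account key):

```python
import re

# Unpadded base64url alphabet, as used for ACME key thumbprints.
_B64URL = re.compile(r"[A-Za-z0-9_-]+")


def is_key_authorization(body, token):
    """Return True if body has the '<token>.<thumbprint>' shape expected for HTTP-01."""
    text = body.rstrip()  # trailing whitespace in the response body is tolerated
    prefix = token + "."
    if not text.startswith(prefix):
        return False
    thumbprint = text[len(prefix):]
    # A SHA-256 thumbprint is 32 bytes, i.e. 43 unpadded base64url characters.
    return len(thumbprint) == 43 and _B64URL.fullmatch(thumbprint) is not None
```

Fetching `http://exam-ple.com/.well-known/acme-challenge/<token>` by hand and passing the body to this check would return False for the `<!DOCTYPE html>` response quoted in the log, which suggests that some server other than Caddy answered the request.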

8. How can someone who is starting from scratch reproduce the bug as minimally as possible?

Install Caddy 0.10.14.
Configure it as in the example in (3).
Request a domain that has a dash in its second-level name and points to Caddy.
Check the log.

Source

root360-AndreasUlm

Most helpful comment

So now I finally found the reason for this behaviour and IMHO found a regression concerning caddy's log mechanism.

We had two domains that produced the same error message in caddy.log (quoted above).
The reasons were different.
Domain1 had simply run into Let's Encrypt's limits and can now obtain a valid certificate.
Domain2 has a faulty DNS configuration.

I found the real cause by downgrading caddy to version 0.10.10.
This version now logged the following:

2018/04/25 16:48:30 [INFO] Obtaining new certificate for exam-ple.com
2018/04/25 16:48:31 [INFO][exam-ple.com] acme: Obtaining bundled SAN certificate
2018/04/25 16:48:31 [INFO][exam-ple.com] AuthURL: https://acme-v01.api.letsencrypt.org/acme/authz/<auth-code>
2018/04/25 16:48:31 [INFO][exam-ple.com] acme: Trying to solve HTTP-01
2018/04/25 16:48:34 http: TLS handshake error from 178.24.29.62:33932: [exam-ple.com] failed to get certificate: acme: Error 403 - urn:acme:error:unauthorized - Invalid response from http://exam-ple.com/.well-known/acme-challenge/vGT8OnHmcAntVLH__Ibr2dnN6AuJwp1AxTHVmFtbOog: "<!DOCTYPE html>

<!--// OPEN HTML //-->
<html lang="de-DE">

        <!--// OPEN HEAD //-->
        <head>

                <!-- Manually set render engin"
Error Detail:
        Validation for www.exam-other.com:80
        Resolved to:
                <IPv4-1>
                <IPv6-1>
        Used: <IPv4-1>

        Validation for exam-ple.com:80
        Resolved to:
                <IPv4-2>
                <IPv6-1>
        Used: <IPv6-1>

The important difference in this log message is the "Error Detail" part, which pointed me to the fact that 'exam-ple.com' has both an IPv4 and an IPv6 address, and the IPv6 address points to the server of 'exam-other.com'.
As this part shows, the Let's Encrypt servers used the IPv6 address to validate the request.
Validation failed because the request was not answered by Caddy.
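
A mismatch like this can be spotted before requesting a certificate by comparing every address the domain resolves to against the addresses the Caddy host actually owns. The following is only a sketch: the function names are hypothetical, the addresses in the usage note are documentation placeholders, and the local resolver may return different records than the ones the validation server sees.

```python
import ipaddress
import socket


def resolve_all(host, port=80):
    """Return every IPv4 and IPv6 address the local resolver reports for host."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # info[4] is the socket address tuple; its first element is the IP string.
    return sorted({info[4][0] for info in infos})


def stray_addresses(resolved, expected):
    """Return the resolved addresses that do not belong to the expected server."""
    known = {ipaddress.ip_address(addr) for addr in expected}
    return [addr for addr in resolved if ipaddress.ip_address(addr) not in known]
```

For example, `stray_addresses(resolve_all("exam-ple.com"), ["192.0.2.10", "2001:db8::1"])` would list any A or AAAA record that leads somewhere other than the Caddy host. A non-empty result for the IPv6 record would match the situation in the "Error Detail" output above.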

This example also shows that Caddy 0.10.12+ (I didn't test 0.10.11) logs different messages than 0.10.10 did, leaving too little information to find the cause of an issue.

@mholt, can you fix what is IMHO a regression in the logging?

root360-AndreasUlm on 25 Apr 2018

All 11 comments

Does it happen if you take out the redir directive? If so, does it happen if you disable on-demand TLS? (Why are you using on-demand TLS anyway?)

mholt on 24 Apr 2018

Without the redir directive the behaviour is the same.
With on-demand TLS disabled, the solver runs for several minutes without success and Caddy never comes up because it fails. The log messages in this case are:

2018/04/24 16:48:16 [INFO][exam-ple.com] acme: Obtaining bundled SAN certificate
2018/04/24 16:48:16 [INFO][exam-ple.com] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz/<auth-token>
2018/04/24 16:48:16 [INFO][exam-ple.com] acme: Could not find solver for: dns-01
2018/04/24 16:48:16 [INFO][exam-ple.com] acme: Trying to solve HTTP-01

We are using on-demand TLS because our Caddy handles 1500+ known domains and several unknown, customer-controlled domains.
If we didn't use on-demand TLS, Caddy's startup would take too long.
As the test with on-demand TLS disabled shows, Caddy wouldn't even start, because the certificate generation process doesn't succeed.
Also, on-demand TLS is the only reason we are using Caddy.

root360-AndreasUlm on 24 Apr 2018

Does the log start and stop there? That seems abnormally truncated for a Caddy log.

Also, you should use the staging endpoint when testing...

mholt on 24 Apr 2018

Those log lines are everything Caddy writes about the issue.
Everything else in the log is about successfully generating certificates and loading existing certificates.

I cannot use the staging endpoint, as I'm not testing but just adding a new domain to our production Caddy.

Is there any setting to increase the number of bytes logged from Let's Encrypt's response?

root360-AndreasUlm on 24 Apr 2018

@root360-AndreasUlm, some related questions:

Is it the only domain having this issue?
Is it the only domain name with a dash?
How is the DNS configured for this domain?
If you disable TLS for that domain name (with "tls off" in that block), is the domain name reachable?
Are you 100% sure that the "dash" is a "dash" and not an "em dash"?
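
The last question can be checked mechanically. A minimal sketch, assuming the hostname is available as a Python string (the function name is hypothetical): it reports every character that is not an ASCII letter, digit, hyphen or dot, together with its Unicode name.

```python
import string
import unicodedata

# Characters allowed in a plain ASCII hostname: letters, digits, hyphen, dot.
_ALLOWED = set(string.ascii_letters + string.digits + "-.")


def suspicious_characters(hostname):
    """Return (index, char, Unicode name) for each character outside the ASCII hostname set."""
    findings = []
    for index, char in enumerate(hostname):
        if char not in _ALLOWED:
            findings.append((index, char, unicodedata.name(char, "UNKNOWN")))
    return findings
```

An empty list means the dash is a plain ASCII hyphen. A pasted em dash would show up with the name `EM DASH` at its position in the string.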

magikstm on 24 Apr 2018

It seems the domain has an A record that points to 52.29.127.0 -- is that where your Caddy instance is?

The fact that it stops at "trying to solve HTTP-01" is making me suspicious about the domain's DNS configuration.

EDIT: I just tried getting a certificate for a domain name with a hyphen (that has a proper DNS configuration) and it worked fine.

mholt on 25 Apr 2018

Incidentally, that's an Amazon IP address, and there's a _slim_ chance you were affected by this: https://doublepulsar.com/hijack-of-amazons-internet-domain-service-used-to-reroute-web-traffic-for-two-hours-unnoticed-3a6f0dda6a6f - I'm not exactly sure how, but _if_ everything else _was_ properly configured, it's possible there was some other anomaly...

mholt on 25 Apr 2018

I visited your domain in my browser, and it's definitely an Apache server that responded, indicating that your DNS or infrastructure isn't even pointing to your Caddy instance.

I will close the issue for now, unless it can be confirmed that it's a bug in Caddy. Let us know what you find out!

mholt on 25 Apr 2018

I think the change is in the upstream xenolf/lego package -- the acmev2 branch may not report that error, if the ACMEv2 endpoint itself is programmed with different error messages. This is a good question for @xenolf and perhaps should be an issue made upstream: https://github.com/xenolf/lego.

mholt on 25 Apr 2018

The error message changed from v1 to v2. The information that is shown is all we're getting from LE.