Etcd: 3.2/3.3 etcd server with TLS would start with error "tls: bad certificate"

Created on 7 Mar 2018 · 21 Comments · Source: etcd-io/etcd

While debugging issues (might be relevant):

I found that a single-member etcd server logs the following error on bootstrap:

2018-03-07 22:36:51.136699 I | embed: rejected connection from "127.0.0.1:35160" (error "tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage", ServerName "")
WARNING: 2018/03/07 22:36:51 Failed to dial 0.0.0.0:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.
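
When triaging logs like the ones above, it can help to pull out the remote address and the error text of each rejection. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the pattern assumes the "rejected connection" line format exactly as quoted in this issue.

```python
import re

# Pattern for the 'rejected connection' log line as quoted in this issue.
# Assumption: other etcd versions may format this line differently.
REJECTED = re.compile(
    r'rejected connection from "(?P<remote>[^"]+)" '
    r'\(error "(?P<error>.*)", ServerName "(?P<server_name>[^"]*)"\)'
)


def parse_rejection(line):
    """Return remote address, error text and ServerName, or None if no match."""
    match = REJECTED.search(line)
    if match is None:
        return None
    return match.groupdict()


def is_key_usage_error(line):
    """True when the rejection is the x509 'incompatible key usage' variant."""
    parsed = parse_rejection(line)
    return parsed is not None and "incompatible key usage" in parsed["error"]
```

Feeding journal or stdout lines through `is_key_usage_error` separates the "incompatible key usage" rejections from plain "bad certificate" rejections.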

Steps to reproduce:
https://gist.github.com/hongchaodeng/7d62f3b5d30b58c783c382d9b629b819

Note that 3.1.11 didn't have this error log.

aretls stale

hongchaodeng

👍1

Most helpful comment

I hit the same issue with etcd version 3.3.1.
The logs are shown below:
Mar 09 13:37:33 master1 etcd[4197]: rejected connection from "192.168.9.186:31833" (error "remote error: tls: bad certificate", ServerName "")

itnikita on 9 Mar 2018

👍16

All 21 comments

I can reproduce with 3.2 and 3.3. Will take a look.

gyuho on 8 Mar 2018

This comment is relevant here: https://github.com/coreos/etcd-operator/pull/1727#issuecomment-370968658

hongchaodeng on 8 Mar 2018

I have the same issue in 3.3.1

Mar 13 12:38:47 172-20-24-117 etcd[9508]: rejected connection from "127.0.0.1:55480" (error "tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage", ServerName "")
Mar 13 12:38:47 172-20-24-117 bash[9508]: WARNING: 2018/03/13 12:38:47 Failed to dial 0.0.0.0:4001: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.
Mar 13 12:38:47 172-20-24-117 bash[9508]: WARNING: 2018/03/13 12:38:47 Failed to dial 0.0.0.0:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.
Mar 13 12:38:47 172-20-24-117 etcd[9508]: rejected connection from "127.0.0.1:46640" (error "tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage", ServerName "")

workhardcc on 13 Mar 2018

Hello,
I see the same with 3.2.16:

-- The start-up result is done.
Mar 14 18:48:09 kubem01 etcd[3089]: WARNING: 2018/03/14 18:48:09 Failed to dial 10.101.0.81:2379: connection error: desc = "transport: authentication handshake failed: remote error
Mar 14 18:48:09 kubem01 etcd[3089]: WARNING: 2018/03/14 18:48:09 Failed to dial 10.101.0.81:2379: connection error: desc = "transport: authentication handshake failed: remote error
[root@kubem01 ~]#

bsctl on 14 Mar 2018

I am having the same issue after updating v3.2.11 to v3.3.3.

gintautassulskus on 17 Apr 2018

Seeing the same with v3.3.5 after debugging coreos/etcd-operator#1962.

tkellen on 21 May 2018

Resolved by adding client auth as an extended key usage in my cfssl config as recommended here (and evidently missing based on error output):
https://coreos.com/os/docs/latest/generate-self-signed-certificates.html

tkellen on 21 May 2018

@tkellen What exactly did you add to resolve this? I am using etcd version 3.2.17 and getting the error below:
Failed to dial 0.0.0.0:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.

manyagoyal12 on 28 Sep 2018

I have the same issue in 3.3.9

pengyao on 9 Oct 2018

I have the same issue in 3.3.8 on OpenShift.

uocxp on 17 Oct 2018

How to fix the issue: https://github.com/etcd-io/etcd/issues/9785#issuecomment-432438748 (add "client auth" to "server" profile in CA config and regenerate server cert).
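
The suggested fix (adding "client auth" to the "server" profile) can be sketched as a small edit to the cfssl signing config. This is only an illustration in Python, standard library only. The helper name `add_usage` is hypothetical, and the config layout is assumed to match the `signing.profiles.<name>.usages` structure of the cfssl configs posted in this thread. The server certificate still has to be regenerated with cfssl afterwards.

```python
import copy
import json


def add_usage(ca_config, profile, usage):
    """Return a copy of a cfssl signing config with `usage` added to `profile`."""
    updated = copy.deepcopy(ca_config)  # leave the caller's config untouched
    usages = updated["signing"]["profiles"][profile].setdefault("usages", [])
    if usage not in usages:
        usages.append(usage)
    return updated


if __name__ == "__main__":
    # Illustrative config, assumed to mirror a typical "server" profile.
    config = {
        "signing": {
            "default": {"expiry": "8760h"},
            "profiles": {
                "server": {
                    "expiry": "8760h",
                    "usages": ["signing", "key encipherment", "server auth"],
                }
            },
        }
    }
    print(json.dumps(add_usage(config, "server", "client auth"), indent=2))
```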

KIVagant on 24 Oct 2018

👍1

How to fix the issue: #9785 (comment) (add "client auth" to "server" profile in CA config and regenerate server cert).

Not working for me.
CA

cat csr_ROOT_CA.json 
{
 "CN": "dev",
 "key": {
    "algo": "rsa",
    "size": 4096
 },
 "names": [
 {
    "C": "RU",
    "L": "Sauronsk",
    "O": "kubernetes",
    "ST": "Mordor"
 }
 ],
 "ca": {
    "expiry": "8760h"
 }
}

Generate

cfssl gencert -initca csr_ROOT_CA.json | cfssljson -bare root_ca

Intermediate for etcd

cat csr_INTERMEDIATE_CA.json
{
 "CN": "etcd.dev",
 "key": {
    "algo": "rsa",
    "size": 4096
 },
 "names": [
 {
    "C": "RU",
    "L": "Sauronsk",
    "O": "kubernetes",
    "ST": "Mordor"
 }
 ],
 "ca": {
    "expiry": "8760h"
 }
}

Generating intermediate

cfssl gencert -initca csr_INTERMEDIATE_CA.json | cfssljson -bare intermediate_ca

Intermediate sign config

cat root_to_intermediate_ca.json
{
  "signing": {
    "default": {
      "usages": ["digital signature", "cert sign", "crl sign", "signing"],
      "expiry": "8760h",
      "ca_constraint": {"is_ca": true, "max_path_len": 0, "max_path_len_zero": true}
    }
  }
}

Sign intermediate

cfssl sign -ca ../root_ca.pem -ca-key ../root_ca-key.pem -config root_to_intermediate_ca.json intermediate_ca.csr | cfssljson -bare intermediate_ca

Etcd certificate config

cat csr_END_CA.json
{
 "CN": "etcd.kub1.cloud",
 "hosts": [
   "10.10.10.101", "etcd.kub1.cloud"
 ],
 "key": {
    "algo": "rsa",
    "size": 4096
 },
 "names": [
 {
    "C": "RU",
    "L": "Sauronsk",
    "O": "kubernetes",
    "ST": "Mordor"
 }
 ]
}

Intermediate to end config

cat intermediate_to_end.json                                                   
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "server": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth"
        ]
      },
      "client": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "client auth"
        ]
      },
      "client-server": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}

Generating etcd certificate

cfssl gencert -ca ../intermediate_ca.pem -ca-key ../intermediate_ca-key.pem -config intermediate_to_end.json -profile=client-server csr_END_CA.json | cfssljson -bare etcd

And starting etcd like this:

cat /etc/systemd/system/etcd.service 
[Unit]
Description=etcd
Documentation=https://github.com/coreos/etcd

[Service]
User=etcd
Type=notify
Restart=always
RestartSec=5s
LimitNOFILE=40000
TimeoutStartSec=0

ExecStart=/usr/local/bin/etcd --name kub1_etcd \
    --data-dir /var/lib/etcd \
    --client-cert-auth \
    --trusted-ca-file /etc/ssl/certs/etcd/ca.pem \
    --cert-file /etc/ssl/certs/etcd/etcd.pem \
    --key-file /etc/ssl/certs/etcd/etcd-key.pem \
    --peer-client-cert-auth \
    --peer-trusted-ca-file /etc/ssl/certs/etcd/ca.pem \
    --peer-cert-file /etc/ssl/certs/etcd/etcd.pem \
    --peer-key-file /etc/ssl/certs/etcd/etcd-key.pem \
    --listen-client-urls https://10.10.10.101:2379 \
    --advertise-client-urls https://10.10.10.101:2379 \
    --listen-peer-urls https://10.10.10.101:2380 \
    --initial-advertise-peer-urls https://10.10.10.101:2380 \
    --initial-cluster kub1_etcd=https://10.10.10.101:2380,kub2_etcd=https://10.10.10.104:2380 \
    --initial-cluster-token my-etcd-token \
    --initial-cluster-state new

[Install]
WantedBy=multi-user.target

Same config for second node (10.10.10.104)

The error log is still the same:

rejected connection from "10.10.10.104:43816" (error "remote error: tls: bad certificate", ServerName "")

nejtr0n on 2 Nov 2018

/assign

hexfusion on 2 Nov 2018

@nejtr0n, I don't see the "client auth" usage in your config under the "server" profile:

      "server": {
        "expiry": "8760h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth"
        ]
      },

How to fix the issue: #9785 (comment) (add "client auth" to "server" profile in CA config and regenerate server cert).
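
A quick way to see which cfssl profiles lack a given usage is to scan the signing config. The sketch below is a hypothetical helper (`missing_usages` is not part of cfssl) and assumes the `signing.profiles.<name>.usages` layout posted earlier in this thread. Which usages count as required is itself an assumption, so adjust `REQUIRED` to your setup.

```python
# Assumption: a certificate used by the etcd server should carry both usages.
REQUIRED = ("server auth", "client auth")


def missing_usages(ca_config, required=REQUIRED):
    """Map each cfssl profile name to the required usages it lacks."""
    profiles = ca_config.get("signing", {}).get("profiles", {})
    report = {}
    for name, profile in profiles.items():
        present = set(profile.get("usages", []))
        missing = [usage for usage in required if usage not in present]
        if missing:
            report[name] = missing
    return report
```

Run against the intermediate_to_end.json config above, this would flag the "server" profile as lacking "client auth" and report nothing for "client-server", so it is worth checking which profile was actually passed to `-profile` when the certificate was generated.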

KIVagant on 9 Nov 2018

👍4

I reran the Ansible scripts one more time, and etcd came up on version 3.3. I don't know how it was solved... I did not change anything.

nejtr0n on 13 Nov 2018

I didn't have this problem until I upgraded to 3.4. I think the golang upgrade is the cause but if y'all were having problems with 3.3 then I don't think my issues are the same as everyone's here.

2rs2ts on 17 Oct 2019

This is still a problem with no clarity in the documentation for 3.4.

BongoEADGC6 on 10 Dec 2019

👍6

I have the same issue in 3.4.3

mac119 on 22 Jan 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale[bot] on 21 Apr 2020

I had the same issue in v3.4.9.
I resolved it by adding clientAuth (TLS Web Client Authentication) to ETCD server certificate (used in ETCD_CERT_FILE).
I'm not sure whether, or why, a server should need the clientAuth flag in its certificate...
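
To confirm whether a certificate already carries clientAuth, one option is to inspect the text dump produced by `openssl x509 -in <cert> -noout -text`. The following is a minimal sketch that parses that text in Python. The function names are hypothetical, the sample text is illustrative rather than taken from this issue, and the parser assumes the usual OpenSSL layout in which the usages appear on the line after the "X509v3 Extended Key Usage" header.

```python
def extended_key_usages(openssl_text):
    """Extract extended key usage names from `openssl x509 -noout -text` output."""
    lines = openssl_text.splitlines()
    for index, line in enumerate(lines):
        # The usages are listed, comma-separated, on the line after the header.
        if "X509v3 Extended Key Usage" in line and index + 1 < len(lines):
            return [item.strip() for item in lines[index + 1].split(",") if item.strip()]
    return []


def has_client_auth(openssl_text):
    """True when the certificate lists TLS Web Client Authentication."""
    return "TLS Web Client Authentication" in extended_key_usages(openssl_text)


# Illustrative excerpt of a server-only certificate (not from this issue).
SAMPLE = """
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
"""

if __name__ == "__main__":
    print(extended_key_usages(SAMPLE), has_client_auth(SAMPLE))
```

A certificate for which `has_client_auth` returns False matches the situation described above, where the server certificate had to be reissued with clientAuth added.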