Grpc-go: Why addr.Authority not addr.Addr? bug or feature?

Created on 27 Nov 2018 · 6 Comments · Source: grpc/grpc-go

Please answer these questions before submitting your issue.

What version of gRPC are you using?

v1.16.0

What version of Go are you using (go version)?

go version go1.11.2 linux/amd64

What operating system (Linux, Windows, …) and version?

linux

What did you do?

If possible, provide a recipe for reproducing the error.

What did you expect to see?

code works well.

What did you see instead?

WARNING: 2018/11/27 22:36:41 grpc: addrConn.createTransport failed to connect to {10.1.230.47:50051 0 helloworld.Hello <nil>}. Err :connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for 10.1.230.47, not 127.0.0.1". Reconnecting...

I found that http2_client.go uses addr.Authority for the handshake, which is 127.0.0.1:8500 (my consul address). Why doesn't it use the real service address, addr.Addr?

Most helpful comment

The certificate should (normally) be signed for the server name, because

  1. the backend IPs can change, but the server name usually doesn't change

    • this is the same reason that you want to hide the server addresses, and only expose consul address

  2. you can use the same certificate for all your backend IPs, instead of one for each backend

A name resolution system can be seen as a map from server names to IPs (like a map[string][]IP).

If I understand correctly, in your system, a consul server is responsible for a bunch of backend servers. When dialing, you connect to this consul server, get all the backend server IPs registered in the consul server, and connect to the backends.
This makes it a map from consul-server-IP to backend-server-IPs (map[consul-IP][]backend-IP).

So the consul-server-IP is essentially the name in the name resolution system, and since the certificate should be signed for the name, the certificate should be signed for the consul-IP.

(Another suggestion: instead of one consul server for one group of backend servers with the same name, the consul server can be used to manage multiple server names, so it looks more like a real name resolver (like DNS).
When you register backend servers with the consul server, they can be registered under a name (e.g. service.hxzhao527.com). The certificate can be signed for the same name (service.hxzhao527.com). On the client side, dial consul://127.0.0.1:5000/service.hxzhao527.com.)
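
The "map from server names to IPs" view can be sketched in a few lines of Go. This is only an illustration, not how consul or grpc-go store anything: the registry type, the resolve method, and the second backend address 10.1.230.48:50051 are hypothetical.

```go
package main

import "fmt"

// registry is a hypothetical in-memory stand-in for a name resolution
// system: it maps a server name to the backend addresses registered under it.
type registry map[string][]string

// resolve returns the backends for a name, or an error if none are registered.
func (r registry) resolve(name string) ([]string, error) {
	addrs, ok := r[name]
	if !ok || len(addrs) == 0 {
		return nil, fmt.Errorf("no backends registered for %q", name)
	}
	return addrs, nil
}

func main() {
	r := registry{
		// The name stays stable; the addresses behind it may change.
		"service.hxzhao527.com": {"10.1.230.47:50051", "10.1.230.48:50051"},
	}
	addrs, err := r.resolve("service.hxzhao527.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(addrs)
}
```

The client only ever handles the key of the map, which is why the certificate is normally signed for that key and not for the individual backend addresses.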

All 6 comments

Authority is the server name, not the IP (and a name could be resolved to multiple IPs).
The client-side credentials, for example TLS, need to verify that the name is covered by the certificate returned from the server.

Also, if you need the IP, you should be able to get it from net.Conn.
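
The name check described above can be reproduced with the Go standard library alone. The sketch below is not grpc-go code: it generates a throwaway self-signed certificate (the selfSignedFor helper is hypothetical) whose only subject alternative name is the IP 10.1.230.47, then asks crypto/x509 whether that certificate covers two different names.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// selfSignedFor returns a parsed self-signed certificate whose only
// subject alternative name is the given IP address.
func selfSignedFor(ip string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		IPAddresses:  []net.IP{net.ParseIP(ip)},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := selfSignedFor("10.1.230.47")
	if err != nil {
		panic(err)
	}
	// The name the client dialed is what gets checked, not the peer's IP.
	fmt.Println("10.1.230.47:", cert.VerifyHostname("10.1.230.47"))
	fmt.Println("127.0.0.1:  ", cert.VerifyHostname("127.0.0.1"))
}
```

The first check returns a nil error and the second returns a hostname mismatch error, the same kind of failure reported in the warning log.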

For the error,

certificate is valid for 10.1.230.47, not 127.0.0.1

The server certificate is for 10.1.230.47, but you are calling grpc.Dial with 127.0.0.1, so hostname verification fails.

If you cannot do grpc.Dial(10.1.230.47), you can override the server name when creating client credentials. Note this should only be used for tests, not for production.
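
As a rough illustration of what the override does, here is a standard-library sketch that uses crypto/tls directly, without gRPC. It serves a throwaway certificate for 10.1.230.47 on a loopback port and connects twice: once with the default server name (taken from the dialed address) and once with ServerName set explicitly. The helpers newServerCert and tryHandshake are hypothetical. In grpc-go the same tls.Config could be passed to credentials.NewTLS, and credentials.NewClientTLSFromFile takes a server name override argument.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// newServerCert creates a self-signed certificate valid only for the given
// IP, plus a pool that trusts it.
func newServerCert(ip string) (tls.Certificate, *x509.CertPool, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "demo"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true,
		IPAddresses:           []net.IP{net.ParseIP(ip)},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	leaf, err := x509.ParseCertificate(der)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	pool := x509.NewCertPool()
	pool.AddCert(leaf)
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, pool, nil
}

// tryHandshake serves cert on a loopback port and reports whether a client
// that expects serverName accepts it. An empty serverName means "use the
// host part of the dialed address", which here is 127.0.0.1.
func tryHandshake(cert tls.Certificate, roots *x509.CertPool, serverName string) error {
	ln, err := tls.Listen("tcp", "127.0.0.1:0", &tls.Config{Certificates: []tls.Certificate{cert}})
	if err != nil {
		return err
	}
	defer ln.Close()
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		conn.(*tls.Conn).Handshake() // result ignored; the client side reports the outcome
	}()
	conn, err := tls.Dial("tcp", ln.Addr().String(), &tls.Config{RootCAs: roots, ServerName: serverName})
	if err != nil {
		return err
	}
	conn.Close()
	return nil
}

func main() {
	cert, roots, err := newServerCert("10.1.230.47")
	if err != nil {
		panic(err)
	}
	fmt.Println("default name:", tryHandshake(cert, roots, ""))
	fmt.Println("override:    ", tryHandshake(cert, roots, "10.1.230.47"))
}
```

The override only changes which name the client checks the certificate against; it does not change where the connection goes, which is why it is a testing aid and not a fix.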

My certificate is signed for 10.1.230.47 (my own local IP), and the gRPC server runs on 10.1.230.47:50051.
Calling the server directly

conn, err := grpc.Dial("10.1.230.47:50051", opts...)
...

works well.
When integrated with consul, which runs on 127.0.0.1:5000, the RPC call fails.

conn, err := grpc.Dial("consul://pass/127.0.0.1:5000", opts...)
...

Here is my code: grpcdemo.

I'm not very familiar with consul or your setup. But an easy fix for you:

conn, err := grpc.Dial("consul://127.0.0.1:5000/10.1.230.47:50051", opts...)

And in your consul name resolver (line conf.Address = target.Endpoint), change it to

conf.Address = target.Authority

You can read more about the name syntax here: https://github.com/grpc/grpc/blob/master/doc/naming.md#name-syntax
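
To see how the two dial strings in this thread split into parts, here is a small sketch that uses net/url as an approximation of the scheme://authority/endpoint syntax. It is not grpc-go's own target parser, and splitTarget is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitTarget breaks a dial target of the form scheme://authority/endpoint
// into its three parts using the standard URL parser.
func splitTarget(target string) (scheme, authority, endpoint string, err error) {
	u, err := url.Parse(target)
	if err != nil {
		return "", "", "", err
	}
	return u.Scheme, u.Host, strings.TrimPrefix(u.Path, "/"), nil
}

func main() {
	for _, t := range []string{
		"consul://pass/127.0.0.1:5000",              // original dial string
		"consul://127.0.0.1:5000/10.1.230.47:50051", // suggested dial string
	} {
		scheme, authority, endpoint, err := splitTarget(t)
		if err != nil {
			panic(err)
		}
		fmt.Printf("scheme=%s authority=%s endpoint=%s\n", scheme, authority, endpoint)
	}
}
```

In the original string the consul address lands in the endpoint position, while in the suggested string it moves to the authority position and the endpoint becomes 10.1.230.47:50051, the name the certificate is signed for.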

I have tried conn, err := grpc.Dial("consul://127.0.0.1:5000/10.1.230.47:50051", opts...). It did work.

But I still have some questions.

  1. My certificate is signed for an IP (or several IPs); no DNS name is required. In this case, what should the Authority be?
  2. 10.1.230.47:50051 is my server address. The purpose of consul is to hide the server address. The client only needs to know which service it will call and where it can look up the server address using the service name. So I can't pass the server address, or even the server name, to grpc.Dial. The consul address is the only parameter I can provide, and I wish that were enough.

