Crystal: TCPSocket not seeing docker network?

Created on 6 Apr 2016 · 11 Comments · Source: crystal-lang/crystal

This is a tricky one to report and I'm not sure which details will help track down the root of the issue, so let me know what I can do to help.

Running the following code against the latest Docker for Mac beta (which may in itself be part of the problem):

require "http/client"

HTTP::Client.new("docker.local", 8001, ssl: false) do |client|
  puts client.host, client.port, client.ssl?

  res = client.get("/")
  puts res.body
end

It returns the following:

> cr test.cr
docker.local
8001
false
getaddrinfo: nodename nor servname provided, or not known (Socket::Error)
[4411973602] *CallStack::unwind:Array(Pointer(Void)) +82
[4411973505] *CallStack#initialize<CallStack>:Array(Pointer(Void)) +17
[4411973464] *CallStack::new:CallStack +40
[4412147673] *Socket::Error@Exception#initialize<Socket::Error, String, Nil>:CallStack +41
[4412147604] *Socket::Error::new<String>:Socket::Error +100
[4412141451] *TCPSocket#initialize<TCPSocket, String, Int32, Float64?, Float64?>:Nil +1835
[4412139581] *TCPSocket::new<String, Int32, Float64?, Float64?>:TCPSocket +221
[4412131582] *HTTP::Client#socket<HTTP::Client>:(OpenSSL::SSL::Socket | TCPSocket+) +110
[4412131035] *HTTP::Client#exec_internal<HTTP::Client, HTTP::Request>:HTTP::Client::Response +43
[4412130962] *HTTP::Client#exec<HTTP::Client, HTTP::Request>:HTTP::Client::Response +34
[4412130445] *HTTP::Client#exec<HTTP::Client, String, String, Nil, Nil>:HTTP::Client::Response +29
[4412130407] *HTTP::Client#get<HTTP::Client, String>:HTTP::Client::Response +39
[4411933302] __crystal_main +37926
[4411943976] main +40

However, Ruby (which I believe is based on the same C socket library) performs correctly:

require 'net/http'

Net::HTTP.get 'docker.local', "/", 8001

Which prints out the request body of the API as I expect (this was run in the REPL).

If I swap the docker.local hostname for the IP the hostname resolves to, it works properly. However I would expect this not to be an issue. Anything I can give you to help chase this down?

bug stdlib

All 11 comments

If I remember correctly, based on what @waj told me (but I might be wrong), we are using libevent for network-related stuff and libevent doesn't see these addresses, but the C functions that come with mac do.

So basically that's the reason: Ruby uses the C library and we do not (but we should). But that means we would need to replicate everything libevent does. I guess this will come in the future, but not right now.
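For comparison, here is a minimal Ruby sketch of a lookup that goes through the C library's getaddrinfo, which as far as I know is the path Ruby's socket layer takes. The helper name `resolve_ips` and the host `localhost` are assumptions for illustration only; on the reporter's machine the host would be `docker.local`.

```ruby
require "socket"

# Resolve a host name through the operating system's resolver (getaddrinfo).
# `resolve_ips` is a hypothetical helper name used only for this sketch.
def resolve_ips(host, port = 80)
  # Ask only for stream (TCP) addresses and keep the unique IP strings.
  Addrinfo.getaddrinfo(host, port, nil, :STREAM).map(&:ip_address).uniq
rescue SocketError => e
  # Resolution failures surface as SocketError; report them and return nothing.
  warn "could not resolve #{host}: #{e.message}"
  []
end

puts resolve_ips("localhost").inspect
```

If this prints an address for a name that Crystal cannot connect to, the name is most likely being answered by a system service (such as mDNS) that libevent's own resolver does not consult.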

I can confirm: DNS resolution calls are blocking, and as such can potentially block the event loop for a long time, which is obviously bad. We thus rely on libevent's async DNS, which is evented but uses its own custom resolver that reimplements getaddrinfo instead of calling the C library: http://www.wangafu.net/~nickm/libevent-book/Ref9_dns.html

Problem is, it's kind of limited.

That being said, how does the Docker resolver work? Does it set anything in the /etc/resolv.conf file? Maybe initializing the DNS base manually with evdns_base_resolv_conf_parse and DNS_OPTIONS_ALL would help?

Docker for Mac beta does not set anything in the /etc/resolv.conf file. It probably routes it through the internal network stack?

https://beta.docker.com/docs/mac/experiment/#user-space-networking-vpn-hostnet

That link may require a docker login.

Hosts in the .local domain are not resolved through DNS. Instead multicast DNS (mDNS) is used, and it's implemented by Bonjour in OSX. Libevent doesn't use these services and that's why Crystal is currently unable to connect with .local hosts.

I was recently playing around with the idea of dropping libevent's name resolution and using the native services provided on each platform instead. For example, on Linux there is getaddrinfo_a, which does exactly what we need. On OSX, there are some Core Foundation APIs that give us the same name resolution services that every app uses. That requires spawning another thread to execute the CF run loop, but that shouldn't be an issue.
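As a rough illustration of the general idea (run the blocking native lookup on a separate thread so the caller is not stalled), here is a Ruby sketch. It is only an analogy, not how Crystal, getaddrinfo_a, or the Core Foundation APIs work internally; `resolve_async` is a hypothetical name and `localhost` is a placeholder host.

```ruby
require "socket"

# Start a native (getaddrinfo-based) lookup on a worker thread and return the
# thread. `resolve_async` is a hypothetical helper used only for this sketch.
def resolve_async(host, port = 80)
  Thread.new do
    begin
      Addrinfo.getaddrinfo(host, port, nil, :STREAM).map(&:ip_address).uniq
    rescue SocketError
      [] # treat a failed lookup as "no addresses"
    end
  end
end

lookup = resolve_async("localhost")

# The caller keeps doing other work and polls until the lookup finishes.
ticks = 0
ticks += 1 until lookup.join(0.01)

puts "addresses: #{lookup.value.inspect}"
```

The design point is the same as in the comment above: the platform resolver stays in charge of how names are answered, and only the waiting is moved off the main flow.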

Hit this trying to connect to postgres inside a docker network. It's not quite the same case because it's not a .local address, but the resolution fails regardless. Here is what I did to work around it:

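# Parse the connection settings from DATABASE_URL.
conninfo = PQ::ConnInfo.from_conninfo_string(ENV["DATABASE_URL"])
# Resolve the host with the system resolver via getent; awk keeps only the address column.
host = `getent hosts #{conninfo.host} | awk '{ printf $1 }'`
# Rebuild the connection info with the resolved IP in place of the host name.
conninfo = PQ::ConnInfo.new(host, conninfo.database, conninfo.user, conninfo.password, conninfo.port, conninfo.sslmode)
PG.connect(conninfo)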

Turns out shelling out was the easiest solution...

I just want to poke at the idea of re-evaluating the abstraction-and-events framework when the concurrency discussion phase starts. I'm still not too sure about libevent personally; there are many options.

I looked around to see what node/libuv and go do to avoid the problem:

Closing in favor of #2660
