When attempting to send a request with http/client today inside of Docker, the OpenVPN Client sets a route of 0.0.0.0/0, this breaks connections to Docker DNS 127.0.0.11, when making a request to a host, I receive the error message: (ERROR) Socket::Addrinfo::Error getaddrinfo: Temporary failure in name resolution /usr/share/crystal/src/socket/tcp_socket.cr:75:15 in 'initialize', after a while I realized that this error only happened in containers where OpenVPN was running.
For those curious, an OpenVPN Client inside the Docker contain allows our developers to copy production data into local host for testing how of code works against our production databases. For security reasons our MySQL and PostgreSQL instances are not exposed to the public internet and require users to be connected to our internal VPN. This was tricky to debug until we realized production wasn't affected for some reason and discovered that OpenVPN was the culprit.
It would be helpful to my efforts if the error message included something like "The DNS Resolver x.x.x.x is not responding, check your system configuration.", it would of saved many hours. I consider myself a noob when it comes to networking and can imagine this might confuse other users too. Thoughts?
Error trace
08:48:40 Error: 500 | (ERROR) Socket::Addrinfo::Error getaddrinfo: Temporary failure in name resolution /usr/share/crystal/src/socket/tcp_socket.cr:75:15 in 'initialize'
/usr/share/crystal/src/socket/tcp_socket.cr:27:3 in 'new'
/usr/share/crystal/src/http/client.cr:755:5 in 'socket'
/usr/share/crystal/src/http/client.cr:643:19 in 'send_request'
/usr/share/crystal/src/http/client.cr:580:5 in 'exec_internal_single'
/usr/share/crystal/src/http/client.cr:567:5 in 'exec_internal'
/usr/share/crystal/src/http/client.cr:563:5 in 'exec'
/usr/share/crystal/src/http/client.cr:685:5 in 'exec'
/usr/share/crystal/src/http/client.cr:389:3 in 'post:body'
lib/CrystalProxmox/src/access/access.cr:13:5 in 'create_ticket'
lib/CrystalProxmox/src/CrystalProxmox.cr:21:22 in 'initialize'
lib/CrystalProxmox/src/CrystalProxmox.cr:13:5 in 'new'
src/models/virtual_machine.cr:159:12 in 'proxmox_connect'
src/models/virtual_machine.cr:163:23 in 'start_vm'
src/controllers/virtual_machine_controller.cr:19:41 in 'start'
config/routes.cr:84:9 in '->'
lib/amber/src/amber/router/route.cr:255:3 in 'call'
lib/amber/src/amber/router/context.cr:78:5 in 'process_request'
lib/amber/src/amber/pipes/controller.cr:8:9 in 'call'
/usr/share/crystal/src/http/server/handler.cr:26:7 in 'call_next'
src/pipes/user_authenticate.cr:15:11 in 'call'
/usr/share/crystal/src/http/server/handler.cr:26:7 in 'call_next'
lib/amber/src/amber/pipes/csrf.cr:18:11 in 'call'
/usr/share/crystal/src/http/server/handler.cr:26:7 in 'call_next'
lib/amber/src/amber/pipes/flash.cr:7:9 in 'call'
/usr/share/crystal/src/http/server/handler.cr:26:7 in 'call_next'
lib/amber/src/amber/pipes/session.cr:8:9 in 'call'
/usr/share/crystal/src/http/server/handler.cr:26:7 in 'call_next'
lib/amber/src/amber/pipes/logger.cr:15:9 in 'call'
/usr/share/crystal/src/http/server/handler.cr:26:7 in 'call_next'
lib/amber/src/amber/pipes/error.cr:9:9 in 'call'
/usr/share/crystal/src/http/server/handler.cr:26:7 in 'call_next'
lib/citrine-i18n/src/citrine-i18n/handler.cr:16:9 in 'call'
/usr/share/crystal/src/http/server/handler.cr:26:7 in 'call_next'
lib/amber/src/amber/pipes/client_ip.cr:26:9 in 'call'
/usr/share/crystal/src/http/server/handler.cr:26:7 in 'call_next'
lib/amber/src/amber/pipes/powered_by_amber.cr:6:9 in 'call'
lib/amber/src/amber/pipes/pipeline.cr:20:11 in 'call'
/usr/share/crystal/src/http/server/request_processor.cr:48:11 in 'process'
/usr/share/crystal/src/http/server/request_processor.cr:22:3 in 'process'
/usr/share/crystal/src/http/server.cr:497:5 in 'handle_client'
/usr/share/crystal/src/http/server.cr:463:13 in '->'
/usr/share/crystal/src/fiber.cr:255:3 in 'run'
/usr/share/crystal/src/fiber.cr:48:34 in '->'
???
I guess that if the message could be expanded with the host information available so far it would be an improvement. But since there could probably be a number of reasons for an unexpected name resolution I would not put probable causes in the error message. Only data.
@bcardiff Data such as connection to 127.0.0.11 timed out would be helpful to include.
I agree with that. But not with something like "check your system configuration" since it might not be a miss-configuration.
Fair and that might error be a bit too vague anyways. The more data on what went wrong the better.
So, I looked into the code and that particular exception message is just passed up from the POSIX getaddrinfo method (a C bound method) and not Crystal code. Unfortunately, it appears that's about as good as you can get since the actual return from getaddroinfo for your example would likely have been
EAI_AGAIN
The name server returned a temporary failure indication. Try
again later.
per the docs (http://man7.org/linux/man-pages/man3/getaddrinfo.3.html). This was just a long way of saying I don't think what you're asking for is going to be possible without using a different method for name resolution, which would just make things messier than shunting down to getaddrinfo.
OK with this diff https://gist.github.com/rdp/26f5cde11e8e886b71df5bd42a8730d0
running this:
require "socket"
client = TCPSocket.new("localhosttt", 1234)
results in this message:
Unhandled exception: No address found for localhosttt:1234 over TCP when attempting a DNS lookup (Socket::Addrinfo::Error)
Is that better? Accurate? :)
That’s much more clear. Can you include the IP Address of the responding DNS Server?
What happens if the DNS Server times out or can’t connect?
On Sat, Nov 16, 2019 at 1:07 AM, Roger Pack notifications@github.com wrote:
OK with this diff https://gist.github.com/rdp/26f5cde11e8e886b71df5bd42a8730d0
running this:
require "socket"
client = TCPSocket.new("localhosttt", 1234)results in this message:
Unhandled exception: No address found for localhosttt:1234 over TCP when attempting a DNS lookup (Socket::Addrinfo::Error)
Is that better? Accurate? :)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
It is attempting to lookup the IP Address when it fails :) It'll have a different message if it can't resolve it, not sure what (basically the first part of that string will change...) :)
I mean of the DNS Resolver itself: For example if Google DNS 8.8.8.8 returns NXDomain then include in the exception data that 8.8.8.8 says NXDomain.
On Sat, Nov 16, 2019 at 1:35 AM, Roger Pack notifications@github.com wrote:
It is attempting to lookup the IP Address when it fails :) It'll have a different message if it can't resolve it, not sure what (basically the first part of that string will change...) :)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
Unfortunately it's just making a call out to "C code land" so it doesn't know exactly which DNS it may end up using, it just uses whatever the kernel does underneath... :)
Unfortunately it's just making a call out to "C code land" so it doesn't know exactly which DNS it may end up using, it just uses whatever the kernel does underneath... :)
Does the kernel not give detailed responses? How does the C program dig know which server responded from the kernel?
% dig crystal-lang.org
; <<>> DiG 9.10.6 <<>> crystal-lang.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63949
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;crystal-lang.org. IN A
;; ANSWER SECTION:
crystal-lang.org. 59 IN A 13.32.238.247
crystal-lang.org. 59 IN A 13.32.238.64
crystal-lang.org. 59 IN A 13.32.238.160
crystal-lang.org. 59 IN A 13.32.238.38
;; Query time: 31 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Nov 16 13:11:00 EST 2019
;; MSG SIZE rcvd: 109
dig is probably not using getaddrinfo (that crystal uses), it's probably
hitting the DNS servers directly, itself, instead of having the kernel look
it up for it...
On Sat, Nov 16, 2019 at 11:11 AM Nathaniel Suchy notifications@github.com
wrote:
Unfortunately it's just making a call out to "C code land" so it doesn't
know exactly which DNS it may end up using, it just uses whatever the
kernel does underneath... :)Does the kernel not give detailed responses? How does the C program dig
know which server responded from the kernel?% dig crystal-lang.org
; <<>> DiG 9.10.6 <<>> crystal-lang.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63949
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;crystal-lang.org. IN A;; ANSWER SECTION:crystal-lang.org. 59 IN A 13.32.238.247crystal-lang.org. 59 IN A 13.32.238.64crystal-lang.org. 59 IN A 13.32.238.160crystal-lang.org. 59 IN A 13.32.238.38
;; Query time: 31 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Nov 16 13:11:00 EST 2019
;; MSG SIZE rcvd: 109—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/crystal-lang/crystal/issues/8262?email_source=notifications&email_token=AAADBUFP5HURZN7FX76L3WDQUAZU7A5CNFSM4I45DEBKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEHXHNA#issuecomment-554660788,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAADBUBNYMFLRQRJ4SOB72TQUAZU7ANCNFSM4I45DEBA
.
dig is probably not using getaddrinfo (that crystal uses), it's probably hitting the DNS servers directly, itself, instead of having the kernel look it up for it...
…
On Sat, Nov 16, 2019 at 11:11 AM Nathaniel Suchy @.*> wrote: Unfortunately it's just making a call out to "C code land" so it doesn't know exactly which DNS it may end up using, it just uses whatever the kernel does underneath... :) Does the kernel not give detailed responses? How does the C program dig know which server responded from the kernel? % dig crystal-lang.org ; <<>> DiG 9.10.6 <<>> crystal-lang.org ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63949 ;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 512 ;; QUESTION SECTION: ;crystal-lang.org. IN A ;; ANSWER SECTION:crystal-lang.org. 59 IN A 13.32.238.247crystal-lang.org. 59 IN A 13.32.238.64crystal-lang.org. 59 IN A 13.32.238.160crystal-lang.org. 59 IN A 13.32.238.38 ;; Query time: 31 msec ;; SERVER: 8.8.8.8#53(8.8.8.8) ;; WHEN: Sat Nov 16 13:11:00 EST 2019 ;; MSG SIZE rcvd: 109 — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#8262?email_source=notifications&email_token=AAADBUFP5HURZN7FX76L3WDQUAZU7A5CNFSM4I45DEBKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEEHXHNA#issuecomment-554660788>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAADBUBNYMFLRQRJ4SOB72TQUAZU7ANCNFSM4I45DEBA .
How does it know my system's DNS resolver is 8.8.8.8?
I don't know it probably looks it up using some command line program or config files etc? https://unix.stackexchange.com/questions/28941/what-dns-servers-am-i-using/28958 You could check its source and tell us! :) Or is there a better error message possible maybe?
I don't know it probably looks it up using some command line program or config files etc? https://unix.stackexchange.com/questions/28941/what-dns-servers-am-i-using/28958 You could check its source and tell us! :) Or is there a better error message possible maybe?
What about cases where you have more than one DNS Resolver configured? What if 8.8.8.8 doesn't respond to 8.8.4.4 is required? Config files can't tell you that.
I presume that 'command line programs' can (nmcli) but not sure how dig discovers its list...per se. Anyway I don't think there's an easy C "method call" to lookup the DNS servers, and it's so far distant from the actual c call getaddrinfo (unfortunately) that I still think it's going to be hard to actually add DNS info into the error message. At least for me... :) Let me know if the previously proposed is good enough and I could do a PR for it :)
I think it solves the problem but a larger discussion around a reimplementation is necessary. Perhaps we could borrow some functions from dig in the future?
On Sat, Nov 16, 2019 at 8:17 PM, Roger Pack notifications@github.com wrote:
I presume that 'command line programs' can (nmcli) but not sure how dig discovers its list...per se. Anyway I don't think there's an easy C "method call" to lookup the DNS servers, and it's so far distant from the actual c call getaddrinfo (unfortunately) that I still think it's going to be hard to actually add DNS info into the error message. At least for me... :) Let me know if the previously proposed is good enough and I could do a PR for it :)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
If getaddrinfo eventually has drop-in replacements at some point that themselves lookup the DNS servers then yeah, we'd have access to more info to return. refs: https://github.com/crystal-lang/crystal/issues/8376 #4236 #2660 https://stackoverflow.com/a/2157622/32453
My latest incantation is this message:
getaddrinfo: No address found for badhostname:80 over IP when attempting host address resolution. Hint: check hostname, check resolution system (including DNS). (Socket::Addrinfo::Error)
("including" since it can also includes .local domains, etc/hosts, etc...)
Any objections? If not then I'll do a PR. Thanks!
That looks great. Nice work 😎
On Wed, Nov 20, 2019 at 5:35 PM, Roger Pack notifications@github.com wrote:
If getaddrinfo eventually has drop-in replacements at some point that themselves lookup the DNS servers then yeah, we'd have access to more info to return. refs: #8376 #4236 #2660 https://stackoverflow.com/a/2157622/32453
My latest incantation is this message:
getaddrinfo: No address found for badhostname:80 over IP when attempting host address resolution. Hint: check hostname, check resolution system (including DNS). (Socket::Addrinfo::Error)
("including" since it can also includes .local domains, etc/hosts, etc...)
Any objections? If not then I'll do a PR. Thanks!
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
How does the C program
digknow which server responded from the kernel?
@rdp is correct - dig contacts the DNS servers directly. getaddrinfo uses the hosts file, and several other configurable plugins (man nsswitch.conf) to resolve DNS. We don't want to switch away from getaddrinfo, since it would break configurations which use these plugins (docker). Please keep in mind that no DNS servers may be contacted when looking up an address (/etc/hosts).
How does it know my system's DNS resolver is 8.8.8.8?
Probably by reading /etc/resolv.conf. dig doesn't need to have robust detection of which DNS server to use, since it's only a debugging program and you're expencted to provide it with dig foo @8.8.4.4 if required.
A trick you might want to know when debugging DNS is to test if getaddrinfo and dig are agreeing. You can use getent hosts google.com to quickly look up a hostname with getaddrinfo.
Yeah, this is probably not what you expect from DNS resolution on UNIX, but it is what it is unfortunately.
OK this is live in 0.33.0 (I tried to add as much as they'd let me get away with :)
Why this is throwing a name resolution failure?
@thelinuxlich Probably because play.crystal-lang.org runs within a sandbox without access to the external network? Did you try to run this code on your local machine?