Irssi: IPv6 and IPv4 addressing order (RFC 6724) not respected

Created on 15 Dec 2014 · 35 Comments · Source: irssi/irssi

irssi seems to default to IPv4 addresses before IPv6. That's wrong: IPv6 should always be tried first, to help with the conversion to IPv6.

In fact, irssi should just respect the order provided by getaddrinfo(3). That function already returns IPv6 first if both sides have native IPv6, but IPv4 first if both sides are tunnelling. See RFC 6724 (the new version of RFC 3484), which is implemented in most libcs.

bug ipv6

All 35 comments

Hey Thiago! :-)

We currently have a setting named resolve_prefer_ipv6 which is (still) set to false as default. This setting was defined back in 2001 and I agree that it should be switched to true as default.

Hey Alex

You're missing the point. That setting shouldn't exist at all, since libc already has the setting and it's system-wide.

Example:

import socket
import pprint
pprint.pprint(socket.getaddrinfo("chat.freenode.net", 7000, socket.AF_UNSPEC, socket.SOCK_STREAM))

Output when I have a native IPv6 connection:
[(10, 1, 6, '', ('2610:150:4b0f::', 7000, 0, 0)),
(10, 1, 6, '', ('2001:778:627f::1:0:49', 7000, 0, 0)),
(10, 1, 6, '', ('2001:1af8:4700:b030::10', 7000, 0, 0)),
(10, 1, 6, '', ('2001:708:40:2001:a822:baff:fec4:2428', 7000, 0, 0)),
(10, 1, 6, '', ('2a02:2498:1:3a3:6ef0:49ff:fe44:bc07', 7000, 0, 0)),
(2, 1, 6, '', ('83.170.73.249', 7000)),
(2, 1, 6, '', ('31.13.222.109', 7000)),
(2, 1, 6, '', ('84.240.3.129', 7000)),
(2, 1, 6, '', ('82.96.64.4', 7000)),
(2, 1, 6, '', ('195.148.124.79', 7000)),
(2, 1, 6, '', ('91.217.189.42', 7000)),
(2, 1, 6, '', ('192.186.157.43', 7000)),
(2, 1, 6, '', ('193.219.128.49', 7000)),
(2, 1, 6, '', ('37.48.83.75', 7000)),
(2, 1, 6, '', ('64.32.24.176', 7000)),
(2, 1, 6, '', ('195.154.200.232', 7000)),
(2, 1, 6, '', ('185.30.166.38', 7000)),
(2, 1, 6, '', ('91.217.189.44', 7000)),
(2, 1, 6, '', ('174.143.119.91', 7000))]

Output of the same machine after connecting to a network without native IPv6:
[(2, 1, 6, '', ('174.143.119.91', 7000)),
(2, 1, 6, '', ('64.32.24.176', 7000)),
(2, 1, 6, '', ('82.96.64.4', 7000)),
(2, 1, 6, '', ('31.13.222.109', 7000)),
(2, 1, 6, '', ('91.217.189.42', 7000)),
(2, 1, 6, '', ('195.148.124.79', 7000)),
(2, 1, 6, '', ('83.170.73.249', 7000)),
(2, 1, 6, '', ('84.240.3.129', 7000)),
(2, 1, 6, '', ('185.30.166.38', 7000)),
(2, 1, 6, '', ('195.154.200.232', 7000)),
(2, 1, 6, '', ('193.219.128.49', 7000)),
(2, 1, 6, '', ('37.48.83.75', 7000)),
(2, 1, 6, '', ('192.186.157.43', 7000)),
(2, 1, 6, '', ('91.217.189.44', 7000)),
(10, 1, 6, '', ('2001:708:40:2001:a822:baff:fec4:2428', 7000, 0, 0)),
(10, 1, 6, '', ('2001:1af8:4700:b030::10', 7000, 0, 0)),
(10, 1, 6, '', ('2a02:2498:1:3a3:6ef0:49ff:fe44:bc07', 7000, 0, 0)),
(10, 1, 6, '', ('2001:778:627f::1:0:49', 7000, 0, 0)),
(10, 1, 6, '', ('2610:150:4b0f::', 7000, 0, 0))]

In fact, I didn't even quit the python instance. The sorting criteria were updated behind the scenes by glibc itself.

What's more, the resolve_prefer_ipv6 setting is broken because it doesn't _prefer_ IPv6, it uses IPv6 _only_. After setting that and connecting to a network without IPv6:

/reconnect
10:39 -!- Irssi: Looking up chat.freenode.net
10:39 -!- Irssi: Reconnecting to chat.freenode.net [2a02:2498:1:3a3:6ef0:49ff:fe44:bc07] port 7000 - use /RMRECONNS to abort
10:39 -!- Irssi: Unable to connect server chat.freenode.net port 7000 [Network is unreachable]
/reconnect
10:43 -!- Irssi: Removed reconnection to server chat.freenode.net port 7000
10:43 -!- Irssi: Looking up chat.freenode.net
10:43 -!- Irssi: Reconnecting to chat.freenode.net [2001:1af8:4700:b030::10] port 7000 - use /RMRECONNS to abort
10:43 -!- Irssi: Unable to connect server chat.freenode.net port 7000 [Network is unreachable]

Good point. Maybe it is time for us to reconsider this approach. The internet's slightly different now than it was back in 2001.

IMO we should remove resolve_prefer_ipv6 (and maybe resolve_reverse_lookup at the same time, because that's a really weird option too and removing it makes this easier). This will mean adjusting the net_* APIs, as currently they return one address at a time rather than a list (may affect plugins too?). The random ordering of DNS responses should be removed at the same time: trust getaddrinfo and don't attempt to randomise in Irssi, because that's also broken.

Some upgrade code that warns people the behaviour may now be different might be an idea too.

Indeed, while getaddrinfo(3) won't randomise, the DNS reply itself is already randomising the results.

If nscd is running, then the DNS request isn't retried until TTL expires, so you get the same order in sequential invocations, but IRC networks often have low TTLs.

[Rather a tangent from the actual issue, but I figure this may as well be written down for later.]

I was more thinking about /etc/gai.conf (see http://mirrors.bieringer.de/Linux+IPv6-HOWTO/resolver.html, at least for Linux, not sure how other OSes implement that RFC). The point is it's never correct to reorder getaddrinfo output. That there are upstream caches which may or may not reorder records is not relevant (although while I understand there are hysterical API related raisins it's odd a random dnsmasq instance running on a consumer router will reorder records while nscd doesn't).

The problem with setting resolve_prefer_ipv6 is that if you're on a laptop and travel a lot, some networks will have IPv6 and some will not. Then you end up just passing -4 for your /connect command.

I recently discovered that resolve_prefer_ipv6 will try IPv6 not just when you have no v6 address, but _even if_ IPv6 is disabled system-wide (via net.ipv6.conf.all.disable_ipv6). Despite being named "prefer", the setting seems to make irssi never try IPv4 at all.

Please remove this setting.

Right, it needs to respect the order from getaddrinfo. And it needs to try multiple servers returned from that function, not just the first one.
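
For reference, a minimal Python sketch of "try every address in the order getaddrinfo returned it". This is not irssi code (irssi is C); the names open_tcp and connect_in_resolver_order are made up for illustration, and the resolve/open_conn parameters exist only so the loop can be exercised without a network. Python's own socket.create_connection does essentially the same thing.

```python
import socket


def open_tcp(family, socktype, proto, sockaddr, timeout=10.0):
    """Open one TCP connection, closing the socket again if connect fails."""
    sock = socket.socket(family, socktype, proto)
    try:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
    except OSError:
        sock.close()
        raise
    return sock


def connect_in_resolver_order(host, port, resolve=socket.getaddrinfo,
                              open_conn=open_tcp):
    """Try each getaddrinfo result in order; never filter or reorder them."""
    last_error = None
    results = resolve(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in results:
        try:
            return open_conn(family, socktype, proto, sockaddr)
        except OSError as exc:
            # Remember the failure and fall through to the next address.
            last_error = exc
    if last_error is None:
        raise OSError(f"no addresses returned for {host!r}")
    raise last_error
```

With a loop like this, a "Network is unreachable" on the first IPv6 entry just moves on to the next entry instead of failing the whole connection attempt.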

So, when are you guys going to fix this? It's been nearly 6 years.

I understand the frustration about long open bugs. Unfortunately we don't have any "guy" who is capable of fixing the bugs. :-(

Proper networking support should be at the basis of any networking application. I understand that irssi was not properly architected when it was first written, but we're in 2020 and IPv4 addresses have been running out for over a decade. This issue makes an unpatched irssi unusable on an IPv6-enabled network.

thanks for your comment, could you please open a pull request with your patch?

#290 already exists. It's been reverted.

apparently because there were issues with it? it's been continued in #299 but was never finished. I see that you did add some comments on that pull request as well, so the question remains if you have a working patch?

#290 has been working for me for 6 years. In this time, I have not seen any issues that didn't exist before with irssi.

Irssi can't connect at all when the first result from getaddrinfo() fails to connect, but as far as I can tell that has always been the case.

what about your comment here https://github.com/irssi/irssi/pull/290#issuecomment-143254324

It's what I just said: as far as I can tell it's always been like that, even before this change. So it's not a regression.

did you evaluate if turning on resolve_prefer_ipv6 combined with the recently committed #1146 might be enough for a basic experience?

That option shouldn't exist. The order from getaddrinfo() must be obeyed.

If there is an option, the option should be a tri-state. For example, resolve_prefer of values "auto", "ipv6" and "ipv4", with the default being "auto" meaning no filtering or reordering.

Obeying the order from getaddrinfo() sounds problematic.

  1. It harms load balancing.
  2. If it's implemented before irssi learns to try each IP in order when one fails, then you will be completely unable to connect to a network if the first server happens to be down.

Some of my DNS resolvers also return cached records in the same order.

  1. Yes, you need to try each address returned by getaddrinfo, not just try the first one.

getaddrinfo implements the load balancing itself. Try it yourself: run getent ahosts chat.freenode.net multiple times in a row and you'll see the order of the entries change. I believe the order is being changed by the DNS resolver, so as you say some caching resolvers may return the same order multiple times. That's a bug in the resolvers, not in getaddrinfo, but in that case gethostbyname would also be affected.

It also obeys RFC 6724. If you have a problem with RFC 6724, please submit a new RFC to IETF with an update on why it should be done differently. I certainly disagree on the prefix length matching and I did reach out to one of the authors, but he disagreed with me. So if you want to apply any reordering, I wouldn't complain about re-sorting the entries in each block, but don't reorder the blocks themselves (if getaddrinfo determined that you should use IPv6 first, then don't override: it concluded that this machine is connected via native IPv6, not a tunnel). There may be multiple blocks, but usually two: IPv6 and IPv4.

If you decide to implement a preference setting, I recommend that you configure the ai_family member of the hint passed to getaddrinfo: set it to AF_UNSPEC for "auto", AF_INET for "ipv4" and AF_INET6 for "ipv6".
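
A rough Python sketch of that mapping, assuming the hypothetical tri-state values "auto", "ipv4" and "ipv6" suggested earlier in this thread. The function name and the resolve parameter are illustrative only; in C the family would go into the ai_family field of the hints struct.

```python
import socket

# Hypothetical tri-state values, following the "resolve_prefer" idea above.
FAMILY_BY_PREFERENCE = {
    "auto": socket.AF_UNSPEC,   # no filtering: keep whatever order libc returns
    "ipv4": socket.AF_INET,     # only ask for IPv4 results
    "ipv6": socket.AF_INET6,    # only ask for IPv6 results
}


def resolve_with_preference(host, port, prefer="auto",
                            resolve=socket.getaddrinfo):
    """Pass the preference to the resolver as the address family hint."""
    if prefer not in FAMILY_BY_PREFERENCE:
        raise ValueError(f"unknown preference: {prefer!r}")
    family = FAMILY_BY_PREFERENCE[prefer]
    # The result list is returned untouched: no re-sorting on the client side.
    return resolve(host, port, family, socket.SOCK_STREAM)
```

The point of doing it this way is that the client never reorders anything itself; it only narrows what it asks the resolver for.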

2 is a design flaw in irssi, as @mikelward says. Therefore, it's not a reason not to use getaddrinfo.

It's not a bug in the resolver, it's an unfortunate feature. :P DNS records are unordered, not guaranteed to be randomly ordered.

Yeah, prefix length matching is what concerns me. If I getent ahosts chat.freenode.net, even if my resolver returns the results in a random order, getaddrinfo() will always put the address in 2a01::/16 after the two addresses in 2001::/16, because I'm in 2600::/8 and that's a closer match. If irssi uses getaddrinfo() order, I might randomly connect to one of those two servers, but never the third server.
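
To make the prefix-matching effect concrete, here is a small Python sketch of the longest-matching-prefix comparison on its own. It is a simplification: the real RFC 6724 sort applies several other rules first, and the addresses below (a source in 2600::/8, destinations in 2001::/16 and 2a01::/16) are made-up examples, not the actual freenode servers.

```python
import ipaddress


def common_prefix_len(source, dest, source_prefix_len=64):
    """Count the leading bits shared by two IPv6 addresses.

    The count is capped at the source prefix length (assumed to be /64 here).
    """
    s = int(ipaddress.IPv6Address(source))
    d = int(ipaddress.IPv6Address(dest))
    shared = 128 - (s ^ d).bit_length()
    return min(shared, source_prefix_len)


def sort_by_prefix_match(source, dests):
    """Longest shared prefix first; ties keep the resolver's order."""
    return sorted(dests, key=lambda d: common_prefix_len(source, d),
                  reverse=True)


# Hypothetical source and destination addresses.
print(common_prefix_len("2600::1", "2001:db8::1"))  # 5
print(common_prefix_len("2600::1", "2a01::1"))      # 4
```

A one-bit difference in the shared prefix is enough to push the 2a01:: address permanently behind the 2001:: ones, whatever order the DNS reply had.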

For IPv6, because the world is divided into blocks, it actually works, except for the older "Next Level Allocations" from the late 1990s and early 2000s. Those either work or don't depending on where you are:

  • 2400::/9 (APNIC - Asia), 2600::/9 (ARIN - North America) - will select 2001::/16 ahead of the second group
  • 2800::/9 (LACNIC - Latin America), 2a00::/9 (RIPE - Europe), 2c00::/9 (AfriNIC - Africa) - will select 2001::/16 alongside the second group

So the actual side-effect is that Asia and North America select each other before the older allocation, then the older allocation, then the rest of the world. The rest of the world also selects each other ahead of the older allocation, then the older allocation alongside the rest of the world.

This can also be solved at the glibc level, by making it stop trying to match prefix length when the matching length is 8 bits or less.

Anyway, this is not an irssi problem. Every single IPv6-capable application is using getaddrinfo() to resolve and is seeing the same consequences, in all OSes.

I don't think this is an issue that we need to prioritize, and I don't think the fact that we violate this RFC is reason enough to prioritize it.

IPv6 was adopted on IRC long before it started getting any form of traction elsewhere because it had some immediate value add to the users: they could pick their own "witty" reverse DNS hostnames and shell server admins could allocate a large amount of IPv6 addresses with different funny looking hostnames to their servers. We have resolve_prefer_ipv6 because we want the user to have control over Irssi, because Irssi is largely used by technical people, and because Irssi is occasionally used on pretty bizarre systems where the resolver isn't configured correctly (although fortunately much less of an issue today than it was historically). Our defaults should be "reasonable" (hey, we are a terminal-based IRC client in 2020, hence the quotes) such that new users aren't scared away, of course. We should not alter behavior that may break user configurations unless we have a good reason to.

I also don't think we should reach out to the IETF about any of this. Far too many hours have been wasted in IETF spaces on IRC, and I also don't think it's fair to think that a small free software project has the capacity needed here. Had we been a professional free software project with paid staff, I would be fine with going down that rabbit hole.

If we are to do anything around the DNS layer in Irssi, it should be the following: migrate away from the libc DNS interfaces (the code for handling the blocking DNS calls in an async manner is a mess) and move over to a modern DNS stack, such as libunbound. The provided libc DNS interfaces, on POSIX('y) operating systems, are such a massive disappointment that it is not even worth dealing with at this point unless it gets a massive makeover.

And yes, if we move to libunbound, we should absolutely have an option to specify an alternative /etc/hosts and /etc/resolv.conf for the Irssi users who want to mess with that, but keep the default values "reasonable" such that they mimic the operating system that Irssi is running on :-)

Hope this doesn't come across as being too negative, but I do think it's reasonable for us to have the no-hat on here in terms of prioritizing this issue. If anybody else wants to dive into it, fix it, and think about the consequences of such a fix for current user configurations, then they are more than welcome :-)

The only thing that needs to be prioritised is to use IPv6 first if I am on an IPv6-native connection and IPv4 first otherwise. How you find that out, I don't care. Without user input.

For example, you can decide to assume that no one has IPv6 tunnels any more, so any system with IPv6 addresses is natively connected.

And yes, if we move to libunbound, we should absolutely have an option to specify an alternative /etc/hosts

is an alt hosts file something we miss today? See below
https://unix.stackexchange.com/questions/10438/can-i-create-a-user-specific-hosts-file-to-complement-etc-hosts

sadly, even in 2020 the only ipv6 I have is a tunnel

sadly, even in 2020 the only ipv6 I have is a tunnel

Indeed, which is why simply assuming that any IPv6 is a native one is a poor assumption. Works for me, but not for @ailin-nemui.

You can determine if it is a global address if it is 2000::/3 but not 3ffe::/16 or 2002::/16. If the source and remote addresses are fc00::/7, then IPv6 should be used ahead of IPv4, since those are not public addresses (could be used for IRC servers inside big companies).

In other words, the logic that getaddrinfo already has.
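
Purely to illustrate the prefix checks described above (getaddrinfo already does the real work), a Python sketch might look like the following. The function names are invented for this example, and it covers only the ranges named in this thread: 2000::/3 as global unicast, 3ffe::/16 (the old 6bone range) and 2002::/16 (6to4) as non-native, and fc00::/7 as unique local.

```python
import ipaddress

GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")
NOT_NATIVE = [
    ipaddress.ip_network("3ffe::/16"),  # old 6bone test range
    ipaddress.ip_network("2002::/16"),  # 6to4 tunnelling
]
UNIQUE_LOCAL = ipaddress.ip_network("fc00::/7")


def is_native_global(addr):
    """True for a 2000::/3 address outside the 6bone and 6to4 ranges."""
    ip = ipaddress.IPv6Address(addr)
    return ip in GLOBAL_UNICAST and not any(ip in net for net in NOT_NATIVE)


def should_try_ipv6_first(source, dest):
    """Rough stand-in for the decision getaddrinfo makes internally."""
    src = ipaddress.IPv6Address(source)
    dst = ipaddress.IPv6Address(dest)
    # Private deployments: both ends on unique local addresses.
    if src in UNIQUE_LOCAL and dst in UNIQUE_LOCAL:
        return True
    return is_native_global(source) and is_native_global(dest)
```

Duplicating this logic in an application is exactly what respecting the getaddrinfo order would avoid.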

I'd be against any ipv6 related improvements before the issues mentioned here are addressed properly: https://github.com/irssi/irssi/pull/1146#issuecomment-563727829

I'd be against any ipv6 related improvements before the issues mentioned here are addressed properly: #1146 (comment)

Not only are the two things not mutually exclusive, they are actually the same solution. Proper IPv6 support requires getting a list of results from getaddrinfo and iterating over that list until it's exhausted. That's actually the only thing I'm asking in this bug report.

A "Happy Eyeballs" solution might actually try more than one entry from the addrinfo list at the same time and use the first one that connects. I'm not asking for that.

I meant mostly the 5 minute reconnection timeout. I believe suggestion 1 in that comment suggests scanning through the getaddrinfo response and connecting to all of them without reconnection delays in the middle, which implies a very different flow.

We could also keep the reconnection delays and just do exponential backoff, I guess. It's very weird to me that we don't do exponential backoff.
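
A tiny Python sketch of what exponential backoff for the reconnection delay could look like. The 5-second starting delay and the doubling factor are arbitrary assumptions for illustration; the 300-second cap mirrors the 5 minute timeout mentioned above. None of this reflects existing Irssi settings.

```python
import itertools


def backoff_delays(base=5.0, factor=2.0, cap=300.0):
    """Yield reconnection delays in seconds, growing until they hit the cap."""
    delay = min(base, cap)
    while True:
        yield delay
        delay = min(delay * factor, cap)


# First eight delays with the assumed defaults.
print(list(itertools.islice(backoff_delays(), 8)))
# [5.0, 10.0, 20.0, 40.0, 80.0, 160.0, 300.0, 300.0]
```

This keeps the first few retries quick while still settling at the long delay when a server stays down.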
