I do understand that mabe it's not the purpose of this lib (feel free to close in case).
But would be useful adding a dns cache layer.
It's one of the common problem nowadays with huge number of requests. It doesn't cache dns lookups or anything. Maybe it's worth just mentioning this in Readme.
What's the cleanest way to deal with it?
Node.js has ability to get the TTL of DNS lookups: https://github.com/nodejs/node/pull/9296 So we could potentially implement a simple automatic cache based on that.
Some prior art (which all seems to require setting TTL manually):
@sindresorhus I would like to work on that
Also, don't you want to set TTL and size manually?
Or maybe for requests which are used often we can automatically increase TTL
I think the ideal implementation would be an _automatic_ TTL cache by simply respecting the TTL of the DNS record itself. If the upstream DNS server(s) are respecting TTL to spec, they will be caching for the same TTL meaning a value lower than set on the DNS record is unnecessary.

As for a size limit, I don't think it's necessary. Absolute worst case, a domain (63 characters) and IPv6 address (128 bit) pair are represented in 79 bytes. That means you can store 12,658 cached DNS entries per megabyte of memory. In practice, you'd get 25,000+ since domains are shorter. For long running processes, we _could_ implement an asynchronous tick (e.g. setInterval) to clean out expired entries every few minutes if it becomes absolutely necessary.
If we implement it similar to request caching, the user can bring their own cache if they need to manage it more closely. For instance, passing in a custom instance of keyv enabling them to clear entries as necessary.
We should however offer an option to disable it. This leaves us with two modes:
Just my $0.02.
btw can anyone assign this issue to me, because I'm already working on it (if it's necessary)
ok, so the cache is almost ready but I don't know what to do with resolving localhost type of addresses, please check implementation docs on node dns.lookup/dns.resolve*
So my plan was to use dns.resolve4/dns.resolve6, but this methods are not checking etc.hosts, and I'm worried that if someone will set custom hosts there they will not be resolved. Any ideas?
Wouldn't it make sense to simply accept a lookup function like the Node APIs do? Then you can use one of the existing DNS cache implementations like redns?
@pietermees default dns.lookup method is not supporting providers ttl which we want to support.
Also, we want to support different types of storage (i.e. keyv).
About dns.resolve(), dns.resolve*()
These functions are implemented quite differently than dns.lookup(). They do not use getaddrinfo(3) and they always perform a DNS query on the network. This network communication is always done asynchronously, and does not use libuv's threadpool.
As a result, these functions cannot have the same negative impact on other processing that happens on libuv's threadpool that dns.lookup() can have.
They do not use the same set of configuration files than what dns.lookup() uses. For instance, they do not use the configuration from /etc/hosts.
For more info please refer to this doc.
And there is definitely no point to add another dependency because it's not so much code to do.
@morozRed sorry, I should have been clearer. I meant to ask why DNS caching should be part of got rather than an external thing you can plug in.
The reason why I ask is that Node APIs already allow overriding DNS lookups on TCP/HTTP/HTTPS/H2 connects by providing a custom lookup function. Existing packages like redns and other leverage that to provide DNS caching capabilities similar to what you're trying to implement here.
My main point is that adding a lookup option to got similar to the lookup option in socket.connect. That would allow you to easily plug in any of the existing DNS cache packages.
@pietermees oh ok, well the idea was to make automatic pre-initialized DNS cache, but still, you already able to do this:
await got('google.com', { lookup: CUSTOM_DNS_CACHE.lookup });
Having played around with some of the DNS cache implementations that are around, I think you'll find that you can't easily provide a default DNS cache that is fully transparant (i.e. handles DNS exactly the same way as the system would).
As you've indicated you have to use the dns.resolve API which behaves differently than getaddrinfo. This impacts /etc/hosts but also IPv6 vs IPv4 preferences etc.
With that in mind, I don't think it's possible to provide an implementation that's enabled by default and does not change semantics for all got users.
Hence my suggestion to use the lookup option (I didn't know that already worked, thanks!).
lookup functionok so @sindresorhus @brandon93s what do you guys think about it? Maybe I can update the docs here at least?
What should we do when the DNS server offers us multiple IP addresses? Should we cache them?
@szmarczak you can cache all the addresses and then use round-robin strategy. But I'm not sure now if we do need to implement custom cache if you can simply pass package or custom lookup function to got.
you can simply pass package or custom lookup function to got
Well, there isn't one which will set TTL automatically... We need to create it :)
@szmarczak this package supports provided DNS TTL. I've been working on cache for got but then I got stuck because of this discussion.
Well, it sounds good... but it isn't. It's way too bloated. Is there any way to manage the database? I'll try to make a prototype now :)
It's way too bloated.
To be fair, that's mostly because it pulls in async (to get IPv4 and IPv6 stuff in parallel) and lodash (to do input validation). If the maintainer's willing, it wouldn't be that hard to pull those out.
@sindresorhus Any ideas on how to implement cacheable-lookup into Got? Should we reuse options.cache or use options.dnsCache instead?
options.dnsCache
Most helpful comment
I think the ideal implementation would be an _automatic_ TTL cache by simply respecting the TTL of the DNS record itself. If the upstream DNS server(s) are respecting TTL to spec, they will be caching for the same TTL meaning a value lower than set on the DNS record is unnecessary.
As for a size limit, I don't think it's necessary. Absolute worst case, a domain (63 characters) and IPv6 address (128 bit) pair are represented in 79 bytes. That means you can store 12,658 cached DNS entries per megabyte of memory. In practice, you'd get 25,000+ since domains are shorter. For long running processes, we _could_ implement an asynchronous tick (e.g.
setInterval) to clean out expired entries every few minutes if it becomes absolutely necessary.If we implement it similar to request caching, the user can bring their own cache if they need to manage it more closely. For instance, passing in a custom instance of
keyvenabling them to clear entries as necessary.We should however offer an option to disable it. This leaves us with two modes:
Just my $0.02.