Caddy: Implement ETag

Created on 18 Jul 2015 · 11 Comments · Source: caddyserver/caddy

ETags aren't my favorite caching mechanism because all the heavy lifting is done by the server, but it might be worth looking into doing them anyway. Do we do strong ETags or weak ones? Should this be customizable from the Caddyfile? What are the default settings? How do we cache the computed ETags?

The Go standard library supports ETag and If-Modified-Since headers, but we need to do our part to take advantage of them.
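As a rough illustration of "doing our part": if the handler sets an ETag header before calling http.ServeContent, the standard library compares it against the request's If-None-Match and replies with 304 Not Modified on a match. This is only a sketch; the tag value "demo-tag", the file name, and the helper names serveWithETag and statusFor are made up for the example, and this is not Caddy's code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// serveWithETag sets an ETag and then lets http.ServeContent handle the
// conditional-request logic. The tag and the body are placeholders.
func serveWithETag(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("ETag", `"demo-tag"`)
	// A zero modtime tells ServeContent to skip Last-Modified handling,
	// so only the ETag acts as a validator here.
	http.ServeContent(w, r, "hello.txt", time.Time{}, strings.NewReader("hello"))
}

// statusFor sends a GET with the given If-None-Match value (empty means
// no validator) to serveWithETag and returns the response status code.
func statusFor(ifNoneMatch string) int {
	req := httptest.NewRequest("GET", "/hello.txt", nil)
	if ifNoneMatch != "" {
		req.Header.Set("If-None-Match", ifNoneMatch)
	}
	rec := httptest.NewRecorder()
	serveWithETag(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("without validator:", statusFor(""))
	fmt.Println("with matching validator:", statusFor(`"demo-tag"`))
}
```

With a recent Go release the first request should get the full body and the second an empty 304, so the remaining work on the server side is choosing what to put in the tag.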

All 11 comments

Let's take inspiration from nginx, whose ETags look like "554b73dc-6f0d": the first part is the last-modification timestamp and the second is the file size. You get both just by opening the file, with no computation needed, so caching them makes no sense.
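A minimal sketch of that format in Go, assuming the two parts are the modification time in Unix seconds and the size in bytes, both printed as lowercase hex. The function names nginxStyleETag and etagForFile are made up for illustration; this is not nginx's or Caddy's actual code.

```go
package main

import (
	"fmt"
	"os"
)

// nginxStyleETag formats a strong ETag as "<mtime hex>-<size hex>",
// including the surrounding double quotes required by HTTP.
func nginxStyleETag(modUnix, size int64) string {
	return fmt.Sprintf(`"%x-%x"`, modUnix, size)
}

// etagForFile builds the tag from a single os.Stat call, so no file
// content is read and nothing needs to be cached.
func etagForFile(path string) (string, error) {
	info, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	return nginxStyleETag(info.ModTime().Unix(), info.Size()), nil
}

func main() {
	// The values from the example tag in the comment above.
	fmt.Println(nginxStyleETag(0x554b73dc, 0x6f0d))

	// A tag for the current directory entry, just to exercise os.Stat.
	if tag, err := etagForFile("."); err == nil {
		fmt.Println(tag)
	}
}
```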

That's a good idea. I'm down for simple and effective. I need to look into this more, but why is the modified date and file size both used? Are both needed?

In fact, last_mod should change any time you write something, but it only has one-second resolution, and in one second you can write a few hundred megabytes, so the size helps catch changes made within the same second. Found it here: http://serverfault.com/questions/690341/algorithm-behind-nginx-etag-generation

That makes sense. I need to figure out how to best integrate this with a "cache" or "caching" directive (which one?)

As far as I remember, ETag is not sent conditionally depending on the caching headers sent by the client. It can be sent together with the _modified_ headers. The primary idea in the RFCs was to send an MD5 of the content, but that is heavy to compute, especially for bigger files. By the way, nginx generates this only for static content, and according to the docs it is just an on/off directive in the http, server, or location context.

So maybe it's a good idea to just send ETag from within the file server by default. That way, the client may use it but it doesn't have to. I'll need to investigate more if other caching related headers should be used similarly.

It's quite tricky. A client (browser) would work fine, but there may be a Squid (or some other proxy) in between, and that can cause confusion. It is worth being able to switch this on and off in the config, ideally per URI. In nginx, ETag usually comes together with something like _expires_, which adds or appends Cache-Control: max-age=86400 (1 day, for example) to the response headers. If you deploy the site once a day, you can state explicitly that your static content changes at most once a day. Then there are the _public_ and _private_ modifiers, which affect proxy behaviour: as the names suggest, a _private_ response may be cached by the browser only, while a _public_ one may also be cached by proxies.
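For illustration, the Cache-Control value described above could be attached to responses with a small wrapper around any http.Handler. This is a sketch only: cacheControlValue and withCacheControl are hypothetical helper names, not an existing Caddy directive or nginx behaviour.

```go
package main

import (
	"fmt"
	"net/http"
)

// cacheControlValue builds a header value such as "public, max-age=86400".
// public=true lets shared caches (proxies) store the response as well;
// public=false restricts caching to the end client.
func cacheControlValue(public bool, maxAgeSeconds int) string {
	scope := "private"
	if public {
		scope = "public"
	}
	return fmt.Sprintf("%s, max-age=%d", scope, maxAgeSeconds)
}

// withCacheControl sets the given Cache-Control value on every response
// before handing the request to the wrapped handler.
func withCacheControl(value string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", value)
		next.ServeHTTP(w, r)
	})
}

func main() {
	// One day, cacheable by proxies too (the "deploy once a day" case).
	value := cacheControlValue(true, 86400)
	fmt.Println(value)

	// Wrapping a static file server; the handler is built but not started.
	_ = withCacheControl(value, http.FileServer(http.Dir(".")))
}
```

Wrapping per handler is one way to get the "on/off by URI" behaviour: only the routes that are wrapped send the header.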

I would suggest a two-part ETag using the stat object and running a fast hash (such as BLAKE2) against it.

Part 1:

  • mtime - modified at
  • ctime - inode changed at (often mistaken for creation time)
  • mode - permissions
  • size - number of bytes in the file

Part 2 (optional):

Optimistic Approach: If it is suspected that the file has changed (because the values above don't match), read the file from disk and hash it. You could use a sparse hash rather than a complete hash to save disk access / CPU usage.

Pessimistic Approach: Hash the whole file every time.

Create the etag

etag := "W/\"" + part1[0:8] + "-" + part2[0:8] + "\""

Here are the docs

Unfortunately Go's os.FileInfo doesn't return ctime ... that's annoying.

I'm optimistic that Name(), Size(), Mode(), and ModTime() are good enough. Since this isn't an issue of cryptographic security (or at least I think if you've got someone on your server fudging bad files to cause cache validation errors, you've got WAY bigger problems), it's a tradeoff between being perfectly accurate and accurate in most cases (reverting a change to a file which causes a different date, and usually a different file size).

Point me at the right file and there's a strong likelihood I can get you a pull request.

Hashing files will be a no-go on installations with lots of, or some very large files: It will slow down the start considerably.

I've got the start of something for this. Hope to submit before the weekend is out. :)
