OkHttp: Unexpected automatic (local) caching of requests without query strings

Created on 13 Feb 2018  ·  10 Comments  ·  Source: square/okhttp

First of all, I'm not sure if this is really a bug or intentional behavior, but here's a test anyway: https://gist.github.com/apottere/93e89c7645c1b2e9f97402b10507d574

We decided to switch from Apache HC to OkHttpClient in our rest APIs to take advantage of better pooling/connection re-use. We're using the Spring Boot starter to create a RestTemplate with which we call backend REST APIs (java -> java). Everything worked great out-of-the-box, until we noticed that one of our services wasn't getting called at all.

We did some digging and realized that it's the only service that uses ETag and Last-Modified to support caching of requests with conditional gets, but it does not set any other cache-control headers. We found that our clients were no longer sending more than one request to the API because of this block of code.

The provided test shows that a request for a URL, given a previous response for that URL that returned ETag and Last-Modified headers, will hit that block of code, and one of two things will happen:

  1. If the request has no query parameters, it will return the cached response without contacting the server
  2. If the request has at least one query parameter, it will send a request to the server with a conditional get.
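
For illustration, the two outcomes above can be modeled in a few lines of plain Java. This is a sketch, not OkHttp's source: the class and method names are hypothetical, and both the 10%-of-age rule and the query-string exclusion are assumptions based on the behavior described in this issue and on the heuristic-freshness guidance in the HTTP caching RFCs.

```java
import java.net.URI;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical model of the heuristic freshness check; not OkHttp source. */
class HeuristicFreshness {
    /**
     * How long a cached response is treated as fresh when the server sent
     * Last-Modified but no explicit Cache-Control or Expires header.
     */
    static Duration heuristicLifetime(URI url, Instant served, Instant lastModified) {
        // Assumption: heuristics are skipped for URLs that carry a query string.
        if (lastModified == null || url.getQuery() != null) {
            return Duration.ZERO;
        }
        Duration age = Duration.between(lastModified, served);
        // Assumption: 10% of the document's age at the time it was served.
        return age.isNegative() ? Duration.ZERO : age.dividedBy(10);
    }

    public static void main(String[] args) {
        Instant served = Instant.parse("2018-02-13T12:00:00Z");
        Instant modified = served.minus(Duration.ofDays(10));
        // No query string: fresh for a day, so no request reaches the server.
        System.out.println(heuristicLifetime(
                URI.create("http://localhost/items"), served, modified)); // PT24H
        // Query string: zero lifetime, so the client revalidates with a conditional GET.
        System.out.println(heuristicLifetime(
                URI.create("http://localhost/items?page=1"), served, modified)); // PT0S
    }
}
```

Under these assumptions, a resource last modified ten days before it was served would be answered from the cache for a full day without any request being sent.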

With respect to the comment on the offending block of code, it could be reasonable that a browser cache pages without query parameters for some time if they've sent a Last-Modified header, but it seems completely unexpected for that to be default functionality in a REST client.

Would it be possible to make this behavior opt-in instead of opt-out?

enhancement

All 10 comments

Some options:

  • Don’t use response caching. If you don’t want your REST client to cache results, don’t enable it?
  • Change the service to explicitly set a cache-control header so that OkHttp doesn’t need to use heuristic caching. More on heuristic caching here.
  • Use a network interceptor to rewrite cache headers, for example by stripping Last-Modified or adding Cache-Control: no-cache.
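
As a rough sketch of the third option, the header rewrite itself can be expressed on a plain map of response headers. The class name and the rule for what counts as "explicit" caching information are assumptions made for illustration. In OkHttp the same logic would presumably sit inside an `Interceptor` registered with `OkHttpClient.Builder.addNetworkInterceptor` and be applied through `response.newBuilder().header(...)`.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper; the rewrite rule is an assumption, not OkHttp behavior. */
class CacheHeaderRewrite {
    /** Returns a copy of the headers (lower-case names) that rules out heuristic freshness. */
    static Map<String, String> forceValidation(Map<String, String> headers) {
        Map<String, String> out = new LinkedHashMap<>();
        headers.forEach((name, value) -> out.put(name.toLowerCase(Locale.ROOT), value));

        String cc = out.getOrDefault("cache-control", "").toLowerCase(Locale.ROOT);
        // Assumption: these directives (or Expires) mean the server was explicit.
        boolean explicit = out.containsKey("expires")
                || cc.contains("max-age") || cc.contains("no-cache") || cc.contains("no-store");
        if (!explicit) {
            // Nothing explicit: require revalidation (e.g. via ETag) on every use.
            out.put("cache-control", cc.isEmpty() ? "no-cache" : cc + ", no-cache");
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> validatorsOnly = Map.of(
                "ETag", "\"v1\"", "Last-Modified", "Tue, 13 Feb 2018 12:00:00 GMT");
        System.out.println(forceValidation(validatorsOnly).get("cache-control")); // no-cache

        Map<String, String> explicitMaxAge = Map.of("Cache-Control", "max-age=60");
        System.out.println(forceValidation(explicitMaxAge).get("cache-control")); // max-age=60
    }
}
```

Rewriting only responses that lack explicit freshness information leaves a server-specified max-age untouched, which is the distinction the discussion below turns on.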

I’m sorry our behavior is surprising. We’re expecting that most services have many clients (apps, services, browsers) and so HTTP-spec compliant caching is the best policy.

Thanks for the quick response!

  1. We do want caching enabled, we just don't want heuristic freshness - we want the request to be validated with the ETag
  2. This is what we've decided to do moving forward (max-age=0), since we control the service as well as the client
  3. I thought about this, but it seems like a less robust solution than (2)

I read through RFC 2616, 7232, and 7234 to familiarize myself, and found two things that are relevant to this issue:

  1. Servers are encouraged to send Last-Modified if they have it, regardless of the other cache headers or validators they're sending
  2. Heuristic caching is something the client MAY do, the only limit is on the length of the "freshness lifetime" if the client chooses to behave that way.

I would argue that when an HTTP client is used in a REST client, heuristic caching is always unexpected behavior, and should be disabled - especially when you don't control the API. Would it be possible to at least add an option to the Cache constructor to disable it?

It's a good argument.

My gripe is that I don't like the idea of HTTP clients having privileged information about the servers they're hitting. Servers should expect their responses to be cached according to their cache headers. Particularly because browsers don't have the knobs to disable caching.

OkHttp is a REST client, but it doesn't know that. It can also be used to download static resources, crawl the web, etc.

Finally, in the REST client use case there's a handy solution. Specify a cache-control header for each request. There's even a constant in CacheControl for no caching.

I agree that servers should be assuming their clients cache according to the RFCs, but as a client I don't think disabling heuristic freshness is violating that contract. In the event that the server sends explicit caching information, I would like to respect that - that's why I'm hesitant to apply a no-cache header to every request sent from our client.

OkHttp doesn't know it's a REST client out of the box, I'm just asking for a button to press to let it know that in this instance it'll only be used as a REST client. If a server is ambiguous about its caching settings it should choose to validate cached responses whenever possible (which is its prerogative based on the RFCs).

Gotcha. Yeah, the easiest way to do that is with either a Cache-Control: no-cache header that you manually add to your requests, or with an interceptor that does that for you.

I don’t wanna add an OkHttpClient-side setting because then we’d have to decide whether that has precedence over a request’s cache-control preferences. And I’m allergic to precedence rules.

I guess I'm not understanding why there's a precedence - the section of code that calculates the heuristic freshness of a cached response only applies when the client has already decided that the server didn't give any specific caching rules when the response was cached. I'm simply asking for a way to turn that optional calculation off. Won't adding Cache-Control: no-cache send requests even when the server has specified a max-age?
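
The concern in that last question can be made concrete with a small model. It assumes the rule from the HTTP caching RFC that a request carrying `no-cache` must not be served from a stored response without validation. The names are hypothetical and the model ignores everything except age and max-age.

```java
/** Hypothetical model of how a request-side no-cache interacts with a server max-age. */
class RequestNoCache {
    enum Action { SERVE_FROM_CACHE, VALIDATE_WITH_SERVER }

    static Action decide(boolean requestNoCache, long ageSeconds, long maxAgeSeconds) {
        // A no-cache request always goes to the server, whatever the server allowed.
        if (requestNoCache) {
            return Action.VALIDATE_WITH_SERVER;
        }
        // Otherwise the server's explicit lifetime decides.
        return ageSeconds < maxAgeSeconds ? Action.SERVE_FROM_CACHE : Action.VALIDATE_WITH_SERVER;
    }

    public static void main(String[] args) {
        // Server said max-age=3600 and the response is 10 seconds old.
        System.out.println(decide(true, 10, 3600));  // VALIDATE_WITH_SERVER
        System.out.println(decide(false, 10, 3600)); // SERVE_FROM_CACHE
    }
}
```

In this model a blanket request-side no-cache does override an explicit max-age, which is why it is a broader switch than turning off heuristic freshness alone.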

Understood.

I'm still of the opinion that a server that's sending a Last-Modified header is hinting what caches should do, even if it isn't as explicit as a Cache-Control header.

I’m reluctant to have knobs to disable heuristic caching because heuristic caching is a widely-deployed feature and where it's broken I think it's the servers that should be fixed.

I just want to make sure my client isn't misbehaving because there's no way to disable heuristic caching besides disabling caching completely on every request.

We were able to fix this issue because we own the service, but I can't guarantee that dependencies we don't own will always behave correctly.

I understand that heuristic freshness is a widely-deployed feature, and it's the server's responsibility to make sure clients don't use it, but in the event that a dependency we don't control begins sending incorrect headers, I don't want my application to stop working because of optional behavior in the REST client that we never want to happen.

AFAICT that's the only section of the HTTP caching RFC that uses the language MAY, so it doesn't seem too unreasonable to want a toggle for turning that functionality on and off.

Won't fix.
