@drexin and I have been looking into an issue where we hit a BufferOverflowException when using a cachedHostConnectionPool. Our software has its own concurrency limiter, which enforces that no more than 50 requests are active at once. Despite having max-connections and max-open-requests configured to accommodate this (50 and 64, respectively), we still saw this exception. Here's one specific example, for reference:
akka.stream.BufferOverflowException: Exceeded configured max-open-requests value of [64]. This means that the request queue of this pool (HostConnectionPoolSetup(REDACTED_HOSTNAME,443,ConnectionPoolSetup(ConnectionPoolSettings(50,0,5,64,1,30 seconds,ClientConnectionSettings(Some(User-Agent: akka-http/10.0.4),10 seconds,1 minute,512,None,<function0>,List(),ParserSettings(2048,16,64,64,8192,64,8388608,256,10485760,Strict,RFC6265,true,Full,Error,Map(If-Range -> 0, If-Modified-Since -> 0, If-Unmodified-Since -> 0, default -> 12, Content-MD5 -> 0, Date -> 0, If-Match -> 0, If-None-Match -> 0, User-Agent -> 32),false,akka.stream.impl.ConstantFun$$$Lambda$711/1719727892@7bd99559,akka.stream.impl.ConstantFun$$$Lambda$711/1719727892@7bd99559,akka.stream.impl.ConstantFun$$$Lambda$712/1404150776@2e95f241))),akka.http.scaladsl.HttpsConnectionContext@6fcfc89e,akka.event.MarkerLoggingAdapter@2c4f232d))) has completely filled up because the pool currently does not process requests fast enough to handle the incoming request load. Please retry the request later. See http://doc.akka.io/docs/akka-http/current/scala/http/client-side/pool-overflow.html for more information.
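For readers following along, here is a minimal sketch of how the two values mentioned above (50 and 64) could be set programmatically. It assumes akka-http 10.0.x; the object and method names are hypothetical, and the original report may well have set these through `akka.http.host-connection-pool` in configuration instead.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.settings.ConnectionPoolSettings

// Hypothetical helper, not taken from the original report.
object PoolSettingsSketch {
  // Starts from the system's configured defaults and overrides the two
  // values discussed in this issue.
  def forReport(system: ActorSystem): ConnectionPoolSettings =
    ConnectionPoolSettings(system)
      .withMaxConnections(50)  // upper bound on parallel connections to one host
      .withMaxOpenRequests(64) // akka-http requires this to be a power of 2
}
```

The resulting settings object can be passed as the `settings` argument when creating the pool.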
We've tracked this problem down to a behavior in akka-http when handling stream cancellation. In our use case, the entity streams were very large and/or slow, and could take many hours to fetch. In some cases we'd cancel these streams midway through, expecting that the HTTP request would also be cancelled and that the request "slot" would immediately become available for re-use.
It turns out that akka-http continues to stream the dataBytes on a cancelled connection, effectively using up request "slots" which the user might otherwise reasonably expect to be available to initiate a new request.
One specific line of code involved in this process is in OutgoingConnectionBlueprint.scala.
It seems this behavior is predicated on the cost of closing and re-opening a connection being higher than the cost of reading the rest of the data. There are many cases (such as ours) where that's not going to be the case.
This behavior would appear to make it impossible for a user to correctly manage the number of active requests within a cachedHostConnectionPool. Once the stream is cancelled, I have no further visibility into the activity within the pool: I can't know when that request has finally completed and that it's safe to issue another one. At best, a user must initiate a new request, be prepared to handle the BufferOverflowException, and then back off, but this doesn't seem very "reactive". Perhaps there is another solution we've overlooked?
Hi @jamesmulcahy and @drexin. Thanks a lot for this detailed investigation. I'll try to have a deeper look at this issue later this week.
IMO there's no reason to keep the connection open longer when the user has explicitly cancelled a response stream (at least with HTTP pipelining disabled). The user can always attach a Sink.ignore if the current behavior is actually intended. I agree we should fix that one.
Regarding the BufferOverflowException, instead of manually counting open requests, the recommended pattern is to attach a source in front of the connection pool which gets pulled when capacity is available. That way you should never get a BufferOverflowException. I understand that it is sometimes hard to supply a source up front (instead of just dispatching requests using Http().singleRequest for example). How do you use the pool?
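A minimal sketch of the "source in front of the pool" pattern described above, along the lines of the example in the akka-http client documentation. It assumes akka-http 10.0.x; the class name, host and queue size are hypothetical. Requests are offered to a bounded queue that the pool pulls from only when it has capacity, so overflow shows up as a `Dropped` offer result rather than a BufferOverflowException from the pool.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, HttpResponse}
import akka.stream.{Materializer, OverflowStrategy, QueueOfferResult}
import akka.stream.scaladsl.{Keep, Sink, Source}

import scala.concurrent.{Future, Promise}
import scala.util.{Failure, Success}

// Hypothetical wrapper around a cached host connection pool.
final class QueuedPoolClient(host: String, queueSize: Int)(
    implicit system: ActorSystem, mat: Materializer) {
  import system.dispatcher

  // Each request carries a Promise so the response can be routed back to its caller.
  private val pool =
    Http().cachedHostConnectionPoolHttps[Promise[HttpResponse]](host)

  private val queue =
    Source.queue[(HttpRequest, Promise[HttpResponse])](queueSize, OverflowStrategy.dropNew)
      .via(pool)
      .toMat(Sink.foreach {
        case (Success(response), promise) => promise.success(response)
        case (Failure(cause), promise)    => promise.failure(cause)
      })(Keep.left)
      .run()

  // Enqueues a request; the returned Future fails if the queue is full or closed.
  def request(req: HttpRequest): Future[HttpResponse] = {
    val promise = Promise[HttpResponse]()
    queue.offer(req -> promise).flatMap {
      case QueueOfferResult.Enqueued    => promise.future
      case QueueOfferResult.Dropped     => Future.failed(new RuntimeException("Queue full, retry later"))
      case QueueOfferResult.Failure(ex) => Future.failed(ex)
      case QueueOfferResult.QueueClosed => Future.failed(new RuntimeException("Queue closed"))
    }
  }
}
```

Callers still have to consume or discard each response entity, otherwise the pool slot stays occupied. This pattern bounds the queue in front of the pool; it does not by itself address the cancellation behavior reported in this issue.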
Hey @jrudolph,
we have some deduplication logic and other things in front and we use the pool with a Source.single. In our case it was much easier to do it that way. We could of course change that, but the problem with the connections not being closed still remains.
The easiest way to reproduce this is to request a large file and then cancel the response.entity.dataBytes source (i.e. connect it to a Sink.cancelled). The connection will stay open and keep receiving data until it has read the whole entity. Using an Http().outgoingConnection shows the same behavior.
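Roughly, the reproduction steps above could be sketched like this. It assumes akka-http 10.0.x and uses Http().singleRequest for brevity rather than our actual Source.single setup; the object name is hypothetical, and the URI should point at any large, slow download.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, HttpResponse}
import akka.stream.Materializer
import akka.stream.scaladsl.Sink

import scala.concurrent.Future

// Hypothetical reproducer sketch, not the code from our application.
object CancelRepro {
  // Requests `uri` and cancels the response body as soon as the response arrives.
  def fetchAndCancel(uri: String)(
      implicit system: ActorSystem, mat: Materializer): Future[HttpResponse] = {
    import system.dispatcher
    Http().singleRequest(HttpRequest(uri = uri)).map { response =>
      // Sink.cancelled cancels its upstream immediately on materialization.
      response.entity.dataBytes.runWith(Sink.cancelled)
      response
    }
  }
}
```

After calling fetchAndCancel, watching the connection (for example with a network monitor) should show whether data keeps flowing after the cancel.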
Maybe we could disable this behavior in case we set Connection: close or have a different option to disable this?
I see this as a bug. I don't think there's a good reason to drain slow response bodies indefinitely.
Picking this up now - thanks for the great detective work guys.
We looked into this for a few days and concluded that we're not able to reproduce it.
Some attempts can be seen in https://github.com/akka/akka-http/pull/1058; however, the test there is wrong.
@jrudolph will submit https://github.com/akka/akka-http/compare/master...jrudolph:pull/1058?expand=1 with additional tests so we have coverage for the exact behaviour that is both desired and seems to work right now.
We'll keep this issue open for now. If you have some way of reproducing this as a scalatest, that would help a ton (see https://github.com/akka/akka-http/compare/master...jrudolph:pull/1058?expand=1, which shows that it seems to work correctly already).
Possible reproducer given in #1023
I wonder if we should still treat #1023 as a duplicate / reproducer, as the observations seem to be slightly different.
Closing for now for two reasons: we couldn't reproduce the exact issue and similar issues will be fixed with the new client pool.