Apollo-ios: HTTPNetworkTransport's URLSessionClient possible [ lost calls / memory leak ]

Created on 13 Aug 2020  ยท  11Comments  ยท  Source: apollographql/apollo-ios

Hi team!
First of all, thanks for an awesome framework that makes the lives of so many of us that much easier ๐Ÿ˜„

We have run into a [lost calls / memory leak] issue.
We believe it is closely linked to issue https://github.com/apollographql/apollo-ios/issues/1292.

SETUP

Dependency manager: SPM
Apollo-iOS version: 0.30.0
Setup: Instanced (not a Singleton)

Also we provide a URLSessionClient that runs on a background callbackQueue to the HTTPNetworkTransport:

URLSessionClient(
    sessionConfiguration: URLSessionConfiguration.background(withIdentifier: "background"),
    callbackQueue: .some(operation)
)

INTENDED OUTCOME

We expect that the calls of all instanced ApolloClients return successfully.
We also expect that ApolloClients that are not retained in our code get deallocated from the system memory.

ACTUAL OUTCOME

We first noticed that for some reason, new instances of ApolloClient (those created after the first instance is not retained in our code anymore) never return from their calls This has made us notice that the created URLSessionClients, HTTPNetworkTransports and ApolloClients are never deallocated properly causing a memory leak.

REPRODUCTION STEPS

1 - Make a call with the setup described above.
2 - Remove the retained instance of the ApolloClient.
3 - Create a new ApolloClient and make a new call.
4 - Call does not come back / memory leak occurs.
5 - Repeat from step 2 to create a new unwanted retained instance of URLSessionClient, HTTPNetworkTransport and ApolloClient.

OUR CONCLUSIONS

This part should be taken with a grain of salt as we are not Apollo experts. That said, it is the only conclusion that survived our tests. Having found this https://stackoverflow.com/questions/62612056/why-dispatchqueue-global-still-alive-after-deinit on StackOverflow, we think that the problem stems from the queue keeping alive the instance of the URLSession created in URLSessionClient in some fashion.

This causes multiple retains climbing up the dependence graph of:

  • URLSession
  • URLSessionClient
  • HTTPNetworkTransport
  • ApolloClient
  • Objects in our code that calls ApolloClient ...

OUR SOLUTION

Adding the following to HTTPNetworkTransport fixes both our issues:

deinit {
  self.client.session.invalidateAndCancel()
}

We understand that this is most probably not the solution you will want to implement but it might help as a starting point to formulate something that will not jeopardize others that use your framework.

OUR INVESTIGATION

It should be said that we took the liberty to try this as there was code in the URLSessionClient for deinit that clears all tasks. We surmised that clearing the tasks in that situation did not cause a problem for calls in progress, meaning that clients of the framework should not expect the calls to complete if they stop holding their instance of ApolloClient.

But we noticed that the deinit in the URLSessionClient was never called even when its parent HTTPNetworkTransport deinit was called. We then noticed that the HTTPNetworkTransport was never deallocated even though its deinit was called.

That made us realize that something further down the line was probably holding things recursively. Implementing our solution, everything gets deallocated and for some reason, further ApolloClients return their calls successfully.

CONCLUSION

Advice on how to go forward from here would be greatly appreciated.
Thanks in advance for your help. We stand ready to answer any of your questions should you have some.

networking-stack

Most helpful comment

@EggYoke The sample you sent was very helpful, thank you. The underlying problem was a retain cycle as outlined in #1366.

The secondary problem is that you're attempting to create multiple background sessions with the same identifier at the same time, which the system won't let you do. I found that after the changes in #1366 were applied, I was able to get things to work with a background session by just using UUID().uuidString as the identifier, as this prevents any collisions in session name.

Honestly, I wouldn't really recommend using background sessions for everything as it's kind of overkill, but if you need to for reasons I'm missing, that should at least unblock you.

All 11 comments

Hi! Thanks for the extremely detailed report. Very, very odd that deinit would get called without the HTTPNetworkTransport actually being deallocated - usually the telltale sign of a memory leak is that deinit never gets called at all.

Is this only happening with background sessions, or is it also happening with normal sessions? Or are you using a background configuration for all sessions?

Hi again!

To answer your question, we are using a background configuration for all sessions. As for normal sessions, we do see some memory leaks but they do not impede calls. Maybe the following will be useful.

We tried many configurations, mainly:

  • not using a URLSessionClient
  • using a URLSessionClient with a .background configuration and a callbackQueue .main
  • using a URLSessionClient with a .background configuration and a callbackQueue .some

Only the configuration of .background with a callbackQueue .some is suitable for us. Otherwise, some computations happen on the main thread when Apollo decodes our objects which brings the application to a crawl. (This is probably due to a specific situation in our code).

OUR TESTS

Here are our results with those configurations in both instance setup and singleton setup. The main things we got out of it are that with an instance setup:

  • Calls do not come back if we provide a URLSessionClient
  • We always see URLSessionClients being kept in memory. In instruments, some are identified as leaks, some are not.

We hope this is clear and it did not confuse you to what's happening. As always, we are here if you need more info ๐Ÿ˜ƒ.

No URLSessionClient

Instance

โœ… Calls come back
โœ… Does not retain objects holding ApolloClient instance in our code
๐Ÿ›‘ Retain orphaned UrlSessionClients and shows a leak in instruments

Singleton

โœ… Calls come back
โœ… Does not retain objects holding ApolloClient instance in our code
โœ… Does not retain orphaned UrlSessionClients

URLSessionClient with callback queue .main

Instance

๐Ÿ›‘ Calls do not come back
๐Ÿ›‘ Retains objects holding ApolloClient instance in our code
๐Ÿ›‘ Retains orphaned UrlSessionClients but does not show a leak in instruments

Singleton

โœ… Calls come back
โœ… Does not retain objects holding ApolloClient instance in our code
โœ… Does not retain orphaned UrlSessionClients

URLSessionClient with callback queue .some

Instance

๐Ÿ›‘ Calls do not come back
๐Ÿ›‘ Retains objects holding ApolloClient instance in our code
๐Ÿ›‘ Retains orphaned UrlSessionClients but does not show a leak in instruments

Singleton

โœ… Calls come back
โœ… Does not retain objects holding ApolloClient instance in our code
โœ… Does not retain orphaned UrlSessionClients

I'm going to take a look deeper at this on Monday but one thing to be aware of is that there is no such thing as "No URLSessionClient - a default one will be created for you if you're using HTTPNetworkTransport. It uses .main as its default queue, so I'm not sure what would be different between that and passing in your own.

Additionally: How are you setting up the instances? Is it possible that ARC is clobbering them because they are not properly retained somewhere and one of the assorted weak self calls winds up being nil?

Thanks!

For the URLSessionClient, we figured that one was created on the Apollo framework side if not provided. Seeing the disparity in behaviour, we thought of sending you our observations on that as well. Maybe the difference between providing ours compared to the one by default is that we always put a URLSessionConfiguration.background instead of the .default one provided by Apollo.

As for how we retain instances, if you are talking about the URLSessionClient, we create it and give it to the HTTPNetworkTransport. It is all done in the same method and we do not retain either the URLSessionClient or the HTTPNetworkTransport. If you are talking about the ApolloClient, we tested swapping the ApolloClient for a mock that implements the same methods. We see no retain on our side.

So where do you retain the ApolloClient which you've handed the HTTPNetworkTransport to? Basically, I have a suspicion that the whole stack is getting hammered by ARC before the call returns if the instance is not being held onto somewhere. That would explain why in the Instance case listed above calls don't come back.

Again, if you have some sample code to share, that would be most helpful.

The ApolloClient is retained by a repo object. When the object is instantiated it is injected with a new ApolloClient instance and uses it to make its calls.

The first time a user navigates to where a call is needed, we create a repo and start a call, it comes back correctly. The second time a user navigates to where a call is needed, we create a new repo and start a call, it never comes back. We keep those instances alive until they are not needed anymore or the user navigates away. In any case, when we create the second ApolloClient instance, nothing returns from the call even if held for over 10 minutes.

I was able to confirm that calls not coming back happens if we provide a URLSessionConfiguration that is .background.

Also, sorry about the late notice but in debug environment, if we provide a URLSessionConfiguration that is .background, we often hit line 225 of the URLSessionClient:
assertionFailure("No data found for task \(dataTask.taskIdentifier), cannot append received data")

I was able to replicate easily with the iOSTutorial project. What is the easiest way to send you that project amended to reproduce?

Email! ellen at apollographql dot com.

@EggYoke The sample you sent was very helpful, thank you. The underlying problem was a retain cycle as outlined in #1366.

The secondary problem is that you're attempting to create multiple background sessions with the same identifier at the same time, which the system won't let you do. I found that after the changes in #1366 were applied, I was able to get things to work with a background session by just using UUID().uuidString as the identifier, as this prevents any collisions in session name.

Honestly, I wouldn't really recommend using background sessions for everything as it's kind of overkill, but if you need to for reasons I'm missing, that should at least unblock you.

This has shipped with 0.31.0 - if you're still having issues with that version, please open a new issue and we'll try to figure it out. Thank you!

Thank you very much for the fast response.
Our calls come back successfully and none of our objects are retained anymore.
This fixed all our issues, awesome job! ๐Ÿ‘

Was this page helpful?
0 / 5 - 0 ratings

Related issues

AnthonyMDev picture AnthonyMDev  ยท  4Comments

plm75 picture plm75  ยท  4Comments

Dmurph24 picture Dmurph24  ยท  4Comments

wnagrodzki picture wnagrodzki  ยท  4Comments

marcoreni picture marcoreni  ยท  3Comments