So while Firestore supports offline persistence (which is great), it seems to favor an online-first paradigm.
In order to increase app responsiveness, I suggest that the js database first emits local data, and then remote changes (if stale).
The problem we're currently experiencing is that our (Angular) app feels sluggish because of all the waiting for data. Chrome's network inspector seems to confirm that.
I hope I'm understanding how it currently works; the documentation on the offline layer is pretty brief.
Related issue: https://github.com/angular/angularfire2/issues/1243
As an addendum, would it be possible to enable the offline only mode I see in the source code? I see that there are enableNetwork/disableNetwork in the source, but I don't think they are publicly exposed.
Yesterday I had some issues with Firebase connectivity. It was ridiculously slow; I couldn't navigate my application at all because it would take up to 2 minutes to pull data. Having offline first would be really nice in these kinds of situations.
I contacted Firebase support and they said it was some sort of query bug. Apparently doing write operations resolves it.
What is the status of this issue? I see this as a critical feature for all the hybrid developers who would like to build really independent applications.
Please let us at least briefly know what you think about this.
@larssn Can you please provide a repro or more details about what exactly you're seeing? When you register an onSnapshot() handler, Firestore should immediately raise an event with any cached data and then raise a subsequent event once we've gotten data from the server.
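For anyone who wants to check this in their own app, a minimal sketch of such a listener follows. `watchWithSource` and `report` are hypothetical names introduced here for illustration, and `ref` is assumed to be a DocumentReference or Query. Note that if the server data is identical to the cached data, the second event may only fire when metadata changes are requested through the listen options.

```javascript
// Sketch only: `watchWithSource` and `report` are illustrative names, not SDK APIs.
// `ref` is assumed to be a Firestore DocumentReference or Query.
function watchWithSource(ref, report) {
  // onSnapshot() returns an unsubscribe function; hand it back to the caller.
  return ref.onSnapshot(function (snapshot) {
    // metadata.fromCache is true while the snapshot comes from the local cache
    // and has not yet been confirmed against the server.
    report(snapshot.metadata.fromCache ? "cache" : "server", snapshot);
  });
}
```

Calling `watchWithSource(ref, (source) => console.log(source))` should make it easy to see which events come from the cache and which from the server.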
@mikelehen Sorry for the late reply.
I think onSnapshot is working as intended. Maybe we should convert this issue into a general performance discussion.
A few observations regarding performance:
@larssn Thanks for following up!
It is true that if there are no cached results for a query, onSnapshot() won't fire with an empty snapshot until it hears back from the server (or determines that the client is offline and so can't get data from the server). This is because we believe it'd be confusing to immediately give you an empty snapshot every time you do a new query you haven't done before (which will likely have no cached results).
The assumption is that you should mostly only hit this for "cold scenarios" (where you've never done the query before). Once a user has used your app a bit, all of the queries should have cached results and therefore get instant results. If that isn't the case for your app, I'd be curious to know more about your use case.
We also assume that if there are no server results anyway, there won't be a huge UX difference between us immediately giving you an empty snapshot, and us waiting a small period of time (e.g. 1 second) to check with the server before giving you the empty snapshot. If that doesn't make sense for your app, again I'd be curious to hear more about your use case and UX design.
As for the stark difference in performance with network turned off, again I'd like to hear more. If you're using onSnapshot() listeners, then it shouldn't make any difference (except for empty snapshots, which are covered above). If you're using get(), then you will indeed see a difference, but this is necessarily true because get() can't give you out-of-date cached data, since it has no way to update you once fresh data arrives. We're looking into exposing options on get() so you can e.g. opt into just reading the data from cache, but I don't expect this will solve your use case since I assume you do want your app to get up-to-date data if it can.
FYI- I'm fine with having a bit of performance discussion here, but anything concrete to be addressed will need to go into a new issue. This one has mutated several times since being opened, making for an unclear history.
@malcolmdeck I would love to show you my app with the difference. Can't really share the code publicly, but if the network is spotty, startup takes a year. If I am offline, it's super fast.
Also when on good network, eg fast wifi, startup takes significantly longer when offline is enabled.
If you want I am more than happy to do a hangout or run some benchmarks for you.
Hi all - I've been pondering how to provide a good user experience with Firebase when users may be offline or online while updating the same documents. I'm trying to understand how to avoid someone making changes whilst offline, staying offline for several days, and then, when coming back online, overwriting more recent changes others may have made. I had a response on this in the forum here: https://groups.google.com/forum/m/#!topic/google-cloud-firestore-discuss/9hkRhsLvU-I
Wondering how this issue (user experience re conflict resolution) relates/crosses-over with the offline discussion you are having?
@izakfilmalter It would be helpful if you could be more specific about what exactly is taking longer and provide actual timing information rather than "a year." :-)
Your comment about startup taking forever on spotty networks makes me wonder if you are doing get() calls. We implement get() to get up-to-date data from the backend when online, else fall back to cache. So that could explain the issue and could likely be worked around by using onSnapshot() instead. We're also reworking some timeouts in the future which could help.
As for startup taking longer with offline enabled, there's some known performance work in the pipeline. In particular, as you build up a large number of offline documents, even small queries may become slower as we do not yet implement indexing in the clients, so it has to scan all documents in the collection you're querying against. So that may explain what you're seeing.
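To gather the kind of timing information asked for above, something like the following could be used. This is a rough sketch: `timeSnapshots` and `onTiming` are hypothetical names, `ref` is assumed to be a DocumentReference or Query, and the `now` parameter exists only so the clock can be swapped out.

```javascript
// Sketch only: `timeSnapshots` is a hypothetical helper, not part of the SDK.
// It reports how long the first cached and the first server snapshot take to arrive.
function timeSnapshots(ref, onTiming, now = Date.now) {
  const start = now();
  const seen = { cache: false, server: false };
  // Returns the unsubscribe function from onSnapshot().
  return ref.onSnapshot(function (snapshot) {
    const source = snapshot.metadata.fromCache ? "cache" : "server";
    if (seen[source]) return; // only the first event per source is reported
    seen[source] = true;
    onTiming({ source: source, elapsedMs: now() - start });
  });
}
```

Logging the reported values on a good network, on a slow network, and fully offline would give concrete numbers to attach to a new issue.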
We implement get() to get up-to-date data from the backend when online, else fall back to cache.
@mikelehen When you say it falls back to cache when not online, could you please tell me under what condition the fallback happens? From what I observe, even when I am offline, a get call sends multiple network requests to fetch data before returning cached data. It seems it doesn't immediately detect offline status (e.g. using navigator.onLine) to fall back.
Fallback to cache happens if the client has decided that it is offline. The client will decide it's offline if it has failed to reach the backend at least twice.
In general, "offline" is a fuzzy state. You might literally have no network connection at all. Or you might be in an elevator. Or you might be on a really slow 3G network. All of these will appear slightly differently.
To further this discussion, it would be very helpful to get more details on what state your network is in (are you disabling the network 100%? Or on a broken / slow network? etc.) and what sort of delays you are seeing.
Hey guys, I am using Firestore in a React app with persistence enabled. Persistence is working fine and the application is damn fast if you are fully offline or you have crazy fast internet. The thing is that if you have slow internet, onSnapshot seems to load forever. It would be awesome if onSnapshot fired first with cached data and then fired again after getting in sync with the online data.
@mono424 That should be how it works already. That's how it works for us:
2 emissions: 1 fairly quick from IndexedDB, and another when the server replies
@larssn My bad, offline first is working like charm! Cheers! Firestore is no joke :p
@mikelehen
About falling back to cache when using get() in offline mode: it never falls back on my Mac laptop, using Chrome and the latest SDK (4.9.1).
Without enablePersistence -> I disable the wifi and try get() ->
the console shows many connection errors, but it never falls back to cache.
With enablePersistence -> I disable the wifi and try get() ->
- if I did a get() earlier while I was online -> I get the previously cached data from IndexedDB
- if I didn't do a get() earlier while I was online -> the console shows many connection errors and I get the data once back online, but it never falls back to cache.
I was going to open a ticket about it, but just read your comments mentioning this.
Thanks
@mohshraim That doesn't sound like the expected behavior. Can you open a new issue with a code snippet and details on what version of the SDK you're using?
Thanks!
-Michael
Just stumbled upon this thread while debugging multi-second delays in Firestore's get queries while offline. I couldn't just blindly switch everything to onSnapshot in my app because I don't actually want some values to be continuously updated in real time after the initial page load. Instead, I wanted the fairly standard offline-first strategy of cache-first, a.k.a. cache then network. That is, a single request to the cache and a single request to the network.
I wrote this little helper function to replace my get calls. Dropping it here in the hope that someone else will find it useful.
function cacheFirstGet(ref, handlerFunc, timeout = 10000) {
  let options = {};
  if (ref instanceof firebase.firestore.DocumentReference) {
    options.includeMetadataChanges = true;
  } else if (ref instanceof firebase.firestore.Query) {
    options.includeQueryMetadataChanges = true;
  } else {
    throw new Error("Input must implement DocumentReference or Query");
  }
  // tracks whether the cached result has already been delivered
  let cacheIsLoaded = false;
  let unsubscribe = ref.onSnapshot(options, function(snapshot) {
    if (!snapshot.metadata.fromCache) {
      // when we receive a response from the network we can stop listening
      unsubscribe();
      handlerFunc(snapshot);
      return;
    }
    // deliver the cached result at most once
    if (!cacheIsLoaded) {
      cacheIsLoaded = true;
      handlerFunc(snapshot);
    }
  });
  // handles unsubscribing when the network never responds
  window.setTimeout(unsubscribe, timeout);
}
Considering that the PWA checklist says to "Use cache-first responses wherever possible," I was surprised that there wasn't an easier way to do this in firebase. First class support for this important paradigm would be awesome. Something along the lines of onSnapshot({cacheFirst: true}, handlerFunc) or preferably get(handlerFunc) could just have this behavior by default if that isn't too confusing.
P.S. If there's a better way than a timeout to determine that the network request failed, or any other suggestions/improvements, fork or comment on the gist and I'll update the copy in this comment.
Maybe there's something I didn't understand, but why do we talk about "fallback to cache" if offline? The idea of offline first is to serve cached data first regardless of your network status, right? We use observables, so can't we just return cached data first and then remote data as soon as we get it?
@mono424 You say it now works like a charm for you ? Did you change something to make it work ?
Sorry if I've missed something
@mparpaillon For me it's now working exactly like you said. No matter if I am online or offline, the cached data is served first. It failed at first because I implemented it wrong: I used .get() instead of an onSnapshot() handler. With onSnapshot() it's all working fine for me now. Hope this helps you in any way :)
@zevdg @mparpaillon @mono424
Once https://github.com/firebase/firebase-js-sdk/pull/463 is done, you will be able to control whether a request is served from the cache or from the server.
@zevdg I'd be curious if you could elaborate on "I couldn't just blindly switch everything to onSnapshot in my app because i don't actually want some values to be continuously updated in real time after the initial page load."
It sounds like your app can tolerate 1 "realtime" update after the initial (cache) load as long as it's within 10 seconds, but you don't want additional updates beyond that? Could you share a little about your app and intended user experience? Our premise is that generally just embracing realtime updates (even beyond 10 seconds) will result in a better user experience and simplify your code (especially if you're already writing it to accept multiple calls to your handlerFunc, as you are).
That said, if your UX requires something else, wrapping onSnapshot() as you have done is totally reasonable and likely the way to go.
Sure thing. My use case is a type of social network. By that, I mean there is a feed that shows a list of many posts, and those posts can be drilled into to show more detail. Both the feed and post detail page benefit from a cache-first strategy.
For a feed, blindly loading every new post as it is made would be disruptive at high volumes. The strategy of only loading once at page load makes sense and is used by reddit and stack overflow. The most "real-time" you would want this to be is what twitter and facebook do where content is loaded in the background but not shown, and the user is presented with a "Show 5 new posts" button to reveal them.
For a post detail page, imagine the number of likes on a post. For a small number of users, updating the like count in real time would be a good user experience. However, imagine doing that on a popular tweet. The number would be constantly changing. This would be visually disruptive and also very expensive. The most "real-time" you would ever want this to be would be periodically updating the number on a timer (could be implemented either via push or pull).
So in conclusion, for both of these cases:
I'm building an MVP right now, so simple and good is better than complex and ideal. However, I still want a good offline experience, so cache-first gets are what I want.
Edit: For the curious, my MVP is at appleseed.vote. (You have to log in with your google account at the bottom of the page.)
Thanks @zevdg. That's very well explained and all makes perfect sense. I agree with your assessments of ideal behavior vs. simplicity, etc. as well. :-)
FWIW, for both "full real-time" and "pseudo real-time" you should hopefully be able to implement them via onSnapshot(), either directly or via a tiny wrapper (e.g. to store the latest snapshot but not display it until the user clicks "Show 5 new posts").
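A tiny wrapper of the kind described could look roughly like this. It is only a sketch: `bufferedFeed`, `render`, and `onPending` are hypothetical names, and `query` is assumed to be a Firestore Query for the feed.

```javascript
// Sketch only: `bufferedFeed` and its callbacks are illustrative names, not SDK APIs.
function bufferedFeed(query, render, onPending) {
  let shown = false;  // has the initial snapshot been rendered yet?
  let pending = null; // latest snapshot received but not yet displayed
  const unsubscribe = query.onSnapshot(function (snapshot) {
    if (!shown) {
      // Render the first snapshot right away.
      shown = true;
      render(snapshot);
      return;
    }
    // Hold later snapshots back and let the UI offer a "Show new posts" button.
    pending = snapshot;
    onPending(snapshot);
  });
  return {
    // Call this from the button's click handler.
    showPending: function () {
      if (pending) {
        render(pending);
        pending = null;
      }
    },
    unsubscribe: unsubscribe,
  };
}
```

The caller decides how to word the button (for example, by comparing the pending snapshot with what is currently displayed); the wrapper only keeps the latest snapshot until the user asks for it.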
So the "loading data once on page load" is the interesting case here and is what get() is meant to solve, but it sounds like you're actually wanting to "load data twice on page load" (first from cache, then from server) which is also a reasonable UX (and can be achieved via onSnapshot as you've done), but not something we've specifically optimized for...
I don't think it'd be possible to have get() have 2 results (we'd have to return 2 promises or something). So I think if we were to add this to our API, it'd probably be an onSnapshot() option to automatically unsubscribe after the first non-cache result, e.g. onSnapshot({ until: 'from-server', timeout: 10000 }, snap => ...) Does that sound right?
Anyway, we'll keep this in mind for future improvements. Thanks for explaining the use case.
Yeah, that API would certainly do the trick. However, it is a bit verbose, and until: 'from-server' implies that if the cache were to update several times before the network response returns, the callback would get fired on each of those updates. The standard cache-first pattern implies at most 2 responses.
More importantly, from an application developer's mindset, cache-first is much more a variant of get than it is of onSnapshot. When converting an existing application to be offline-first, or when writing a new one, you'd use a cache-first get in exactly the places you would have used a normal get in an online-first app. Furthermore, once #463 lands, the saner implementation of my cache-first helper function would be as 2 get calls. Even the timeout option is weirdly out of place in onSnapshot while it would make perfect sense as an option to get. All this makes me suspect that overloading get would be more intuitive to users than overloading onSnapshot.
You _could_ return multiple promises from get, but I agree, that would be weird. What I was suggesting was that there would be 2 ways to call get: with or without a callback (like once from the RTDB). If you passed a callback function into get instead of using the promise, then it would have cache-first behavior by default. Those who prefer online first behavior would continue to use the promise api. In code, this would look like
// current API, would continue to work as-is
ref.get().then(snapshot => {
  // normal online get with fallback to cache, only called once
  console.log("data", snapshot.data())
});
// my proposed alternative invocation
ref.get(snapshot => {
  // cache-first get, may get called up to 2 times
  console.log("is-from-cache", snapshot.metadata.fromCache)
  console.log("data", snapshot.data())
});
This would be the simplest API in my opinion. It could certainly cause confusion as people might expect these two invocations to have 100% identical behavior. That said, if offline-first becomes an industry standard best practice, then you'll want to make using this pattern feel as natural as possible.
This variation might be slightly less confusing alongside the existing API. It's more verbose but also nicely explicit:
ref.get(snapshot => console.log("cache-only response", snapshot.data()))
  .then(snapshot => console.log("network-only response", snapshot.data()))
With any of these proposals, make sure you document what happens if the network responds before the cache. Probably, it would be best not to fire the callback for the cache response in this case.
Finally, instead of overloading get or onSnapshot, you could just make a new function called load or something that does a cache-first get. That might be the best option if you want to enable this pattern without burying it as an option or causing confusion.
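For illustration, a cache-first helper built from 2 get calls might look like the sketch below. `cacheFirstLoad` is a hypothetical name, and the sketch assumes a version of the SDK in which get() accepts a `source` option of "cache" or "server", which is the kind of control discussed in #463. It skips the cache callback if the network answers first, as suggested above.

```javascript
// Sketch only: `cacheFirstLoad` is a hypothetical helper, not an SDK function.
// Assumes ref.get() accepts { source: "cache" } and { source: "server" }.
function cacheFirstLoad(ref, handlerFunc) {
  let serverResponded = false;
  ref.get({ source: "cache" }).then(
    function (snapshot) {
      // Skip the cached result if the server has already answered.
      if (!serverResponded) handlerFunc(snapshot);
    },
    function () {
      // A cache read may fail (e.g. nothing cached yet); just wait for the server.
    }
  );
  // The returned promise settles with the server result, so callers can
  // still handle a failed network request themselves.
  return ref.get({ source: "server" }).then(function (snapshot) {
    serverResponded = true;
    handlerFunc(snapshot);
    return snapshot;
  });
}
```

With this shape, handlerFunc is called at most twice, and no timeout is needed inside the helper because a failed server request surfaces as a rejection of the returned promise.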
Sorry about the novel. :sweat_smile:
Hey all, I'm closing this issue as it contains a collection of miscellaneous bug reports and suggestions and I'm not sure what, if anything, is still relevant and actionable. If you have a specific bug or feature request related to Firestore's offline persistence or use of the cache, please open a new issue with details and we'll be happy to investigate. (And my apologies if anything fell through the cracks with this issue)