Nightly 200913 06:02 (Build #2015763635)AC: 59.0.20200911130559,d974933d4GV: 82.0a1-20200908095243AS: 61.0.13I can reproduce this pretty reliably, so captured some adb logs. It looks like Fenix does receive a push notification about the first tab, but is unable to process it:
09-15 10:26:18.305 20686 20770 E App : Native push layer is not yet initialized.
09-15 10:26:18.308 20686 20770 I [INFO][Leanplum]: [com.leanplum.LeanplumPushFirebaseMessagingService::onMessageReceived::4]: Received: {con=aesgcm, enc=salt=qNfO_3zmyYve9BBC7Xb0eg....[crypto blah-blah elided]
Whereas when I send a second tab, I see it handling the push message and triggering FxA to fetch new sent tabs (of which it finds two, because the first one wasn't fetched above):
09-15 10:28:27.182 22508 22568 I AutoPushFeature: New push message decrypted.
09-15 10:28:27.182 22508 22749 I [INFO][Leanplum]: [com.leanplum.LeanplumPushFirebaseMessagingService::onMessageReceived::4]: Received: {con=aesgcm, enc=salt=WOygR2YokK-...[blah blah base64 blah blah]
09-15 10:28:27.182 22508 22568 D push_ffi: push_dispatch_info_for_chid
09-15 10:28:27.192 22508 22568 D push_ffi: push_decrypt
09-15 10:28:27.212 22508 22568 I AutoPushObserver: Received new push message for fxa_push_scope_c0934fc832ed45caa6afc5d9f0633d35
09-15 10:28:27.214 22508 22557 I FxaDeviceConstellation: Executing: processing raw commands
09-15 10:28:27.216 22508 22557 D fxaclient_ffi: fxa_handle_push_message
[...]
09-15 10:28:27.453 22508 22557 I fxa_client::device: Handling 2 messages
@rfk that's interesting! Do you have any more logs I could ready from the first failed attempt? The "push layer not initiailized" is a bit generic and I'm curious what the call was that lead to that error (alternatively I'll try to reproduce this later).
This might even be the cause of a recently filed issue as well: https://github.com/mozilla-mobile/fenix/issues/12351
Do you have any more logs I could ready from the first failed attempt?
Yep, I'll shoot you an email just in case of PII (although I'm not too worried about it in practice).
This logs are super interesting, in that we can see Leanplum (of all things!) working and even logs "Firebase Messaging is setup correctly."
I have a suspicion that this is related to an old bug that Grisha had seen in Fennec and how we setup push with GCM/FCM: we can't rely on the Firebase service being initialized with _our constructor_ where the native layer is init'd. It probably does something like using the default constructor. So when a push message comes in, it init's our firebase receiver with the default constructor (no push layer setup), then invokes onMessageReceived which fails immediately.
I'm going to pick this up next week to verify this theory by initializing the native layer if we're not already at the message receiving part.
I needed to make this bug reproducible before I could look into a fix. I setup a sample app that integrated with Firebase Push Messaging and showed notifications on the screen.
As your can see in the logs below, the AutoPushFeature is initialized after reboot automatically by the Firebase service (so that's good!) although
$ pidcat | grep push
UsbDeviceManager D push notification:Charging this device via USB
GservicesProvider I Gservices pushing to system: true; secure/global: true
MediaSessionRecord V Removing dead callback in pushSessionDestroyed, this=au.com.sh
tamodel.workqueue.WorkQueueWorkerShim, expire_wap_push_si_mess
whatsapp:push' type='get' to='s.whatsapp.net'><config/></iq>
PushConfig D Creating push configuration for autopush.
AutoPushFeature I Checking validity of push subscriptions.
RustNativeSupport D findMegazordLibraryName(push, 63.0.0
PushConfig D Creating push configuration for autopush.
AutoPushFeature I Checking validity of push subscriptions.
RustNativeSupport D findMegazordLibraryName(push, 63.0.0
AutoPushFeature I Checking validity of push subscriptions.
RustNativeSupport D findMegazordLibraryName(push, 63.0.0
AutoPushFeature I Checking validity of push subscriptions.
RustNativeSupport D findMegazordLibraryName(push, 61.0.13
istry push
istry push
MediaSessionRecord V Removing dead callback in pushSessionDestroyed, this=com.skybl
AutoPushFeature I New push message decrypted.
push_ffi D push_dispatch_info_for_chid
push_ffi D push_decrypt
push::crypto D No sub-value found for salt
wNzcwMDQmbGltaXQ9MSJ9fQ","scope":"fxa_push_scope_8608e2f71cb54
fxaclient_ffi D fxa_handle_push_message
AutoPushFeature I New push message decrypted.
push_ffi D push_dispatch_info_for_chid
D push_decrypt
push::crypto D No sub-value found for salt
wNzcxNjkmbGltaXQ9MSJ9fQ","scope":"fxa_push_scope_8608e2f71cb54
AutoPushObserver I Received new push message for fxa_push_scope_8608e2f71cb547cf9
fxaclient_ffi D fxa_handle_push_message
I ran a similar test with Fenix:
$ pidcat | grep push
UsbDeviceManager D push notification:Charging this device via USB
GservicesProvider I Gservices pushing to system: true; secure/global: true
MediaSessionRecord V Removing dead callback in pushSessionDestroyed, this=au.com.sh
tamodel.workqueue.WorkQueueWorkerShim, expire_wap_push_si_mess
whatsapp:push' type='get' to='s.whatsapp.net'><config/></iq>
PushConfig D Creating push configuration for autopush.
AutoPushFeature I Checking validity of push subscriptions.
RustNativeSupport D findMegazordLibraryName(push, 63.0.0
PushConfig D Creating push configuration for autopush.
AutoPushFeature I Checking validity of push subscriptions.
RustNativeSupport D findMegazordLibraryName(push, 63.0.0
AutoPushFeature I Checking validity of push subscriptions.
RustNativeSupport D findMegazordLibraryName(push, 63.0.0
AutoPushFeature I Checking validity of push subscriptions.
RustNativeSupport D findMegazordLibraryName(push, 61.0.13
istry push
MediaSessionRecord V Removing dead callback in pushSessionDestroyed, this=com.skybl
AutoPushFeature I New push message decrypted.
push_ffi D push_dispatch_info_for_chid
push_ffi D push_decrypt
push::crypto D No sub-value found for salt
wNzgzMjAmbGltaXQ9MSJ9fQ","scope":"fxa_push_scope_8608e2f71cb54
fxaclient_ffi D fxa_handle_push_message
That said, I'm certain this is not indicative for all devices and Android flavours. To move this bug forward to get more information, I'm going to add extra logging to AC/Fenix and may ask to reproduce the bug with that logging enabled so we can verify if my proposed fix above will actually work.
09-15 08:49:13.813 11549 11575 E GeckoLibLoad: Load nss start
09-15 08:49:13.813 11549 11575 E GeckoLibLoad: Load nss done
09-15 08:49:13.848 11549 11549 D PushConfig: Creating push configuration for autopush.
09-15 08:49:13.881 11549 11549 I App : AutoPushFeature is configured, initializing it...
09-15 08:49:13.883 11549 11549 I AutoPushFeature: Checking validity of push subscriptions.
09-15 08:49:13.890 11549 11605 D AutoPushFeature: Initializing native component with the cached token.
09-15 08:49:13.895 11549 11605 D RustNativeSupport: findMegazordLibraryName(push, 61.0.13
09-15 08:49:13.895 11549 11605 D RustNativeSupport: lib in use: none
09-15 08:49:13.895 11549 11605 D RustNativeSupport: lib configured: megazord
09-15 08:49:13.895 11549 11605 D RustNativeSupport: lib version configured: 61.0.13
09-15 08:49:13.895 11549 11605 D RustNativeSupport: settled on megazord
09-15 08:49:13.923 11549 11590 W [WARNING][Leanplum]: [com.leanplum.internal.ActionManager::getLocationManager::8]: Geofencing support requires leanplum-location module and Google Play Services v8.1 and higher.
09-15 08:49:13.923 11549 11590 W [WARNING][Leanplum]: Add this to your build.gradle file:
09-15 08:49:13.923 11549 11590 W [WARNING][Leanplum]: implementation 'com.google.android.gms:play-services-location:8.3.0+'
09-15 08:49:13.923 11549 11590 W [WARNING][Leanplum]: implementation 'com.leanplum:leanplum-location:+'
09-15 08:49:13.953 11549 11577 D WebExtensionsMiddleware: No engine session for new active tab (27246634-1c13-40c0-bcdc-d7516af9968f)
09-15 08:49:13.966 11549 11590 I [INFO][Leanplum]: [com.leanplum.monitoring.ExceptionHandler::setContext::6]: LeanplumExceptionHandler could not initialize Exception Reporting.This is expected if you have not included the leanplum-monitoring module
09-15 08:49:14.023 11549 11613 E App : Native push layer is not yet initialized.
09-15 08:49:14.031 11549 11613 I [INFO][Leanplum]: [com.leanplum.LeanplumPushFirebaseMessagingService::onMessageReceived::4]: Received: {con=aesgcm,
I just noticed from the original logs that:
tid for the initialized component is different from the tid where we get the message.This is beginning to look like an async bug and we're not using our singleton correctly.
My suspicion has now moved to the FirebasePushService in Fenix instead, where we setup the daisy-chaining of Leanplum with AutoPush observers.
The FirebaseMessagingService invokes the onMessageReceived in it's own thread - we know this. So when our singleton is accessed, from the AbstractFirebasePushService, we should by in a ready state to decrypt messages because the logs say we are initialized..
@grigoryk keenly noticed that the isInitialized call check was not wrapped in a synchronized-block like the rest which may be the cause!
I'll put up a fix for this with https://github.com/mozilla-mobile/android-components/issues/8531 and spin a new A-C nightly to ask @rfk if he can reproduce this with that fix.
@jonalmeida is this on nightly yet?
Not yet, no. We had build issues for a few days, I'm going to wait for the next Nighty and verify the git SHA in AC has our change in the history.
I double checked and I see the current Fenix Nightly is still a week old version of AC.
A new nightly is out! @rfk if you could re-test with the latest Fenix (debug variant of master works too), that would be grand. 🙂
@rfk if you could re-test with the latest Fenix (debug variant of master works too), that would be grand
Thanks, looks like it's now working correctly for me! As you mention above, I had to wait a couple of minutes for things to start up, but the tab arrived without any intervention on my part.
Thanks to @grigoryk for finding the bug!
Most helpful comment
Thanks, looks like it's now working correctly for me! As you mention above, I had to wait a couple of minutes for things to start up, but the tab arrived without any intervention on my part.