Hi,
since updating ExoPlayer from 2.4.4 to 2.8.1, we noticed a significant amount of crashes on Samsung devices with Android 7.0. Unfortunately, we haven't been able to reproduce the crash ourselves - we are aware of the issue only through Crashlytics reports.
Therefore, this bug report certainly lacks some information, but perhaps you have an idea (or somebody has encountered a similar problem) anyway. Even some explanation of possible reasons for this crash might help a lot.
From the logs, we know that the crash occurs after stopping a stream (using SimpleExoPlayer.stop()). The player state becomes idle (onPlayerStateChanged, playWhenReady=true playbackState=1) and the DRM session is released. It seems to happen only for DRM-protected streams, and it does not happen every time a stream is stopped.
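For context, the state log above comes from a listener roughly like the following (an illustrative sketch rather than our exact code; TAG and the player field are assumed, along with the usual android.util.Log and com.google.android.exoplayer2.Player imports):
player.addListener(new Player.DefaultEventListener() {
  @Override
  public void onPlayerStateChanged(boolean playWhenReady, int playbackState) {
    // After SimpleExoPlayer.stop(), playbackState transitions to Player.STATE_IDLE (1).
    Log.d(TAG, "onPlayerStateChanged, playWhenReady=" + playWhenReady
        + " playbackState=" + playbackState);
  }
});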
It might be worth noting that we are using a single ExoPlayer instance in several fragments within a ViewPager - users can switch streams by navigating within the ViewPager. However, the crash occurs not only when swiping left/right, but also when the user is stopping the stream in another way (e.g. by sending the app to the background).
I will email the link to dev.[email protected] using a subject in the format "Issue #1234".
2.8.1 - we will update to 2.8.2 soon, but from the release notes, I don't think that this will fix our problem.
Android 7.0
only Samsung devices - most frequent devices:
Galaxy Tab A 10.1
SM-J530F
SM-J730F
Galaxy J7(2016)
As mentioned above, I am unable to create a full bug report. This is the stack trace from the Crashlytics crashes:
Fatal Exception: java.lang.IllegalArgumentException
at com.google.android.exoplayer2.util.Assertions.checkArgument(SourceFile:39)
at com.google.android.exoplayer2.upstream.DefaultAllocator.release(SourceFile:121)
at com.google.android.exoplayer2.source.SampleQueue.clearAllocationNodes(SourceFile:609)
at com.google.android.exoplayer2.source.SampleQueue.reset(SourceFile:111)
at com.google.android.exoplayer2.source.SampleQueue.reset(SourceFile:98)
at com.google.android.exoplayer2.source.chunk.ChunkSampleStream.onLoaderReleased(SourceFile:324)
at com.google.android.exoplayer2.upstream.Loader$ReleaseTask.run(SourceFile:434)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
at java.lang.Thread.run(Thread.java:762)
I'm sorry for the sparse information here - hopefully you can support me anyway.
That looks strange. It's also strange that we've not seen this reported by anyone else, unless it's somehow also specific to your application. Do you have any custom components that you're injecting into ExoPlayer, or are you using only components provided by the library?
Thanks for the quick answer! The fact that you haven't heard this from anyone else certainly points towards an issue in our application.
We are using a custom DrmSessionManager. In our application, rapid zapping between different streams (most of them DRM protected with the same init data) is crucial. Therefore, we have implemented a DrmSessionManager that uses a DefaultDrmSessionManager as a delegate. When a DRM session with a specific UUID is acquired for the first time, our DrmSessionManager acquires an additional DRM session (using the respective method in DefaultDrmSessionManager). Additionally, we have a method to release all of these artificial DRM sessions, which is called in our Activity's onPause(). This makes stream switching a lot faster, since the key request is only made when the first stream starts: the artificial DRM session is kept open throughout the lifecycle of the app.
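Roughly, the idea looks like the following (a simplified sketch rather than our actual code; the class and field names are made up, and it assumes the 2.8.x DrmSessionManager interface):
import android.os.Looper;
import com.google.android.exoplayer2.drm.DefaultDrmSessionManager;
import com.google.android.exoplayer2.drm.DrmInitData;
import com.google.android.exoplayer2.drm.DrmSession;
import com.google.android.exoplayer2.drm.DrmSessionManager;
import com.google.android.exoplayer2.drm.FrameworkMediaCrypto;
import java.util.HashMap;
import java.util.Map;

/** Keeps an extra "artificial" DRM session open per init data to speed up zapping. */
final class ZappingDrmSessionManager implements DrmSessionManager<FrameworkMediaCrypto> {

  private final DefaultDrmSessionManager<FrameworkMediaCrypto> delegate;
  private final Map<DrmInitData, DrmSession<FrameworkMediaCrypto>> artificialSessions =
      new HashMap<>();

  ZappingDrmSessionManager(DefaultDrmSessionManager<FrameworkMediaCrypto> delegate) {
    this.delegate = delegate;
  }

  @Override
  public boolean canAcquireSession(DrmInitData drmInitData) {
    return delegate.canAcquireSession(drmInitData);
  }

  @Override
  public DrmSession<FrameworkMediaCrypto> acquireSession(
      Looper playbackLooper, DrmInitData drmInitData) {
    // On the first acquisition for this init data, acquire an additional session that is
    // kept open across stream switches, so the license stays cached.
    if (!artificialSessions.containsKey(drmInitData)) {
      artificialSessions.put(drmInitData, delegate.acquireSession(playbackLooper, drmInitData));
    }
    return delegate.acquireSession(playbackLooper, drmInitData);
  }

  @Override
  public void releaseSession(DrmSession<FrameworkMediaCrypto> drmSession) {
    delegate.releaseSession(drmSession);
  }

  /** Releases all artificial sessions; called from the Activity's onPause(). */
  void releaseArtificialSessions() {
    for (DrmSession<FrameworkMediaCrypto> session : artificialSessions.values()) {
      delegate.releaseSession(session);
    }
    artificialSessions.clear();
  }
}
The real implementation also has to deal with threading and with not over-releasing sessions; the sketch only shows the gist.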
I'm aware that this is somewhat experimental, but with most devices (except the ones mentioned above), everything works fine.
Apart from that, we recently also implemented the dummy surface workaround discussed in https://github.com/google/ExoPlayer/issues/3835#issuecomment-390646933. However, we saw the crashes before implementing this workaround, so the issue should not be related to those changes.
Do you have any customizations to the allocator? Are you using the standard DashMediaSource without doing anything special to it, or do you have customizations in the source as well? If so, those would be the most likely culprit.
Could you also check whether the reports you're getting are from official Samsung builds (vs some random ROM that a few users have flashed onto their devices)? Some build fingerprints would probably be helpful. Are the crashes just from these four devices?
We have no customizations to the allocator and we are using the standard DashMediaSource:
new DashMediaSource(
    uri,
    buildDataSourceFactory(false),
    new DashManifestParser(),
    new DefaultDashChunkSource.Factory(mediaDataSourceFactory),
    DEFAULT_MIN_LOADABLE_RETRY_COUNT,
    LIVE_PRESENTATION_DELAY_MS,
    mainHandler,
    eventLogger);
The crash occurs only on Samsung devices with Android 7.0, but the ones I mentioned above are only the most frequent devices:
Galaxy Tab A 10.1 43%
SM-J530F 8%
SM-J730F 6%
Galaxy J7(2016) 3%
SM-A320FL 1%
SM-J701F 1%
SM-G570F 1%
Everything else seems to be below 1%.
I am quite sure that the crashes are from official builds because
Anyway, I will try to get some build fingerprints - this might be helpful for other issues as well. However, it will take some time until the next app version is released to the general public, so I will most likely get build fingerprint information in about 4 weeks.
What do the percentages mean? Percentage of what? Thanks!
Sorry, should have mentioned that.
The percentages refer to the total number of crashes we had. We had ~500 crashes (~300 users) in the last 30 days, of which 100% were on Samsung devices with Android 7.0. Of those 500 crashes, 43% happened on the Galaxy Tab A etc.
Can you derive an (approximate) percentage of playbacks that result in the crash on the affected devices? In particular, do you think this crash occurs for every playback on the affected devices, or sporadically?
You say you've been unable to reproduce the crash yourselves. Is this because you don't have one of the affected devices, or is it the case that you do have one of the affected devices, and are still unable to reproduce? If the latter, have you tried to reproduce with your proper release apk (e.g. install from Play Store, as your end users would), rather than just with a debug apk?
Unfortunately, I cannot derive a percentage of playbacks that result in the crash, but in the logs, I saw that not every playback leads to a crash on the affected devices. For some users, I have seen several "zapping" events (= playback stop + playback start of another stream) before the crash occurred.
I haven't been able to reproduce the crash because I do not have an affected device. I'm trying to get a Galaxy Tab A 10.1, but buying it (or using a "real device testing" service) needs approval, of course...
Judging by the numbers, I'm optimistic that I will be able to reproduce the crash on the device.
Good point regarding the release APK! I usually try the APK from the store first before installing the debug version in order to dig deeper.
I have known this exception since version 1.5.8. In the old ExoPlayer I changed ASSERTIONS_ENABLED to false to get rid of the crash, but now I'm using 2.8.2 and the same problem still appears on Samsung devices with OS 7.0. ASSERTIONS_ENABLED is public static final in ExoPlayerLibraryInfo and cannot be modified this time, because I'm using 2.8.2 as a Gradle dependency.
The device model distribution varies with each app's user base. In our case, 9 different Samsung devices with OS 7.0 are hitting this exception. I couldn't reproduce the crash yet, but around 90 crashes happened within a week, even though only about 10% of our total play sessions are on 2.8.2. I'm worried about how our app, which has over 20K real-time play sessions, will be affected.
By the way, in my experience it is related neither to whether the media is DRM protected nor to whether the content type is HLS or DASH. In my Crashlytics reports I set some keys describing the content, and all possible combinations appear (DASH and HLS, with and without DRM).
Here are my rendererFactory, allocator, loadControl and player instance:
DefaultRenderersFactory rendererFactory = new DefaultRenderersFactory(this, DefaultRenderersFactory.EXTENSION_RENDERER_MODE_OFF);
DefaultAllocator allocator = new DefaultAllocator(true, C.DEFAULT_BUFFER_SEGMENT_SIZE);
DefaultLoadControl loadControl = new DefaultLoadControl.Builder()
.setAllocator(allocator)
// minBufferMs, maxBufferMs, bufferForPlaybackMs, bufferForPlaybackAfterRebufferMs
.setBufferDurationsMs(360000, 600000, 2500, 500)
.setTargetBufferBytes(C.LENGTH_UNSET)
.setPrioritizeTimeOverSizeThresholds(true)
.createDefaultLoadControl();
player = ExoPlayerFactory.newSimpleInstance(rendererFactory, trackSelector, loadControl, drmSessionManager);
If you need any more information from me, let me know.
@OMArikan Thanks for your input! It's good to know that the problem occurs for others as well.
@OMArikan Please can you provide some sample stack traces, so we can double check it's exactly the same issue.
@ojw28 I sent three sample stack trace files from Crashlytics to dev.[email protected] with the subject "Issue #4532". I also recently sent a screenshot from the Crashlytics dashboard showing that only Samsung devices with OS 7.0 are affected.
Thanks. If either of you manage to reproduce the issue locally, it would be very useful to know if it reproduces only with your release build (but not with your dev build), and whether enabling/disabling ProGuard makes a difference. I suspect this might be a strange VM/ProGuard issue, but that's just a guess.
I tried many times to reproduce the issue locally with the same devices, but never caught it. I don't believe it's related to my app's ProGuard configuration, unless it's somehow related to third-party SDKs such as Retrofit, OkHttp, Fabric, or Adjust.
I just wanted to pitch in and say that we've seen this issue in our app as well. I've got nothing new that I can think of to add to this, will be sending some stack traces to the gmail address.
The crash occurs on a very small percentage of playbacks, and we have not been able to reproduce it ourselves (although we have not made much effort using the production/ProGuarded build).
These are the devices that show up in our crash logs:
SM-T580 (NRD90M.T580XXU3BRF1)
SM-T585 (NRD90M.T585XXU3BRF1)
SM-J530F (NRD90M.J530FXXU2ARC3)
One thing, and I'm just spitballing here:
From my understanding, we have a similar UX to @jschamburger's: the user can switch streams using a ViewPager. However, we only use one fragment with one PlayerView, and we create a new instance of SimpleExoPlayer for each new stream, binding it to that single view.
What I wanted to point out is that we could be releasing and creating new player instances in rapid succession...
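To illustrate the pattern (a rough sketch with made-up field and method names, not our production code; it assumes context, playerView and player fields):
// A single PlayerView; a new SimpleExoPlayer is created for each stream.
private void switchToStream(MediaSource mediaSource) {
  if (player != null) {
    player.release(); // release the previous instance before creating a new one
  }
  player = ExoPlayerFactory.newSimpleInstance(context, new DefaultTrackSelector());
  playerView.setPlayer(player);
  player.prepare(mediaSource);
  player.setPlayWhenReady(true);
}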
Interesting, although intuitively I wouldn't ever expect the design of the UI to result in a seemingly device-specific assertion failure in this part of the player. As an initial step, what we'll do is add more information to the exception that's thrown (in the exception message). If this is a random VM/ProGuard issue, then it's possible that doing this might just make the issue go away. If it doesn't go away, we'll at least have a little bit more information about the failure.
Unlike the @nicklasl and @jschamburger integrations, I don't have any pager UI. SimpleExoPlayer is implemented as in the demo app, releasing and preparing on each action. I also don't use PlayerView in my app; to customize everything, I use my own SurfaceView and a custom controller anchored over it, as in version 1.5.0. I'm letting you know in case it helps you find something different.
Also, we have already reached 630 users with 756 crashes related to this issue.
What percentage of your total active users is 630? Absolute numbers really don't tell us anything about how prevalent this issue is.
For us, approximately 0.16% of the users had the crash, half of them on the Galaxy Tab A 10.1.
Thanks. There's a Galaxy Tab A 10.1 in the office, so I'll try and get hold of it next week.
In the meantime, we've cherry-picked the more informative exception message into 2.8.4, which will be released soon. Once someone has captured a failure with this release, please share the message that you see! Note that the change will result in a slightly different stack trace (it'll be the same down to DefaultAllocator.release, but will no longer have the Assertions.checkArgument line).
Hopefully you'll manage to reproduce it with this device - we got one as well (an SM-T580) in the meantime, but haven't been lucky so far.
@ojw28 Since the releases of my app that include ExoPlayer 2.8.2:
645K Total Users
415K Samsung Users
189K Samsung 7.0 Users
650 Users faced with crash due to Assertions.checkArgument
Ratio of crash for that user specification : 0.34%
The SM-G610F accounts for 43% of these devices.
I also noticed this crash does not appear under GA > Behavior > Crashes and Exceptions.
As soon as 2.8.4 is released, I'll update my app quickly and report back here.
I tested on a Galaxy Tab A 10.1, but was unable to reproduce the issue. Would you mind linking to your application(s) on Play Store? I'll try installing them to see if I can reproduce with those.
@ojw28 BluTV. However, it's currently built with ExoPlayer 2.8.2; the newest version with 2.8.4 is already integrated into the application but not distributed yet. You can check the last update date to spot the new version later next week.
@ojw28 waipu.tv. The current version still uses ExoPlayer 2.8.2 - the upcoming version 3.7.0 will use ExoPlayer 2.8.4 (to be released in ~2 weeks).
Unfortunately, the app is only usable inside the EU - in other countries, the streams will not start. In case you are outside the EU, please let me know, so we can try to create a working setup for testing.
We just released Com Hem Play v4.3.5 which has ExoPlayer 2.8.4. No crashes reported yet though.
How frequently were you seeing reports before (i.e. how long would you have to wait before you'd expect to see a crash report, if the problem still remains)?
@ojw28 I would suspect we will see a crash before week's end. I'll post here when I have more information.
@ojw28, got the first crash:
Android: 7.0
Android Build: NRD90M.T580XXU3BRF1
Manufacturer: samsung
Model: SM-T580
Thread: Loader:ChunkSampleStream-13144
CrashReporter Key: F05C6376-3715-4373-8165-A4B1A2A44A853D08E23C
Start Date: Tue Sep 04 15:10:00 GMT+02:00 2018
Date: Tue Sep 04 17:21:44 GMT+02:00 2018
java.lang.IllegalArgumentException: Unexpected allocation: 12719672, 0, 0, 65536
at com.google.android.exoplayer2.upstream.DefaultAllocator.release(DefaultAllocator.java)
at com.google.android.exoplayer2.source.SampleQueue.clearAllocationNodes(SampleQueue.java)
at com.google.android.exoplayer2.source.SampleQueue.reset(SampleQueue.java)
at com.google.android.exoplayer2.source.SampleQueue.reset(SampleQueue.java)
at com.google.android.exoplayer2.source.chunk.ChunkSampleStream.onLoaderReleased(ChunkSampleStream.java)
at com.google.android.exoplayer2.upstream.Loader$ReleaseTask.run(Loader.java)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
at java.lang.Thread.run(Thread.java:762)
Thanks! This shows that the check is failing because allocation.data.length == 0, whereas we expect it to be 65536. I still think this is a platform/VM issue, probably at a fairly low level, because:
There are only two places in the code where we create an Allocation, here and here. This is easy to verify if you have the project checked out in Android Studio.
The first of those doesn't apply here, as can be seen from the IllegalArgumentException logging that an initial allocation hasn't been made.
In the second case, the allocation's data array is created with length individualAllocationSize, which must be greater than 0 due to this assertion.
So however we end up with an Allocation, it must be true that data.length > 0. This implies that data.length evaluating to 0 at the point where the IllegalArgumentException is thrown is incorrect.
I'm discussing next steps with the Android VM team, but without a way to reproduce the problem it might be a little tricky to figure out exactly what's going wrong. If it is a VM issue, it's likely caused by something Samsung have done specifically for their devices, because this thread implies it doesn't reproduce anywhere else.
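For anyone following along, the failing check is roughly of this form (paraphrased rather than quoted from the source):
// Paraphrased: a released Allocation must either point into the initial allocation block,
// or be an individually allocated buffer of exactly individualAllocationSize bytes
// (65536 when C.DEFAULT_BUFFER_SEGMENT_SIZE is used).
Assertions.checkArgument(
    allocation.data == initialAllocationBlock
        || allocation.data.length == individualAllocationSize);
In the report above there is no initial allocation block, so the only way the check can pass is if data.length == 65536.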
Took a while for us to release the version with ExoPlayer 2.8.4, but now I have seen several occurrences of the crash. The values differ quite a bit - the length is sometimes 0, but quite often it is a large negative number:
0, 0, 0, 65536
200894819, 0, 0, 65536
93042059, 0, -1249135221, 65536
8224893, 0, -2139258755, 65536
111515010, 0, -1549490778, 65536
45975423, 0, 1, 65536
8421504, 0, -2139062144, 65536
This is definitely a platform/VM bug then. I don't think array.length can ever be negative if the VM is functioning correctly, right?
For those encountering this issue: do you have any native code in your app? One theory is the following:
The bad length is presumably in a large object allocation, which could be preceded by native memory, and thus also could conceivably have been scribbled over by someone else?
We do not have any native code in the project.
@ojw28, I have just replied with new stack traces to the old email thread with the issue number in the subject.
The number of crashes is lower than before, but I can't say the crash ratio has decreased, judging by the plays/crashes ratio.
And yes, I have some native code, but it just holds constants to keep secrets really secret :)
I totally agree with you that this is a device-specific problem related to the VM. However, what if we could change ASSERTIONS_ENABLED to false - would that cause problems further down the line or not? When I was using 1.5.8, I always changed it to false and never saw any problem with the player. Either opening an issue with Samsung or letting us change the value of ASSERTIONS_ENABLED could help us quickly.
Is it possible for the developers to evaluate the risk of making it non-final?
When I was using 1.5.8, I always changed it to false and never saw any problem with the player.
A question I'd ask about this is: how closely were you looking? If playback failed but the app didn't crash, would you have noticed?
Why do you have AnalyticsListener, or let developers listen to the player state, then?
Changing it to false would be the developer's responsibility; they could handle all failures themselves, otherwise it's their problem, not ExoPlayer's.
Since 1.5.8 I have been sending all EventLogger events to Google Analytics and to our big data side to analyze problems, and we fixed some of them on our head-end side (related to the encoder & transcoder or manifest standards). Some device-related DRM errors we faced in our app were also sent to Google, and they found a problem on their side. These errors neither disappeared nor decreased with ASSERTIONS_ENABLED disabled. For us, the only change we have noticed is this crash on Samsung 7.0 devices. Because of this I asked whether you know of any other risk in disabling it. Don't you have any test automation to compare the differences between ASSERTIONS_ENABLED true and false?
Sorry, but I don't understand what the post above is trying to say. Please try and rephrase if possible.
We can watch all of the error logs via AnalyticsListener and DefaultEventListener. If a developer disables the ASSERTIONS_ENABLED property in the code, he/she has to take responsibility for the consequences.
Since version 1.5.8, I have been storing and checking all of the EventLogger logs. So far, with these logs we have learnt about some DRM and transcoding problems. For the DRM problem we informed Google, and they acknowledged an issue in Widevine. In that period ASSERTIONS_ENABLED was always false. After the 2.x upgrade I started using ExoPlayer from Maven, so I couldn't change ASSERTIONS_ENABLED anymore, and the only problem since then is the Samsung 7.0 problem.
If you have any information about this problem that you can share with me, that would be good. Do you have any test results comparing ASSERTIONS_ENABLED false vs. true?
If you could allow developers to change this property, it would be helpful. I would prefer to serve lower quality rather than have the application crash; I can understand and fix issues better that way.
Regarding ASSERTIONS_ENABLED: We only assert things that should always be true. In most cases disabling an assertion will result in a subsequent failure shortly thereafter, which is often much harder to debug (e.g. because it occurs on a different thread to where an API was originally misused). Hence it's clearly preferable to leave them enabled.
For the particular assertion being discussed here, it's pretty unclear to me whether a subsequent failure would occur. I think we should probably just make sure we're catching it and propagating it as a playback error, rather than crashing the process.
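For context, the assertion helpers are gated on that flag roughly like this (a simplified sketch of the relevant code path):
// Simplified from com.google.android.exoplayer2.util.Assertions. ASSERTIONS_ENABLED is a
// public static final field in ExoPlayerLibraryInfo, so it cannot simply be changed when
// consuming ExoPlayer as a prebuilt Gradle dependency - you would have to build from source.
public static void checkArgument(boolean expression) {
  if (ExoPlayerLibraryInfo.ASSERTIONS_ENABLED && !expression) {
    throw new IllegalArgumentException();
  }
}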
Actually, we're just going to remove the assertion in question. It's not a particularly useful one. What happens to playbacks on the affected devices once the assertion is removed, we don't really know.
Removal of the assertion will be in the next release.