Exoplayer: Add option to enable exact (but inefficient) seeking into variable bitrate MP3s

Created on 19 Dec 2019 · 14 Comments · Source: google/ExoPlayer

Issue description

When seeking to 1603000ms in the mp3 file, playback starts approx. 3 seconds earlier than requested.
Seeking in the same mp3 file using Audacity (or VLC) lands on the expected position (you can hear the expected audio).
I'm not sure it's relevant, but the mp3 was created by extracting the aac stream of an mp4 video file using ffmpeg. (ExoPlayer seeks correctly in the original mp4, but not in the mp3.)

Reproduction steps

  1. Using the ExoPlayer demo app, seek to 1603000ms. (I manually changed the code to do this.)
  2. The expected audio should say "I'll a..." and instead it says "are you asking me..."
    This happens consistently across devices.

Link to test content

A link to the mp3 file was emailed to dev.[email protected].

A full bug report captured from the device

A full bug report was emailed to dev.[email protected]

Version of ExoPlayer being used

2.11.0

Device(s) and version(s) of Android being used

Reproduced on:
OnePlus 6 running Android 10
Google Pixel 3 running Android 10
Virtual device Google Pixel 2 running Android 9
Virtual device Nexus 5X running Android 7.1.1

enhancement

All 14 comments

I get "You are not authorized to download this file." when I try and download the file. Please could you make it available?

I've sent a new link

Thanks! Unless you're going to use a constant bitrate, MP3 is fundamentally not well suited to use cases that require exact seeking. There are two reasons for this:

  1. For exact seeking, a container format will ideally provide a precise time-to-byte mapping in a header. This mapping allows a player to map a requested seek time to the corresponding byte offset, and start requesting/parsing/playing media from that offset. The headers available for specifying this mapping in MP3 are, unfortunately, often imprecise. The sample you've provided uses a XING header, which specifies the mapping for 100 points that have a byte granularity equal to 1/256th of the length of the file in bytes. For your sample, this means a time-to-byte mapping is specified for points approximately 18 seconds apart, and each of these mappings may be off by ~20KB. So the mapping is both quite sparse and limited in accuracy.
  2. For container formats that don't provide a precise time-to-byte mapping (or any time-to-byte mapping at all), it's still possible to perform an exact seek if the container includes absolute sample timestamps in the stream. In this case a player can map the seek time to a best guess of the corresponding byte offset, start requesting media from that offset, parse the first absolute sample timestamp, and effectively perform a guided binary search into the media until it finds the right sample. Unfortunately MP3 does not include absolute sample timestamps in the stream, so this approach is not possible.
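To put rough numbers on point 1, here is a minimal Java sketch of a XING-style table-of-contents lookup. It is an illustration of the general technique, not ExoPlayer's actual code. The class and method names are hypothetical, and the 30-minute duration and 48kbps bitrate are assumptions chosen so that the 100 TOC points land 18 seconds apart. The TOC is the best one a perfectly constant-bitrate file could get, and the lookup is compared against the exact constant-bitrate calculation.

```java
final class XingTocSketch {
    static final int TOC_ENTRIES = 100; // one entry per 1% of the duration
    static final int TOC_SCALE = 256;   // each entry is a byte position in 1/256ths of the file

    /** Builds the TOC an ideal CBR file would get: entry i is floor(256 * i / 100). */
    static int[] idealCbrToc() {
        int[] toc = new int[TOC_ENTRIES];
        for (int i = 0; i < TOC_ENTRIES; i++) {
            toc[i] = TOC_SCALE * i / TOC_ENTRIES;
        }
        return toc;
    }

    /** Maps a seek time to an approximate byte offset by interpolating between TOC entries. */
    static long tocTimeToByte(int[] toc, long timeMs, long durationMs, long dataSizeBytes) {
        long clampedMs = Math.max(0, Math.min(timeMs, durationMs));
        double percent = 100.0 * clampedMs / durationMs;
        int index = (int) percent;
        double scaled;
        if (index >= TOC_ENTRIES) {
            scaled = TOC_SCALE; // seeking to the very end of the stream
        } else {
            double lo = toc[index];
            double hi = (index == TOC_ENTRIES - 1) ? TOC_SCALE : toc[index + 1];
            scaled = lo + (hi - lo) * (percent - index);
        }
        return Math.round(scaled / TOC_SCALE * dataSizeBytes);
    }

    /** Exact byte offset for a stream whose bitrate really is constant. */
    static long cbrTimeToByte(long timeMs, int bitrateBitsPerSecond) {
        return timeMs * bitrateBitsPerSecond / 8 / 1000;
    }

    public static void main(String[] args) {
        int bitrate = 48_000;          // assumed: 48kbps CBR
        long durationMs = 1_800_000L;  // assumed: 30 minutes
        long dataSize = cbrTimeToByte(durationMs, bitrate);
        long seekMs = 1_603_000L;

        long exact = cbrTimeToByte(seekMs, bitrate);
        long estimate = tocTimeToByte(idealCbrToc(), seekMs, durationMs, dataSize);
        long errorMs = (exact - estimate) * 8 * 1000 / bitrate;

        System.out.println("exact byte offset:    " + exact);
        System.out.println("TOC-based estimate:   " + estimate);
        System.out.println("seek lands early by ~" + errorMs + " ms");
    }
}
```

Because every TOC entry is rounded down to a multiple of 1/256th of the file, the estimate in this sketch can fall short of the exact offset by up to one such step, which at a low bitrate corresponds to several seconds of audio.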

Ultimately, this means that the only way to perform an exact seek into this type of MP3 is to scan the entire file and manually build up a time-to-byte mapping in the player. This obviously doesn't scale well to large MP3 files, particularly if the user tries to seek to near the end of the stream shortly after starting playback, which would require the player to wait until it's downloaded and indexed the entire stream before performing the seek. For ExoPlayer we decided to optimize for seeking speed over accuracy in this case.
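The index-building idea can be sketched in Java as follows. This is a simplified illustration under stated assumptions, not ExoPlayer's implementation: it assumes the size of every frame has already been read from its frame header, that every frame carries the same number of samples, and the class name and parameters are hypothetical.

```java
import java.util.Arrays;

final class Mp3IndexSketch {
    private final long[] frameStartUs; // start time of each frame, in microseconds
    private final long[] frameOffset;  // byte offset of each frame in the file

    /**
     * Builds a time-to-byte index from per-frame sizes.
     * frameSizes is assumed to come from a full scan of the frame headers.
     */
    Mp3IndexSketch(int[] frameSizes, int samplesPerFrame, int sampleRate, long firstFrameOffset) {
        frameStartUs = new long[frameSizes.length];
        frameOffset = new long[frameSizes.length];
        long offset = firstFrameOffset;
        for (int i = 0; i < frameSizes.length; i++) {
            frameStartUs[i] = (long) i * samplesPerFrame * 1_000_000L / sampleRate;
            frameOffset[i] = offset;
            offset += frameSizes[i];
        }
    }

    /** Returns the byte offset of the last frame that starts at or before timeUs. */
    long byteOffsetFor(long timeUs) {
        int result = Arrays.binarySearch(frameStartUs, timeUs);
        // A negative result encodes the insertion point; step back to the preceding frame.
        int index = result >= 0 ? result : -(result + 1) - 1;
        index = Math.max(0, index);
        return frameOffset[index];
    }

    public static void main(String[] args) {
        // Three frames of varying size (VBR), 1152 samples each at 48kHz (24ms per frame).
        Mp3IndexSketch index = new Mp3IndexSketch(new int[] {100, 200, 150}, 1152, 48_000, 10);
        System.out.println(index.byteOffsetFor(30_000)); // falls inside the second frame
    }
}
```

The lookup itself is cheap. The cost described above comes from having to read every frame header up to the seek position before the index covers it.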

We do have plans to support exact seeking by building up an index. However, we'll most likely disable this option by default (if we do this, it'll be possible to enable it with a flag). I will keep this issue open to track this enhancement. If you control the media you're playing, I would suggest that you use a more suitable container format (e.g. MP4).

Thanks a lot for your detailed explanation.

Since we are the ones extracting the audio stream from the original mp4 video, we control the media.
If seeking in the streamed audio needs to be both fast and exact, is CBR mp3 better than mp4/m4a here? Are there any considerations regarding the codec?

As a side note - as one who's been working with ExoPlayer in the last 5 years (even had the opportunity to make a small contribution to the project :)) - I think you're doing an incredible job in making our life much easier - keep up the good work!

MP4/M4A is always a better choice. IMO there aren't really any valid use cases for MP3 any more, unless you need to use/support it for legacy reasons.

p.s. Thanks! Happy to help :).

I've inspected the part of our code that extracts the audio using ffmpeg and noticed that we do use CBR at 48kbps. To make sure, I analyzed the mp3 we're talking about and saw that this is indeed the case (see attached screenshot).
What's actually happening here? Is the XING header disrupting the seek calculations even though the mp3 itself is CBR?
[Screenshot: Annotation 2019-12-20 232159]

"What's actually happening here? Is the XING header disrupting the seek calculations even though the mp3 itself is CBR?"

That sounds quite likely.

  1. Should I strip the XING header? (Is an mp3 without a XING header still valid?)
  2. Can the player detect if the mp3 is indeed CBR and ignore the XING header when doing seek calculations? (This would be my preferred option)

Why isn't your preferred option to use a container format that's appropriate for your use case? Even the people who made MP3 don't think you should use it any more.

mp3 is still very popular amongst consumers. However, most state-of-the-art media services such as streaming or TV and radio broadcasting use modern ISO-MPEG codecs such as the AAC family or in the future MPEG-H. Those technologies, that have been developed with major contributions from Fraunhofer IIS, can deliver more features and a higher audio quality at much lower bitrates compared to mp3

My understanding of XING headers is that they're only for VBR content, so if your file is CBR I'm not sure why it ended up with a XING header in the first place (if you search for "XING header", most references on the internet suggest that they're only used for VBR content). So yes, if you can generate the CBR MP3 without the XING header, I would expect that to work. We don't support your second suggestion.

You're right - using a different container is probably the right approach.
My second suggestion is just an optimization/work-around that would fit my case exactly - but I completely understand if it doesn't seem justified as a general approach.

I guess we'll probably need to go over all of our already generated mp3 files (there are a lot) and perform some kind of adjustment - removing the XING header or transcoding to a different container - and change the way we generate new mp3 files.

Thanks for the tip in the right direction.

Just a note on the competitive front - iPhone's AVPlayer seeking is precise on the same mp3 - so it probably ignores the XING header in this case.

This is supported in the dev-v2 branch via a new Mp3Extractor.FLAG_ENABLE_INDEX_SEEKING flag. This will be included in 2.13.0.

@ojw28 how do I enable or use this flag?
Is there some initializer where I can pass it?
Thanks

It is enabled by using FLAG_ENABLE_INDEX_SEEKING, which can be set on a DefaultExtractorsFactory using setMp3ExtractorFlags.
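As a rough illustration of the answer above, the flag could be wired into a player along these lines. This is a sketch, not official documentation: it assumes an Android project that depends on an ExoPlayer release containing the flag, the class name and user agent string are hypothetical, and the surrounding player setup uses 2.12-era API names that may differ in other versions.

```java
import android.content.Context;
import android.net.Uri;
import com.google.android.exoplayer2.MediaItem;
import com.google.android.exoplayer2.SimpleExoPlayer;
import com.google.android.exoplayer2.extractor.DefaultExtractorsFactory;
import com.google.android.exoplayer2.extractor.mp3.Mp3Extractor;
import com.google.android.exoplayer2.source.MediaSource;
import com.google.android.exoplayer2.source.ProgressiveMediaSource;
import com.google.android.exoplayer2.upstream.DefaultDataSourceFactory;

/** Hypothetical helper that builds a player with MP3 index seeking enabled. */
final class IndexSeekingPlayerFactory {
    static SimpleExoPlayer create(Context context, Uri mp3Uri) {
        // Set the flag on the extractors factory, as described in the comment above.
        DefaultExtractorsFactory extractorsFactory =
            new DefaultExtractorsFactory()
                .setMp3ExtractorFlags(Mp3Extractor.FLAG_ENABLE_INDEX_SEEKING);

        // Pass the configured extractors factory to the media source factory.
        MediaSource mediaSource =
            new ProgressiveMediaSource.Factory(
                    new DefaultDataSourceFactory(context, "index-seeking-demo"),
                    extractorsFactory)
                .createMediaSource(MediaItem.fromUri(mp3Uri));

        SimpleExoPlayer player = new SimpleExoPlayer.Builder(context).build();
        player.setMediaSource(mediaSource);
        player.prepare();
        return player;
    }
}
```

With index seeking enabled, a seek far into a long stream may take noticeably longer, for the reasons given earlier in the thread.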

The documentation has not been updated yet as this functionality has not been released yet.
