Mpv: Lavrresample always performs resampling process, leading to non-bitexact output

Created on 5 Aug 2017 · 19 Comments · Source: mpv-player/mpv

mpv version and platform

mpv git 56742ec
MacOS 10.12.6 (16G29)

Reproduction steps

Problem 1

Play this DTS-HD MA sample with -ao=pcm -no-config -ao-pcm-waveheader=yes -ao-pcm-file=/Volumes/RamDisk/dts_hd_ma_mpv.wav
Convert the sample to FLAC with FFmpeg (ffmpeg -i 16_48_2.0.dtshd -acodec flac dts_ha_ma_ffmpeg.flac)
Compare WAV (generated by mpv) with FLAC (generated by FFmpeg) in Adobe Audition

Problem 2
Play this 8 channels FLAC sample with -ao=pcm -no-config -ao-pcm-waveheader=yes -audio-channels=stereo -ao-pcm-file=/Volumes/RamDisk/stereo_downmix_mpv.wav.

Expected behavior

Problem 1
In Adobe Audition, Amplitude Statistics should be identical.

Problem 2
The original FLAC is 24-bit. mpv should also produce 24-bit WAV.

Actual behavior

Problem 1
Amplitude Statistics are different.

WAV (mpv)

                              Left        Right
Peak Amplitude:               0.00 dB     0.00 dB
True Peak Amplitude:          0.75 dBTP   0.37 dBTP
Maximum Sample Value:         32767       32767
Minimum Sample Value:         -32768      -32768
Possibly Clipped Samples:     471         2767
Total RMS Amplitude:          -11.66 dB   -10.73 dB
Maximum RMS Amplitude:        -5.74 dB    -4.15 dB
Minimum RMS Amplitude:        -57.47 dB   -71.95 dB
Average RMS Amplitude:        -15.03 dB   -16.66 dB
DC Offset:                    -1.05 %     -0.80 %
Measured Bit Depth:           16          16
Dynamic Range:                51.74 dB    67.80 dB
Dynamic Range Used:           50.90 dB    63.55 dB
Loudness (Legacy):            -11.05 dB   -6.16 dB
Perceived Loudness (Legacy):  -5.13 dB    -3.98 dB
ITU-R BS.1770-3 Loudness:     -8.21 LUFS

FLAC (FFmpeg)

                              Channel 1   Channel 2
Peak Amplitude:               0.00 dB     0.00 dB
True Peak Amplitude:          0.73 dBTP   0.69 dBTP
Maximum Sample Value:         32767       32767
Minimum Sample Value:         -32768      -32768
Possibly Clipped Samples:     806         4520
Total RMS Amplitude:          -11.67 dB   -10.74 dB
Maximum RMS Amplitude:        -5.71 dB    -4.14 dB
Minimum RMS Amplitude:        -57.61 dB   -70.44 dB
Average RMS Amplitude:        -15.10 dB   -16.72 dB
DC Offset:                    -1.05 %     -0.80 %
Measured Bit Depth:           16          16
Dynamic Range:                51.90 dB    66.30 dB
Dynamic Range Used:           51.10 dB    62.70 dB
Loudness (Legacy):            -10.57 dB   -6.20 dB
Perceived Loudness (Legacy):  -5.14 dB    -4.03 dB
ITU-R BS.1770-3 Loudness:     -8.33 LUFS

Problem 2
The original FLAC is 24-bit, but mpv produces a 32-bit WAV.

Log file

Problem 1
Log file: http://sprunge.us/CBDC

Problem 2
Log: http://sprunge.us/VLER

Sample files

Problem 1
DTS-HD MA sample

Problem 2
8 channels FLAC sample

Source

macdavis

Most helpful comment

@roberth1990
I use af=lavfi="aformat=sample_fmts=s16|s32:channel_layouts=stereo" . Note that during downmixing, libavfilter adds extra headroom by default to prevent clipping, so the output will sound quieter.

macdavis on 11 Aug 2017

❤2

All 19 comments

There appear to be two problems:

1) mpv cuts off the first 1024 samples of the dtshd sample
2) even after offsetting for this, the signal is still not lossless

The output of --ao=lavc is identical to that of --ao=pcm, so it doesn't look like the problem is in writing the file.
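The two observations above (a missing lead-in, then residual differences) can be separated by comparing the sample streams at a fixed offset. The following is a minimal sketch, assuming both files have already been decoded into plain lists of integer samples; `first_mismatch` and the toy data are hypothetical and are not part of mpv or FFmpeg.

```python
def first_mismatch(a, b, offset=0):
    """Compare a[i] with b[i + offset]; return the first differing index in a, or None."""
    n = min(len(a), len(b) - offset)
    for i in range(n):
        if a[i] != b[i + offset]:
            return i
    return None

# Toy stand-ins: the "reference" decode and an output missing its first 3 samples.
reference = list(range(-5, 5))
trimmed = reference[3:]

# With the right offset the streams match; without it they differ immediately.
aligned_ok = first_mismatch(trimmed, reference, offset=3) is None
unaligned_ok = first_mismatch(trimmed, reference, offset=0) is None
print(aligned_ok, unaligned_ok)
```

For the case described here the offset would be 1024 samples per channel; if a mismatch is still reported after alignment, the difference lies in the sample values themselves rather than in the trimming.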

kevmitch on 6 Aug 2017

@kevmitch

Thanks for testing. Could you please also try this sample (24/96/7.1, <DTS_DELAY> = 0)? Perhaps the cut-off is due to the file itself: there is a <DTS_DELAY> : 1024 metadata entry in the original sample. The sample uploaded here should have no delay (Foobar 2000 shows <DTS_DELAY> : 0). Playing this 24-bit DTS file results in a 32-bit WAV (instead of 24-bit).

Also tried TrueHD: no problem. (The Amplitude Statistics show that the WAV is 24-bit and bitexact.)

macdavis on 6 Aug 2017

Tried the second sample on Lin

mpv 0.26.0-95-ga680c643e-dirty (C) 2000-2017 mpv/MPlayer/mplayer2 projects
 built on Sat Aug  5 22:24:53 MSK 2017
ffmpeg library versions:
   libavutil       55.68.100
   libavcodec      57.102.100
   libavformat     57.76.100
   libswscale      4.7.101
   libavfilter     6.95.100
   libswresample   2.8.100
ffmpeg version: N-86848-g03a9e6ff30

mpv's WAV https://0x0.st/dHD.wav
ffmpeg's FLAC https://0x0.st/dHk.flac
Maybe it can help somehow.
Unfortunately, I don't have any Adobe Audition or similar...

kkkrackpot on 6 Aug 2017

Yeah, the 24/96/7.1 sample is now correctly aligned, but the samples still differ. Looking at mpv's audio filter chain, I see that lavrresample is inserted in order to convert from planar (which comes out of ffmpeg's decoder) to interleaved for output. There's no reason why that shouldn't be lossless, but maybe something strange is going on.
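To illustrate why a planar-to-interleaved conversion has no reason to be lossy: it only reorders samples. This is a minimal sketch with hypothetical helper names, not the code path used by mpv or libavresample.

```python
def interleave(planes):
    """Turn per-channel sample lists (planar) into one frame-ordered list (interleaved)."""
    return [sample for frame in zip(*planes) for sample in frame]

def deinterleave(samples, channels):
    """Inverse of interleave(): split a frame-ordered list back into channel planes."""
    return [samples[c::channels] for c in range(channels)]

left = [1, 2, 3]
right = [-1, -2, -3]

packed = interleave([left, right])      # L R L R L R ordering
round_trip = deinterleave(packed, 2)    # back to two planes
print(packed, round_trip)
```

Because every sample value is copied unchanged, the round trip returns the original planes exactly; any difference in the output must come from another processing step.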

I guess I'll have to try dumping the raw data at various stages in the code to see where it's getting altered.

kevmitch on 6 Aug 2017

@kevmitch It's quite possible that mpv's planar to interleaved conversion is not lossless.
Here is what happens

I converted this FLAC sample to NUT (pcm_s16le_planar) and NUT (pcm_s16le). Playing pcm_s16le (Log: http://sprunge.us/ACXc) is bitexact while playing pcm_s16le_planar (Log: http://sprunge.us/XJIX) is not.

Update:
Workaround: Using af=lavfi="aformat=sample_fmts=s16|s32:channel_layouts=stereo"

macdavis on 6 Aug 2017

If I change the lavrresample option cutoff, the output changes. This suggests that the signal is needlessly getting resampled to the same rate. I'll have to look in the ffmpeg code to see how they manage to deplanarize without resampling.

kevmitch on 6 Aug 2017

Audiophiles will kill someone for it...

@macdavis Does your workaround fix the issue? Does it work with multichannel too? In my system I send multichannel PCM to a soundbar that seems to accept (almost) all formats.

kkkrackpot on 10 Aug 2017

@fhlfibh
You can first check mpv's log. If there is no Lavrresample inserted anywhere, you don't need this workaround because mpv's output is still bitexact.
Yes, the workaround fixes this issue and works with multichannel as well. If your hardware only accepts interleaved format, just use af=lavfi="aformat=sample_fmts=s16|s32" for multichannel. In my case, my hardware only accepts stereo and interleaved format, so I need both format conversion and downmixing done by libavfilter to bypass mpv's internal conversion.

macdavis on 11 Aug 2017

@macdavis

Could you post your libavfilter configuration?

roberth1990 on 11 Aug 2017

macdavis on 11 Aug 2017

❤2

@macdavis
Thank you! That setup works much better than any other setup I have tried for avoiding distortion/clipping on some difficult audio tracks without losing too much dynamic range.

roberth1990 on 11 Aug 2017

@macdavis Tried your workaround, but it looks like it just added more mess http://sprunge.us/KSeW
It seems on my system it always inserts lavrresample, for one reason or another...
Anyway, it's better to have lavrresample itself working correctly.

kkkrackpot on 12 Aug 2017

I'm not sure what's going on here (and I didn't read most of the issue), but mpv code in af_lavrresample clamps float values to range. This is for making sure non-normalized downmixing does not output out-of-range values, which in turn could lead to unpredictable behavior in AOs. On the other hand, it sounds like floats are not involved?
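As an illustration of the kind of clamping meant here (not the actual af_lavrresample code), the sketch below limits float samples to a nominal range. The range of -1.0 to 1.0 is an assumption based on the usual convention for float audio, and `clamp_sample` is a hypothetical name.

```python
def clamp_sample(x, lo=-1.0, hi=1.0):
    """Limit one float sample to [lo, hi]; the default range is an assumed convention."""
    return max(lo, min(hi, x))

# A non-normalized mix can sum channels past full scale.
mixed = [0.4 + 0.9, -0.7 - 0.6, 0.25]
clamped = [clamp_sample(x) for x in mixed]
print(clamped)
```

Samples already inside the range pass through unchanged, so clamping by itself should not alter integer material that never leaves full scale.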

Also, I guess avresample_set_compensation() might force reinit to resampling unnecessarily.

ghost on 12 Aug 2017

@fhlfibh
The log says the surround configuration of source audio is side left and side right, but that of your hardware is back left and right. That's why lavrresample is inserted ( Remix: 5.1(side) -> 5.1 Fudge: sl-sr -> bl-br)

Try af=lavfi="aformat=sample_fmts=s16|s32:channel_layouts=5.1". That may solve your problem.

macdavis on 12 Aug 2017

@wm4

Also, I guess avresample_set_compensation() might force reinit to resampling unnecessarily.

Yeah, that's the issue I wanna report. Unnecessary resampling process deteriorates the sound quality. Downmixing and planar to interleaved conversion shouldn't have triggered resampling.

On the other hand, it sounds like floats are not involved?

No, it's not about floating point issue. I didn't test floating point.

macdavis on 12 Aug 2017

I would have expected that swr_set_compensation() (what avresample_set_compensation is defined to) does not enable resampling when it's not needed. But I guess swr doesn't agree.

ghost on 12 Aug 2017

Thanks @wm4 that is now bit-exact for --ao=pcm.

@macdavis I see you've altered the original issue to talk about getting s32. You should probably have opened a separate issue for this as significant editing of posts is generally frowned upon since people receive only the initial post via email.

In any case, this is expected since neither ffmpeg nor mpv has an internal representation for packed s24. Instead, s32 is used with the least significant bits set to 0. Unfortunately, there is currently no way for mpv to differentiate between true s32 and s24 in s32, so --ao=pcm just outputs the samples exactly as they're stored. This is still lossless.
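A minimal sketch of the s24-in-s32 representation described above, using plain Python integers. The function names are hypothetical, and the zero-low-byte check is only a heuristic: a genuine s32 stream could also happen to have zero low bytes.

```python
def s24_to_s32(sample):
    """Place a signed 24-bit sample in the upper 24 bits of a 32-bit value."""
    assert -(1 << 23) <= sample < (1 << 23)
    return sample << 8          # low 8 bits become zero

def s32_to_s24(sample):
    """Drop the low 8 bits; exact whenever those bits are zero."""
    return sample >> 8

def looks_like_s24(samples):
    """Heuristic only: True if every sample has its low 8 bits clear."""
    return all((s & 0xFF) == 0 for s in samples)

originals = [0, 1, -1, 8388607, -8388608]
widened = [s24_to_s32(s) for s in originals]
restored = [s32_to_s24(s) for s in widened]
print(restored == originals, looks_like_s24(widened))
```

The round trip is exact, which is why writing such samples as 32-bit WAV loses no information even though the file reports a larger bit depth.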

@roberth1990 what you want is --audio-normalize-downmix=yes. This currently defaults to no in mpv, because people constantly complained that yes was quieter than VLC.
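The trade-off between clipping and loudness can be sketched with a single output channel. The weights below are illustrative only and are not the downmix matrix used by mpv or libswresample; `downmix_left` is a hypothetical helper.

```python
def downmix_left(l, c, normalize):
    """Mix a left and a centre sample into one output sample."""
    coeffs = [1.0, 0.7071]  # illustrative L and C weights, assumed for this sketch
    if normalize:
        total = sum(coeffs)
        coeffs = [k / total for k in coeffs]  # weights now sum to 1
    return coeffs[0] * l + coeffs[1] * c

loud = downmix_left(1.0, 1.0, normalize=False)   # exceeds full scale
safe = downmix_left(1.0, 1.0, normalize=True)    # stays at full scale
quiet_ratio = downmix_left(0.5, 0.0, True) / downmix_left(0.5, 0.0, False)
print(loud, safe, quiet_ratio)
```

With normalization the worst case no longer exceeds full scale, but every input is scaled down by the same factor, which is the "quieter" effect mentioned above.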

kevmitch on 13 Aug 2017

👍1

@kevmitch Thanks for your detailed explanation and sorry for the confusion. Next time, I will open a separate issue instead.

macdavis on 15 Aug 2017

@kevmitch I am a bit confused about the alignment on MacOS.

Core Audio defines that, in the unpacked case, the 24 bits are aligned high within the 4-byte field, so that a parser can treat the value as if it were a 32-bit integer with the lowest (least significant) 8 bits all zero. On disk, the little-endian version of this data format looks like this:
00 LL XX MM
where MM is the most significant byte and LL is the least significant.
A big-endian version of 24-bit PCM audio in 4 bytes looks like this:
MM XX LL 00

On MacOS, 24 bits aligned high format matches mpv's s32 format. However, my DAC's format (Also AO format) is 24 Bit Signed Integer Aligned Low in 32 Bit. For my DAC, are the least significant 8 bits or the most significant 8 bits zeroes? I am worried about inconsistent high/low alignment between mpv and my DAC, meaning discarding 8 bits that contain valid information during truncation, instead of discarding zeroes.
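The byte layouts quoted above can be reproduced with a short sketch. It assumes that "aligned low" means the sample occupies the lower 24 bits of the 32-bit field, which this thread does not confirm; the helper names are hypothetical, and a positive sample is used so that sign extension does not matter.

```python
import struct

def pack_high(sample, little_endian=True):
    """24-bit sample aligned high: value shifted into the upper 24 bits."""
    return struct.pack("<i" if little_endian else ">i", sample << 8)

def pack_low(sample, little_endian=True):
    """24-bit sample aligned low (assumed meaning): value kept in the lower 24 bits."""
    return struct.pack("<i" if little_endian else ">i", sample)

def show(data):
    return " ".join(f"{b:02x}" for b in data)

sample = 0x123456  # MM = 12, XX = 34, LL = 56

high_le = show(pack_high(sample))                       # 00 LL XX MM
high_be = show(pack_high(sample, little_endian=False))  # MM XX LL 00
low_le = show(pack_low(sample))
low_be = show(pack_low(sample, little_endian=False))

# Reading a high-aligned value as if it were low-aligned discards the top byte.
misread = (sample << 8) & 0xFFFFFF
print(high_le, high_be, low_le, low_be, hex(misread))
```

The last line shows the concern raised here: if producer and consumer disagree on the alignment, the byte that gets dropped is a significant one rather than the zero padding.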

macdavis on 8 Jun 2019
