I've been watching talk-heavy videos on YouTube, mostly at x1.5 or x2 speed. Then I tried watching the same videos in mpv at the same speed, and noticed that YouTube's scaler produces a much clearer result. There are no distortions, and you can clearly understand what is being said.
Then I read what the man page says about the scaletempo filter and tried some of the tricks described there to improve the quality, but was unable to achieve a similar result.
It would be really nice if we could use high-speed playback in mpv without distortions or similar effects.
Try rubberband.
And at least in Firefox/Windows, they sound almost exactly the same at the same speeds. Chrome sounds a little bit better but still more garbled than rubberband.
Rubberband immediately creates echo chamber effect. It has some options, but it is really unclear what all these options do and how I could increase quality.
You're going to have to post some samples and exact software names and versions. Just "make it sound better" isn't reasonable.
Unless they've changed in the meantime, Firefox uses exactly the same SoundTouch algorithm (which mpv's filter is also based on or inspired by). Chrome uses their own NIH version of the same WSOLA technique. They shouldn't sound that drastically different from each other.
Let's take this video. Play it on YouTube (I was testing in Chrome) and increase the speed to x2. Please use headphones. If you do the same with mpv and rubberband, it is still detailed, you can understand the speech, and there are no distortions like with scaletempo, but you can hear an echo, as if it was recorded in a pretty small room.
That "echo" you mention with rubberband can be configured. There are a lot of configurations available if you're unhappy with the defaults. They're different techniques of stretching/shrinking the playback rate without changing pitch.
Also, let's use a speech sample instead: https://i.fsbn.eu/pub/mpv-speed/
So what you want exactly is a port of Chrome's code to mpv, which sounds negligibly better, when you could just use rubberband and actually understand the speech.
Feel free to port Firefox' implementation to mpv, since librubberband is a bit "annoying" in multiple aspects.
That "echo" you mention with rubberband can be configured. There are a lot of configurations available if you're unhappy with the defaults. They're different techniques of stretching/shrinking the playback rate without changing pitch.
Yes, I know that, but there is no proper documentation in the man page or in the help output. The help reads like Chinese to me, with no explanation of what each option means. So the only way to configure it is by poking random settings, testing the result and hoping for the best (and there are over 1000 possible combinations of these options plus one option with floating number value).
So what you want exactly is a port of Chrome's code to mpv, which sounds negligibly better, when you could just use rubberband and actually understand the speech.
No, what I want is to be able to increase speed without experiencing any artifacts (such as echoes, distortions, and so on).
So the only way to configure it is by poking random settings, testing the result and hoping for the best (and there are over 1000 possible combinations of these options plus one option with floating number value).
http://breakfastquay.com/rubberband/code-doc/classRubberBand_1_1RubberBandStretcher.html
No, what I want is to be able to increase speed without experiencing any artifacts (such as echoes, distortions, and so on).
Feel free to come up with a better and LGPL or freer algorithm then. If it's way better than SoundTouch and rubberband I'm sure other open source software would appreciate it too.
@TiGR commented on May 11, 2017, 6:01 PM GMT-3:
Rubberband immediately creates echo chamber effect. It has some options, but it is really unclear what all these options do and how I could increase quality.
When I started to mess with mpv's librubberband implementation, I was even struggling to get the right --af syntax for the filter, but once I figured that one out, the best I was able to come up with to get rid of the very prominent echo/ringing/phase distortion was this:
--af="rubberband=transients=smooth:pitch=quality:window=short"
I'm satisfied with how it sounds with these settings; give them a try, I guess.
And for input.conf I have these binds setup:
[ af add "@rubber:rubberband=transients=smooth:pitch=quality:window=short" ; add speed -0.25
] af add "@rubber:rubberband=transients=smooth:pitch=quality:window=short" ; add speed +0.25
\ af del @rubber ; set speed 1.0
@wm4 Seems like we need “af enable/disable” commands to make @garoto's example less ugly?
Either way, shouldn't rubberband be a no-op for speed 1.0 already.
We already have such commands.
@wm4 Really? All I can see are:
af string, and can't be used without parameters
I'm not sure what I'm missing. I know you can directly write to the enabled property, but doesn't that require knowing what index you need to write to?
RTFM, search for @deband for an example.
@haasn commented on May 12, 2017, 2:56 PM GMT-3:
Either way, shouldn't rubberband be a no-op for speed 1.0 already.
Seems that set speed 1.0 will not remove rubberband from the af filter chain (which seems reasonable if my poor logic is correct). If I remove the af del @rubber part, I need to manually clear the filter chain with af set "" to have audio play without distortion, because even at 1.0 speeds, rubberband will still be filtering audio.
Sorry if this was not what your quoted comment was about.
PS: in hindsight, everything in this comment of mine looks pretty obvious. Oh well.
The fact that rubberband isn't no-op at speed=1 is a known issue since it was first added.
@garoto thanks, this is better, but still produces some sort of robotic sound.
So, I've compared Firefox, VLC, mpv (both rubberband and scaletempo with default and custom settings), Chrome and Chromium. Chrome and Chromium produce identical results, so I think the implementation should be open source. So, if I were to rate these, the rating would be:
I know nothing about the technologies being used or about any specific tuning that was done in each project; I am just a user here.
@garoto Thanks I will try your settings.
I also found this related ticket about rubberband::reset (to fix sound at 1.0 speed?) and getLatency (to fix mpv's desync when using rubberband):
@garoto @TiGR Those settings are bad. They seem to reduce the processing buffer size, which gets rid of phasiness a bit but increases "warbling"/fluttering.
All I have in my mpv.conf is this, instead: af=rubberband=channels=together.
That's the most important tweak. It tells rubberband to keep the left and right channels in sample-accurate sync. "channels=together" means "use the stereo-coherent phase mode". That's what gets rid of the phasiness. But it costs a LOT more CPU usage. So use it at your own risk.
None of the other tweaks are necessary. In fact, trying the other tweaks together with my channels=together just made the sound worse (made it warble/flutter).
So we should change the defaults for this one?
@wm4 Rubberband defaults to "process each channel independently and spit them out whenever they're ready, even if that means they drift out of stereo sync", which creates a very weird phase/tunnel sound effect in headphones.
It can be fixed by running channels=together. The problem with channels=together is that it increases Rubberband's CPU requirement quite a lot. I suppose it is some complex algorithm which syncs the two channels and their independent post-processed buffers.
You can try it out and check the CPU usage increase when scaling the video to a speed of 0.25 or something heavy like that. From memory, running Rubberband in channel-together mode increased the CPU usage a lot. But if you deem that the CPU usage is acceptable as a new default then sure go ahead and add channels=together as a default. It sounds much better. :-)
channels=together IS the default. Since https://github.com/mpv-player/mpv/commit/f504661852dbd7b8ff28013ffed069de75de1826.
@wiiaboo Nice discovery!
@TiGR @wiiaboo
I've fine tuned the settings a bit and this is what I got.
af=scaletempo=stride=16:overlap=.68:search=10
Sounds similar to chrome's implementation enough for me at x2 speed. Just posting this for others to test.
(I stuck in random values and adjusted based on my subjective listening)
Doesn't search only affect performance?
It also seems to have an effect on how robotic the audio sounds.
@stevenxed Cool. It is perfect on speech. However, it adds some strange distortions to some kinds of music, for example, here (first 28 seconds) at x2. Interestingly, default scaletempo plays it well. And yes, it requires some extra processing: around 4 extra percentage points on my CPU (usage increased from 17% to 21% on FullHD x2 playback).
@TiGR
Try this: af=scaletempo=stride=22:overlap=.55:search=12
The music with only scaletempo on x2 speed sounds a tiny bit better/clearer than chrome's implementation. So yeah, you gotta compromise with music/speech. What I gave you should be good enough. Feel free to tweak it yourself ofc
I've stumbled upon this discussion in the search for better scaletempo settings, so I came back to share what I found out. The best I was able to come up with over the past couple of days is this: af=scaletempo=stride=28:overlap=.9:search=25
So far I was not able to notice any artifacts or distortions with this setting while watching videos or listening to music (for the purpose of testing) at different speedups.
It's harder on the CPU compared to the settings you have proposed so far (6%-7% instead of 2%-3% while playing music at double speed on my AMD 1100T); however, for me that tradeoff is well worth it.
Also worth noting is a "rule" I noticed while testing different settings: higher values for search are generally better, but search >= stride * overlap leads to artifacts or distortions in some situations.
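The rule of thumb above can be checked mechanically. Here is a minimal Python sketch, assuming the rule is exactly `search < stride * overlap` as stated in that comment. The helper name `search_within_limit` is hypothetical, nothing here is taken from mpv's scaletempo source, and the inputs are the three option sets posted in this thread.

```python
def search_within_limit(stride, overlap, search):
    """Return True when search stays below stride * overlap.

    This encodes only the empirical rule reported in this thread;
    it is not derived from mpv's scaletempo implementation.
    """
    return search < stride * overlap


# (stride, overlap, search) for the three settings proposed in this thread.
settings = [
    (16, 0.68, 10),
    (22, 0.55, 12),
    (28, 0.90, 25),
]

for stride, overlap, search in settings:
    limit = stride * overlap
    ok = search_within_limit(stride, overlap, search)
    print(f"stride={stride} overlap={overlap} search={search} "
          f"limit={limit:.2f} within_rule={ok}")
```

All three posted settings satisfy the rule, each with search sitting just under the limit.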
So we should change the defaults for this one?
@wm4 Please, yes. I use increased playback speed almost every day and no proposed solution has been flawless. I couldn't tell you how much this matters.
To what?
@wm4
One of the problems with solutions on this issue (e.g. af=scaletempo=stride=28:overlap=.9:search=25) is that they work for speeds of x1.5-x1.7, but they affect the clarity of x1 and sound just as bad with x2+.
Maybe Chromium's implementation? Artifacts are minimal regardless of what the playback speed is. (https://github.com/mpv-player/mpv/issues/6471#issuecomment-472608028 -- sadly, I'm no C developer.)
Most values for af have been produced by entering random numbers as stated. I think you'd have better luck.
when this will be fixed? chromium implementation is great
Nobody is working on it.
one of the main features of player is broken, and nobody cares, great
great a volunteer, we are looking forward to your contribution of a new scaletempo filter.
Maybe I just use a better player instead? if you can't fix it in three years. Mpc-be for example
that's totally valid and we welcome your move into greener pastures. mpc-be is a great player for windows. thank you for your kind input and we hope you'll enjoy the experience mpc-be will provide you in the future.
@Akemi This random commit from Chromium suggests that they're using a certain WSOLA algorithm. And perhaps the source code for their implementation of the algorithm is this header file and this C++ file. (Again, I'm not a C/C++ dev)
it's probably the best to request this from ffmpeg, so not only mpv can benefit from it, tbh.
@Akemi ffmpeg already has wsola algorithm called atempo, sounds a bit worse than chromium https://ffmpeg.org/ffmpeg-filters.html#atempo
https://ffmpeg.org/doxygen/3.2/af__atempo_8c.html
mpv's implementation is also WSOLA.
ffmpeg's implementation can't be used in mpv.
Whatever, they both suck compared to Chromium.
For 2.25x audio playback, the atempo filter is performing waaay better for me than rubberband and scaletempo. (tried many different option settings)
I recommend folks give it a try! Though there's some serious caveats.
When using --af=atempo=... the mpv playback speed and atempo speed don't smartly synchronize like they do with rubberband and scaletempo. They're completely independent, so atempo always desyncs the audio and playback position. And if there's also an mpv playback speed those end up getting multiplied.
For example...
With mpv --af=atempo=2 --speed=1 I get to experience atempo's near-Chrome quality speed up, but mpv isn't aware of the speed change and you'll finish listening to the audio when you're half way through the file.
With mpv --af=atempo=2 --speed=2, the audio plays at 4x speed! I believe atempo applies a 2x speed increase to the 2x scaletempo speed increase mpv already applied.
Very confusingly, with mpv --af=atempo --speed=2, you don't get any of the atempo benefits since this just defaults to meaning --af=atempo=1 and mpv uses the default scaletempo to actually do the speedup. I know I tried out --af=atempo months ago and didn't see any improvement so gave up on it. Little did I know I wasn't even using it!
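The arithmetic behind these three cases can be written down in a few lines. This is only a sketch of the behaviour described in this comment, assuming that `--speed` is applied to both audio and video while `atempo` speeds up audio only and is invisible to mpv. The function name `effective_rates` is hypothetical and does not correspond to anything in mpv or FFmpeg.

```python
def effective_rates(atempo, speed):
    """Return (audio_rate, video_rate) relative to normal playback.

    Assumption (from the observations in this thread, not from mpv's
    source): the two audio speedups multiply, and video follows only
    mpv's own speed setting.
    """
    audio = atempo * speed
    video = speed
    return audio, video


# The three command lines discussed above; bare --af=atempo means atempo=1.
for atempo, speed in [(2, 1), (2, 2), (1, 2)]:
    audio, video = effective_rates(atempo, speed)
    print(f"atempo={atempo} speed={speed}: audio {audio}x, video {video}x, "
          f"audio/video ratio {audio / video}")
```

Whenever the audio/video ratio is not 1, the audio finishes before the playback position does, which is the desync described here.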
I assume the root cause of this is that atempo is a libavfilter and not something built-in to MPV? This note explains...:
having libavfilter based filters after the scaletempo or rubberband
filters is not supported anymore, and may desync if playback speed is
changed (libavfilter does not support the metadata for playback speed)
This desyncing means atempo is useless for speeding up video playback... Though I found a hacky solution by doing mpv --af=atempo=2 --vf=lavfi=[setpts=PTS/2] which doubles video playback speed in a non-mpv-aware way. It mostly works when the video starts but seeking instantly desyncs you.
Anyone have ideas for workarounds? Any way to sync up atempo with mpv/video playback speed?
Any way to sync up atempo with mpv/video playback speed?
No, that's not possible. FFmpeg doesn't offer the required APIs.
What APIs are actually missing in FFmpeg?
Audio speed & virtual PTS.
Isn't speed done with asetrate filter? What is virtual PTS?
It's the PTS for speed-changed audio, which ffmpeg shitfuckery can't comprehend. It is computed from the virtual samplerate, which is the real samplerate multiplied by the fractional (well, float) speed value for maximum precision. Playback uses the real samplerate; PTS calculations and A/V sync use the virtual PTS.
I have a patch that proves no additional APIs in libavfilter are required to get the atempo filter working with mpv. I just can't get it to work with a tempo of 1.5 or 1.3; nice numbers like 2.0 work fine.
Patch is wrong.
I will send a patch to fix mpv soon.
FFmpeg is bug
Grow mushrooms.
It would be great if someone could try out the code from my pull request.
@DorianRudolph wow, thanks for your PR! I tested it with a few videos and it seems it works great to me!
It would be great if someone could try out the code from my pull request.
How can I use this?
Adding --af=scaletempo2 should be enough (and --speed N like normal), but you need to be on an mpv build newer than the newest release for it to be available.