Hi all,
First of all, thanks a lot to all the contributors of Librosa!
I would like to load an audio file whose amplitude is greater than 1. If I understand correctly, librosa.load() only returns an audio signal that lies between -1 and 1.
Is this the intended behavior?
The code below illustrates the issue.
Best,
Simon
import os
import tempfile

import librosa
import numpy as np
from scipy.io import wavfile

# Any writable folder works; a fresh temporary one is created here
tmp_folder = tempfile.mkdtemp()
# Load signal
y, sr = librosa.load(librosa.util.example_audio_file(), sr=None)
print('Original amplitude: %.2f' % np.max(np.abs(y)))
# Scale the signal to lie between -5 and 5
y_scaled = y/np.max(np.abs(y))*5
print('New amplitude: %.2f' % np.max(np.abs(y_scaled)))
# Write scaled signal
output_file = os.path.join(tmp_folder, 'scaled.wav')
librosa.output.write_wav(output_file, y_scaled, sr)
# Load scaled signal with librosa
y_scaled_reloaded, sr = librosa.load(output_file, sr=None)
print('Read amplitude with librosa: %.2f' % np.max(np.abs(y_scaled_reloaded)))
# Load scaled signal with scipy
sr, y_scaled_reloaded_sp = wavfile.read(output_file)
print('Read amplitude with scipy: %.2f' % np.max(np.abs(y_scaled_reloaded_sp)))
Output:
Original amplitude: 0.71
New amplitude: 5.00
Read amplitude with librosa: 1.00
Read amplitude with scipy: 5.00
Software versions:
Linux-4.4.0-116-generic-x86_64-with-debian-stretch-sid
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0]
NumPy 1.14.2
SciPy 1.0.1
librosa 0.6.0
Thanks for reporting this. I had originally thought this was due to audioread doing some re-normalization, but upon closer inspection, I'm not sure that's the case. It does seem like the most likely candidate, though.
Background: librosa uses audioread to handle audio codecs, and audioread in turn multiplexes over different codec libraries (gstreamer, ffmpeg, etc). It's possible that whatever is being used under the hood is doing this renormalization. Can you do the following:
>>> import audioread
>>> reader = audioread.audio_open('/path/to/your/file.wav')
>>> print(reader)
and report what kind of decoder it's using? I get RawAudioFile on my machine, but you might get something different.
> I would like to load an audio file with an amplitude that is greater than 1. If I am correct, librosa.load() only provides an output audio signal which lies between -1 and 1.
There's nothing intrinsic to librosa.load which forces this (the source code should confirm this). Resampling can sometimes change the peak magnitude, but you're loading without resampling, so that's not the problem.
write_wav has an option to normalize the signal to ±1 before writing, but this is disabled by default, so it is probably not responsible for the problem you're seeing.
As a side note, I'll put in my blanket disclaimer that you're probably better off using pysoundfile than librosa.load or audioread, unless you really need mp3 support.
Thanks a lot for your answer.
> Can you do the following:
> import audioread
> reader = audioread.audio_open('/path/to/your/file.wav')
> print(reader)
Here is what I get:
<audioread.ffdec.FFmpegAudioFile object at 0x7ff45f0578d0>
Thanks for your side note, I will switch to pysoundfile for loading WAV files then :)
Ok, I'll accept the difference in backend loader as an explanation for the difference here. Thanks again for noting this!
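One plausible mechanism behind the backend difference, not confirmed in this thread: if the FFmpeg backend hands over 16-bit integer PCM, any float sample outside [-1, 1] has to be clipped to the int16 range on the way, and converting back to float by dividing by 32768 gives a peak of about 1. The sketch below only simulates that conversion in plain Python; the function name and sample values are hypothetical and it does not call audioread or FFmpeg.

```python
def float_to_int16_and_back(samples):
    """Simulate decoding float samples to 16-bit PCM and converting back to float."""
    out = []
    for x in samples:
        q = int(round(x * 32768.0))       # scale to the int16 range
        q = max(-32768, min(32767, q))    # clip values that do not fit in int16
        out.append(q / 32768.0)           # convert back to float
    return out


scaled = [0.0, 0.71, -2.5, 5.0]
reloaded = float_to_int16_and_back(scaled)
print('Peak before: %.2f' % max(abs(x) for x in scaled))
print('Peak after:  %.2f' % max(abs(x) for x in reloaded))
```

Under this assumption, a reader that keeps the float samples as stored (scipy in the report above) would show 5.00, while a 16-bit path would show 1.00, matching the reported output.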