Librosa: Loading audio files with an amplitude greater than 1

Created on 16 Apr 2018 · 3 Comments · Source: librosa/librosa

Description

Hi all,

First of all, thanks a lot to all the contributors of Librosa!

I would like to load an audio file with an amplitude that is greater than 1. If I am correct, librosa.load() only provides an output audio signal which lies between -1 and 1.

Is this a desired behavior?

You will find below a code illustrating this issue.

Best,

Simon

Steps/Code to Reproduce

import librosa
import numpy as np
import os
from scipy.io import wavfile

tmp_folder = '/path/to/a/temporary/folder'  # placeholder: replace with an existing directory

# Load signal
y, sr = librosa.load(librosa.util.example_audio_file(), sr=None)
print('Original amplitude: %.2f' % np.max(np.abs(y)))

# Scale the signal to lie between -5 and 5
y_scaled = y/np.max(np.abs(y))*5
print('New amplitude: %.2f' % np.max(np.abs(y_scaled)))

# Write scaled signal
output_file = os.path.join(tmp_folder, 'scaled.wav')
librosa.output.write_wav(output_file, y_scaled, sr)

# Load scaled signal with librosa
y_scaled_reloaded, sr = librosa.load(output_file, sr=None)
print('Read amplitude with librosa: %.2f' % np.max(np.abs(y_scaled_reloaded)))

# Load scaled signal with scipy
sr, y_scaled_reloaded_sp = wavfile.read(output_file)
print('Read amplitude with scipy: %.2f' % np.max(np.abs(y_scaled_reloaded_sp)))

Results


Original amplitude: 0.71
New amplitude: 5.00
Read amplitude with librosa: 1.00
Read amplitude with scipy: 5.00

Versions

Linux-4.4.0-116-generic-x86_64-with-debian-stretch-sid
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0]
NumPy 1.14.2
SciPy 1.0.1
librosa 0.6.0

Labels: IO, Upstream/dependency bug, wontfix

All 3 comments

Thanks for reporting this. I had originally thought this was due to audioread doing some re-normalization, but upon closer inspection, I'm not sure that's the case. It does seem like the most likely candidate, though.

Background: librosa uses audioread to handle audio codecs, and audioread in turn multiplexes over different codec libraries (gstreamer, ffmpeg, etc). It's possible that whatever is being used under the hood is doing this renormalization. Can you do the following:

>>> import audioread
>>> reader = audioread.audio_open('/path/to/your/file.wav')
>>> print(reader)

and report what kind of decoder it's using? I get RawAudioFile on my machine, but you might get something different.

> I would like to load an audio file with an amplitude that is greater than 1. If I am correct, librosa.load() only provides an output audio signal which lies between -1 and 1.

There's nothing intrinsic to librosa.load which forces this (the source code should confirm this). Resampling can sometimes change the peak magnitude, but you're loading without resampling, so that's not the problem.

write_wav has the option to stretch the signal to +-1 before writing, but this is disabled by default, and probably not responsible for the problem you're seeing.
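
For reference, stretching a signal to ±1 amounts to peak normalization. Below is a minimal sketch in plain Python so that it runs without any audio dependencies. `peak_normalize` is a hypothetical helper written for illustration, not librosa's implementation:

```python
def peak_normalize(samples):
    """Scale samples so that the largest absolute value becomes 1.0.

    Hypothetical helper for illustration only; not librosa's code.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent or empty input: there is nothing to scale.
        return list(samples)
    return [s / peak for s in samples]


# A signal scaled to lie between -5 and 5, as in the report above.
y_scaled = [0.0, 2.5, -5.0, 1.25]
print(peak_normalize(y_scaled))  # [0.0, 0.5, -1.0, 0.25]
```

If this kind of scaling were applied at write time, both readers would report a peak of 1.00. Since scipy reads back 5.00 from the same file, the samples on disk were evidently not normalized.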


As a side note, I'll put in my blanket disclaimer that you're probably better off using pysoundfile than librosa.load or audioread, unless you really need mp3 support.

Thanks a lot for your answer.

> Can you do the following:
>
> import audioread
> reader = audioread.audio_open('/path/to/your/file.wav')
> print(reader)

Here is what I get:
<audioread.ffdec.FFmpegAudioFile object at 0x7ff45f0578d0>

Thanks for your side note, I will switch to this for loading wav files then :)

Ok, I'll accept the difference in backend loader as an explanation for the difference here. Thanks again for noting this!
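
One plausible mechanism for the backend difference, offered as an assumption rather than something confirmed in this thread: audioread returns 16-bit signed integer PCM buffers, so a decoder such as FFmpeg has to convert float samples into that range, and anything beyond full scale saturates. Converting back to float by dividing by 32768 then gives a peak of about 1.0, which would match the librosa result above. A rough sketch in plain Python follows. `float_to_int16` and `int16_to_float` are hypothetical helpers, not audioread or librosa functions, and the exact rounding and scaling inside a real decoder may differ:

```python
INT16_MIN = -32768
INT16_MAX = 32767


def float_to_int16(samples):
    """Convert float samples (nominal range [-1, 1]) to int16, saturating.

    Hypothetical helper: values outside the representable range are clipped.
    """
    pcm = []
    for s in samples:
        value = int(round(s * 32768))
        pcm.append(max(INT16_MIN, min(INT16_MAX, value)))
    return pcm


def int16_to_float(pcm):
    """Map int16 samples back to floats by dividing by 32768 (assumed scale)."""
    return [v / 32768 for v in pcm]


# A signal with a peak amplitude of 5, as in the report above.
y_scaled = [0.0, 0.5, -5.0, 5.0]
reloaded = int16_to_float(float_to_int16(y_scaled))
print(max(abs(v) for v in reloaded))  # 1.0
```

A reader that returns the stored float samples directly, as scipy appears to do here, never goes through the integer conversion and therefore keeps the original peak of 5.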
