Here's a bit of code that replicates this:
import librosa
import numpy as np

y, sr = librosa.load(librosa.util.example_audio_file())
n = len(y)

# Round-trip the signal through stft/istft for several (n_fft, hop_length) pairs
for n_fft, hop_length in [(4096, 1024), (4096, 2048), (2048, 512), (2048, 1024), (2048, 256)]:
    # Pad the input by one hop before the transform
    y_pad = librosa.util.fix_length(y, n + hop_length)
    D = librosa.stft(y_pad, n_fft=n_fft, hop_length=hop_length)
    # Invert and trim back to the original length
    y_out = librosa.util.fix_length(librosa.istft(D), n)
    # Largest absolute reconstruction error
    print(np.max(np.abs(y - y_out)))
Here is what happens when I run it:
1.19209e-07
0.767519
1.19209e-07
0.710516
0.734461
What's happening here? An alignment issue or something else?
Did you try plotting y - y_out? Maybe the large values come from some
difference in framing, so that there are some samples at the end that don't
get resynthesized. Another way to test this would be to pad a window's
worth of zeros to the end of the input and see if that helps.
DAn.
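One way to follow up on that suggestion without a plot is to compare the error in the final samples against the error everywhere else. The sketch below assumes `y` and `y_out` are equal-length 1-D NumPy arrays; `error_by_region` and its `tail` parameter are hypothetical names for illustration, not part of librosa, and the data is synthetic.

```python
import numpy as np

def error_by_region(y, y_out, tail):
    """Return (max abs error before the last `tail` samples, max abs error within them)."""
    diff = np.abs(np.asarray(y) - np.asarray(y_out))
    if not 0 < tail < len(diff):
        raise ValueError("tail must be between 1 and len(y) - 1")
    return float(diff[:-tail].max()), float(diff[-tail:].max())

# Synthetic illustration: a reconstruction that is wrong only in its last 4 samples.
y = np.linspace(-1.0, 1.0, 32)
y_out = y.copy()
y_out[-4:] = 0.0
body_err, tail_err = error_by_region(y, y_out, tail=4)
print(body_err, tail_err)
```

If the large error were confined to the tail, a framing problem at the end of the signal would be the likely cause; an error spread across the whole signal points to something else.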
The reason is that you didn't pass hop_length to istft, so it used its default rather than the value you gave stft. Replacing librosa.istft(D) with librosa.istft(D, hop_length = hop_length), I got:
1.19209e-07
1.49012e-07
1.19209e-07
1.19209e-07
1.49012e-07
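The pattern in the original results can be checked by hand. As far as I understand librosa's behavior, when hop_length is omitted istft infers n_fft from the number of rows in D and uses a quarter of that as the hop. The sketch below works under that assumption; `default_istft_hop` is a hypothetical helper written for illustration, not a librosa function.

```python
def default_istft_hop(n_bins):
    """Hop length assumed when none is given: a quarter of the inferred n_fft."""
    n_fft = 2 * (n_bins - 1)  # an STFT computed with n_fft points has 1 + n_fft // 2 rows
    return n_fft // 4

# (n_fft, hop_length) pairs from the script in the question
configs = [(4096, 1024), (4096, 2048), (2048, 512), (2048, 1024), (2048, 256)]
matches = []
for n_fft, hop_length in configs:
    n_bins = 1 + n_fft // 2
    ok = default_istft_hop(n_bins) == hop_length
    matches.append(ok)
    print(n_fft, hop_length, "matches default" if ok else "differs from default")
```

Only the first and third pairs use a hop of n_fft // 4, and those are the two cases that reconstructed correctly before the fix.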
Ah shoot, yes that's it, of course. My bad! Thanks so much!