I'm getting ValueError: Incomplete wav chunk. seemingly at random. I have ~2000 wav files downloaded from the same source; roughly half of them raise the error and the other half don't.
Here's a link to a StackOverflow question I've asked, currently unanswered.
Assume filenames is a subset of all my audio files, containing fn_good and fn_bad, where fn_good is an actual file that gets processed, and fn_bad is an actual file that raises an error.
import scipy.io.wavfile

def extract_features(filenames):
    for fn in filenames:
        # wavfile.read returns (sample_rate, data)
        sr, y = scipy.io.wavfile.read(fn)
        print('Signal is: ', y)
        print('Sample rate is: ', sr)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "E:\alon_emanuel_drive\School\Year2\Semester2\67690_AI_as_a_Tool\venvs\prosody-venv\lib\site-packages\scipy\io\wavfile.py", line 248, in read
    raise ValueError("Incomplete wav chunk.")
ValueError: Incomplete wav chunk.
Versions (SciPy, NumPy, Python): 1.3.0 1.16.4 sys.version_info(major=3, minor=7, micro=1, releaselevel='final', serial=0)
I tentatively added the "defect" tag, but I'm not sure if this is really a defect, or if the function is correctly reporting an error. For what it's worth, I can read the files with no errors or warnings using wavio, which is just a wrapper of the standard library wave, so I suspect something is wrong in the SciPy code.
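In case it helps anyone reproduce the comparison, here is a rough stdlib-only sketch of reading a file through wave directly. This is not wavio's actual code: read_wav_stdlib is a name made up for illustration, and it returns raw bytes rather than a NumPy array. The in-memory round trip at the bottom is only there so the snippet runs without any files on disk.

```python
import io
import wave


def read_wav_stdlib(source):
    """Read PCM WAV data using only the standard library.

    `source` is a path string or a binary file-like object.
    Returns (sample_rate, n_channels, sample_width_bytes, raw_frames).
    """
    with wave.open(source, "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        width = wf.getsampwidth()
        frames = wf.readframes(wf.getnframes())
    return rate, channels, width, frames


# Tiny in-memory round trip so the sketch runs without real files.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)                    # 16-bit samples
    w.setframerate(8000)
    w.writeframes(b"\x01\x00\x02\x00")   # two little-endian int16 samples
buf.seek(0)

rate, channels, width, frames = read_wav_stdlib(buf)
print(rate, channels, width, len(frames))
```

If one of the failing files reads cleanly this way, that at least narrows the problem down to how the two readers walk the file. For 16-bit PCM the raw frames could then be turned into an array with something like np.frombuffer(frames, dtype='<i2'), assuming NumPy is available.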
I worked around the problem by changing the number 4 to 1 in this condition in wavfile.py, so that the check reads len(chunk_id) < 1:
if not chunk_id:
    raise ValueError("Unexpected end of file.")
elif len(chunk_id) < 1:
    raise ValueError("Incomplete wav chunk.")
But that was just intuition and good luck. Now I wonder why this works, and what the possible reasons are?
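One possible explanation, as a sketch only: if a file ends with a single leftover byte after its last chunk, a 4-byte read for the next chunk ID comes back with just 1 byte. The snippet below imitates that with an in-memory stream (the one-byte tail is an assumption about what these files look like, not something taken from them).

```python
import io

# Hypothetical file tail: every real chunk already consumed, one byte left over.
fid = io.BytesIO(b"\x00")

chunk_id = fid.read(4)      # asks for 4 bytes, only 1 is available
print(len(chunk_id))        # 1
print(len(chunk_id) < 4)    # True  -> the original check raises
print(len(chunk_id) < 1)    # False -> the edited check does not raise
```

So the edited condition simply stops firing for a short tail. What the reader then does with a one-byte "chunk ID" depends on the rest of wavfile.py, which is not reproduced here, and the same edit would also hide a genuinely truncated file, so it is more of a workaround than a fix.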

Can you re-publish the files? I'm getting this for a file with an extraneous 0x00 byte after the last id3 chunk. (Chunk length is 0x39 = 57 but there are 58 more bytes in the file.) The data part had already been read successfully, though, so yes I would consider this a bug.
This should be more forgiving and just become a warning as long as fmt_chunk_received and data_chunk_received are both True. The only way len(chunk_id) < 4 can happen is if the reader has reached the end of the file, right?
Edit: Actually this byte is not "extraneous", it's the padding byte for an odd-length chunk size. So this was actually even more of a bug in scipy, and the wav file is formatted correctly. See https://github.com/scipy/scipy/pull/12208
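To illustrate the padding rule, here is a minimal chunk walker, written as a sketch and not as what SciPy or the linked PR actually does. In RIFF, a chunk whose size is odd is followed by one pad byte that is not counted in the size field, so a reader has to skip size + (size & 1) bytes. The names iter_riff_chunks and chunk, and the test stream with a 57-byte "id3 " chunk, are made up for the example.

```python
import io
import struct


def iter_riff_chunks(fid):
    """Yield (chunk_id, size, payload_offset) for each top-level chunk."""
    riff, _riff_size, form = struct.unpack("<4sI4s", fid.read(12))
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        header = fid.read(8)
        if len(header) < 8:
            break  # end of stream (or truncated header); this sketch just stops
        chunk_id, size = struct.unpack("<4sI", header)
        offset = fid.tell()
        yield chunk_id, size, offset
        # Chunks are word-aligned: an odd-sized payload is followed by one
        # pad byte that is NOT included in `size`.
        fid.seek(offset + size + (size & 1))


def chunk(chunk_id, payload):
    """Serialise one chunk, adding the pad byte for odd-sized payloads."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


# Made-up stream: PCM fmt chunk, 4 bytes of data, then an odd-sized tag chunk.
fmt = chunk(b"fmt ", struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16))
data = chunk(b"data", b"\x01\x00\x02\x00")
tag = chunk(b"id3 ", b"x" * 57)   # size 0x39 is odd, so one pad byte follows
body = b"WAVE" + fmt + data + tag
stream = io.BytesIO(b"RIFF" + struct.pack("<I", len(body)) + body)

chunks = [(cid, size) for cid, size, _ in iter_riff_chunks(stream)]
print(chunks)
```

Without the (size & 1) term the walker would stop one byte short of the end after the last chunk, and the next read for a chunk ID would return a single byte, which matches the symptom described above.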
Fixed by #12110 I think