This is something I noticed today when using fetchart with the fanarttv plugin - one random albumart file it downloaded was actually a PNG file, not a JPG. This wasn't fetchart's fault - I verified that the source image was indeed a PNG from fanarttv despite having the .jpg extension.
But, I think a nice feature would be to include some form of image validty check and perhaps offer to convert to the image to the desired default format.
Wow! That's too bad. Is there a way we can reproduce this (i.e., can we fake fetching art for the album you have to get the same image)?
Ha, it's hardly a troubling issue, but thought it was worth mentioning! The album in question was 9 by Damien Rice. If you grab that from fanarttv you'll see what I mean.
Gnome image viewer wouldn't open it. Gimp opened it (I assume it ignores file types and goes by the file header instead) and when I checked it with the file command, it said it was a PNG.
$ wget -q https://fanart.tv/detailpreview/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg
$ file 9-4f31935ae2a25.jpg
9-4f31935ae2a25.jpg: PNG image data, 280 x 280, 8-bit/color RGB, non-interlaced
Yep, certainly seems like an issue! Thanks @broddo 馃槃.
That's the one. Sorry I should have provided you with the link. I'm half asleep here.
Haha, no problem! 馃憤 @sampsyo Do we actually make an attempt at detecting the file type currently and converting if it's not a JPEG?
Aha!
We don't check the data itself, but we do check the MIME type that the server reports. Fortunately, it looks like fanart.tv doesn't lie there:
$ curl -I "https://fanart.tv/detailpreview/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg"
HTTP/1.1 200 OK
Date: Thu, 16 Jun 2016 23:03:59 GMT
Content-Type: image/png
The trick is that our fetch_image method only validates that the file is either reported as a PNG or a JPEG; it doesn't assign the extension based on that. In fact, unless I'm reading this wrong, I think the file extension is actually _hard-coded_ as JPEG? In any case, we should instead use the MIME type to decide the extension for the downloaded file.
Assigning this to myself as a reminder, but I'll likely not tackle it until in a few weeks. So if anyone wants to step up, feel free to do so, no need to notify me.
Unfortunately the files @jackwilsdon and @sampsyo mentioned are only the website previews, not the full-size images returned by the API, and for the latter fanart.tv does lie:
curl -I https://fanart.tv/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg
HTTP/1.1 200 OK
Date: Tue, 21 Jun 2016 09:20:57 GMT
Content-Type: image/jpeg
so the fix in #2068 will not solve your problem.
On our side, this would need some file-like type detection. I'll probably also report this upstream. IIRC the guideline on fanart.tv is only 1000px jpeg, but apparently, images can slip through that requirement.
What do you think of using python-magic, @wordofglass? The issue we have then is people need to install libmagic, which can be especially a pain on Windows.
$ wget -q https://fanart.tv/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg
$ python -c "import magic; print(magic.from_file('./9-4f31935ae2a25.jpg'))"
PNG image data, 1000 x 1000, 8-bit colormap, non-interlaced
Admittedly we'd need to parse the magic type but it's much better than any other solutions I can come up with see below.
Another idea I just came up with is using the identify command that comes with ImageMagick (which is probably better, although it would add a hard dependency on ImageMagick for fetchart):
$ wget -q https://fanart.tv/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg
$ identify -format "%m" 9-4f31935ae2a25.jpg
PNG
I've already updated the PR (#2068) using the standard library imghdr. This module is pretty basic, but I think it's sufficient for our purposes and does not introduce a new dependency. Unless there's a significant functional advantage of your suggestions, I'd stick to this.
Is it correct that really only jpeg and png are relevant image formats for beets? Then I see no need for an additional library.
Ah, I didn't realise you'd already sorted it out (the PR still says "EDIT: doesn't fix #2053, see there").
Updated the post :sparkles:
Most helpful comment
Updated the post :sparkles: