beets doesn't detect at least the Ä and Ö characters, which prevents it from managing music whose filenames contain them. Examples of artists with a diaeresis include Motörhead and Mötley Crüe.
Running beet -vv import -A "E:\CUETools\Motörhead" in verbose (-vv) mode:
user configuration: C:\Users\user\AppData\Roaming\beets\config.yaml
data directory: C:\Users\user\AppData\Roaming\beets
plugin paths: C:\Users\user\beets\myplugins
Sending event: pluginload
library database: C:\Users\user\AppData\Roaming\beets\library.db
library directory: C:\Users\user\Music
Sending event: library_opened
error: no such file or directory: E:\CUETools\Motrhead
My configuration (output of beet config) is:
directory: C:\Users\user\Music\
library: C:\Users\user\AppData\Roaming\beets\library.db
import:
    copy: no
    write: no
    resume: ask
    quiet_fallback: skip
    timid: no
    log: beetslog.txt
ignore: .cue .log .pdf .accurip .m3u8 .m3u .txt .nfo
art_filename: cover
plugins: convert
pluginpath: ~/beets/myplugins
threaded: yes
ui:
    color: yes
paths:
    default: $albumartist/$album/$track $title
    singleton: single songs/$artist - $title
    comp: $album/$track $title
    albumtype:soundtrack: soundtracks/$album/$track $title
convert:
    copy_album_art: yes
    embed: no
    never_convert_lossy_files: yes
    format: opus
    formats:
        opus:
            command: ffmpeg -i $source -acodec libopus -b:a 128k $dest
            extension: opus
        aac:
            command: ffmpeg -i $source -y -vn -acodec aac -aq 1 $dest
            extension: m4a
        alac:
            command: ffmpeg -i $source -y -vn -acodec alac $dest
            extension: m4a
        flac: ffmpeg -i $source -y -vn -acodec flac $dest
        mp3: ffmpeg -i $source -y -vn -aq 2 $dest
        ogg: ffmpeg -i $source -y -vn -acodec libvorbis -aq 3 $dest
        wma: ffmpeg -i $source -y -vn -acodec wmav2 -vn $dest
    dest:
    pretend: no
    threads: 8
    max_bitrate: 500
    auto: no
    tmpdir:
    quiet: no
    paths: {}
    no_convert: ''
    album_art_maxwidth: 0
These kinds of encoding issues are really hard to debug on Windows. (To be clear, this typically isn't a problem on Unix OSes.) It's hard to say what's going on here, but any chance you could try fiddling around with your codepage settings for cmd.exe?
Beyond that, anybody who runs Windows is hereby invited to go spelunking to see if they can reproduce and narrow down what locale business might be causing this for you.
any chance you could try fiddling around with your codepage settings for cmd.exe?
I use PowerShell; I'll try changing the encoding setting tomorrow.
What about passing your import directory like:
\\?\<Drive>:\<directory to be imported>
That is the format I use for my music location when music is to be moved. This gets past encoding and path like issues so far on WIN10_X64.
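For reference, adding that prefix programmatically might look like the following Python sketch. The helper name `with_long_prefix` is hypothetical and not taken from beets, and it assumes the input is already an absolute Windows path, since the `\\?\` form does not accept relative paths. The prefix only changes how Windows parses the path; it does not change how the characters in it are encoded.

```python
# Extended-length path prefix: backslash, backslash, question mark, backslash.
LONG_PREFIX = "\\\\?\\"


def with_long_prefix(path: str) -> str:
    """Return an absolute Windows path with the extended-length prefix added.

    Hypothetical helper for illustration only.
    """
    if path.startswith(LONG_PREFIX):
        # Already prefixed, so leave it untouched.
        return path
    if path.startswith("\\\\"):
        # UNC path: the server/share form takes an extra "UNC" component.
        return LONG_PREFIX + "UNC" + path[1:]
    return LONG_PREFIX + path


print(ascii(with_long_prefix("E:\\CUETools\\Mot\u00f6rhead")))
```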
What about passing your import directory like:
\\?\<Drive>:\<directory to be imported>
That is the format I use for my music location when music is to be moved. This gets past encoding and path like issues so far on WIN10_X64.
Assuming that I did this correctly, beet import -A \\?\E:\CUETools\Motörhead:
error: no such file or directory: \\?\E:\CUETools\Motrhead
I changed the PowerShell encoding to UTF-8 but still have the same issue.
https://stackoverflow.com/questions/40098771/changing-powershells-default-output-encoding-to-utf-8
$PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'
verified with $PSDefaultParameterValues['Out-File:Encoding']:
utf8
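Note that this setting only changes the default `-Encoding` parameter of the `Out-File` cmdlet; it does not affect how arguments are handed to other programs. To see what Python itself reports in a given shell, a small diagnostic script along these lines may help. The function name `describe_encodings` is made up for this sketch, and the output will differ from machine to machine.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python reports, plus the arguments it received."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(False),
        # sys.stdout can be None in some embedded setups, hence getattr.
        "stdout": getattr(sys.stdout, "encoding", None),
        # ascii() makes any non-ASCII or replaced characters visible.
        "argv": [ascii(arg) for arg in sys.argv[1:]],
    }


if __name__ == "__main__":
    for key, value in describe_encodings().items():
        print(f"{key}: {value}")
```

Saved under any name and run with the problematic folder name as its argument, it shows whether the ö survives the trip from the shell into Python.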
What happens if you cd to the folder and use relative import? beet import -A .
I recommend Cmder. This issue keeps coming up.
https://github.com/beetbox/beets/issues/2607
Alright so it seems like I've somewhat worked out what's happening:
1. The path argument is handled by beets.ui.commands.import_func during import.
2. It is encoded using the argument encoding (cp1252 on my Windows install), with \xf6 representing the ö: b'Mot\xf6rhead'.
3. The bytes are passed to beets.util.normpath, which passes them to beets.util.syspath.
4. There the bytes are decoded as UTF-8, but \xf6 is not a valid UTF-8 representation (the correct UTF-8 representation for ö is \xc3\xb6). \x00f6 is the value of ö in UTF-16, but I don't think that's relevant.
5. Because path.decode(..., 'replace') has been used, it doesn't fail and instead replaces the ö with a �.
6. The resulting path is Mot�rhead, which clearly fails spectacularly.
I'm not really sure what the right course of action is, but it certainly seems that we shouldn't be encoding the arguments using cp1252 and then decoding them using utf-8.
Sorry if the text above is a bit incoherent, I wrote it as I came across the code. It might be better discussing this in our new Gitter.
TL;DR: the culprit seems to be encoding arguments in cp1252 and decoding them in utf-8.
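The mismatch can be reproduced outside beets with plain Python. This is a minimal sketch of the encode/decode pair described above, not the actual beets code path; the function name `mangle` is invented for the illustration, and cp1252 is assumed as the argument encoding because that is what was reported for this Windows install.

```python
def mangle(name: str, arg_encoding: str = "cp1252"):
    """Encode with the argument encoding, then decode as UTF-8 with 'replace'."""
    raw = name.encode(arg_encoding)
    # 0xF6 cannot start a UTF-8 sequence, so 'replace' substitutes U+FFFD.
    decoded = raw.decode("utf-8", "replace")
    return raw, decoded


raw, decoded = mangle("Mot\u00f6rhead")
print(raw)             # b'Mot\xf6rhead'
print(ascii(decoded))  # 'Mot\ufffdrhead'
```

The decoded string no longer names any directory on disk, which matches the "no such file or directory" error in the report.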
What happens if you cd to the folder and use relative import? beet import -A .
Import succeeds then.
Sorry, the command beet -vv import -A "E:\CUETools\Motörhead" in my issue was missing the ¨ because I was testing whether I could pass the command without the diaeresis and copied the wrong line.
I don't think the quotes make any difference
Wow! Very nice work investigating this, @jackwilsdon. It seems like we need to somehow remember that, on Windows, we either (a) need to preserve the Unicode command-line arguments as-is, or (b) re-decode them later using the argument encoding to recover the original filename.
Doing this in a cross-platform way is absolutely crazy-making! I'm really not sure what a clean solution is, but it will need a lot of platform-specific special cases…
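Option (b) could be sketched roughly as follows. This is only an illustration of the idea under the assumption that the argument encoding is known and remembered at the point of decoding; `to_bytes` and `to_text` are hypothetical names, not existing beets functions.

```python
def to_bytes(arg: str, arg_encoding: str) -> bytes:
    """Turn a command-line argument into bytes using the argument encoding."""
    return arg.encode(arg_encoding, "surrogateescape")


def to_text(raw: bytes, arg_encoding: str) -> str:
    """Recover the original argument by decoding with the same encoding."""
    return raw.decode(arg_encoding, "surrogateescape")


original = "Mot\u00f6rhead"
raw = to_bytes(original, "cp1252")

# Decoding with the encoding that produced the bytes restores the name.
assert to_text(raw, "cp1252") == original

# Decoding with a different encoding does not.
assert raw.decode("utf-8", "replace") != original
```

The hard part, as noted above, is carrying the right encoding along on every platform; the sketch simply takes it as a parameter.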
I sent this in Gitter but I thought it's worth putting here too and fleshing out a bit:
What are your thoughts on somewhat "abstracting" I/O such that we use unicode internally within beets and delegate to some other layer to handle converting to the system native encoding? As an initial phase we could have some form of "Filesystem" layer which handles all of this, and later expand it into a layer which handles arguments passed into the process too.
I'm not sure to what extent we currently use unicode strings vs. platform native strings within beets, but I think it would greatly simplify logic if we could move all of the encoding handling elsewhere and keep the core of beets working in just unicode.
A smart abstraction (something like pathlib) might be really nice! However, there is a downside to representing all paths as Unicode—namely, that paths on Unix are not guaranteed to follow any particular Unicode encoding and in practice often do not. (I followed up on Gitter—still trying to get used to it! :smiley:)
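The Unix caveat can be illustrated with Python's own approach to undecodable filename bytes, the surrogateescape error handler from PEP 383. The sketch below is not beets code; it assumes a filename stored as Latin-1 bytes on a system that otherwise uses UTF-8, and `is_valid_unicode` is a made-up helper name.

```python
def is_valid_unicode(text: str) -> bool:
    """Return True if the string can be encoded as strict UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


raw = b"Mot\xf6rhead"  # Latin-1 bytes, not valid UTF-8

# surrogateescape smuggles the bad byte through as a lone surrogate.
text = raw.decode("utf-8", "surrogateescape")
print(ascii(text))  # 'Mot\udcf6rhead'

# The original bytes can be recovered exactly...
assert text.encode("utf-8", "surrogateescape") == raw

# ...but the intermediate string is not well-formed Unicode text.
assert not is_valid_unicode(text)
```

So an all-Unicode internal representation can round-trip such paths, at the cost of holding strings that cannot be printed or stored as ordinary text without special handling.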
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.