Beets: Switch from pyyaml to ruamel.yaml

Created on 27 Feb 2019  路  7Comments  路  Source: beetbox/beets

Problem

The config parser does not accept the !!omap tag for an ordered mapping.

YAML mappings are, by default unordered, so in order to ensure that a mapping is to be parsd and represented in order, one need to use the ordered map format, indicated by !!omap and a list of one-elemental mappings. However, this is not supported.

~ 螢 beet --version
configuration error: file /home/fichte/.config/beets/config.yaml could not be read: expected a mapping node, but found sequence
  in "/home/fichte/.config/beets/config.yaml", line 11, column 8

Config file:

directory: /data/audio/music/sorted
library: /data/audio/music/sorted/musiclibrary.db

per_disc_numbering: true

import:
  copy: true
  log: beetslog.txt
paths: !!omap
  - sourcetype:Anime: '#soundtrack/%ifdef{source,$source/}$album%aunique{}%ifdef{anime, [$anime]}/$disc_and_track - $title'
  - doujin:true: "#%ifdef{touhou,touhou,doujin}\
      /$albumartist_or_label\
      /$long_date - $album%aunique{}%ifdef{event, [$event]}\
      /$disc_and_track - $title"
  - albumtype:soundtrack: '#soundtrack/%if{source,$source/}$year - $album%aunique{}/$disc_and_track - $title'
  - comp: $albumartist_or_label/$year - $album%aunique{}/$disc_and_track - $artist - $title
  - singleton: $albumartist/$title
  - default: $albumartist/$year - $album%aunique{}/$disc_and_track - $title

album_fields:
  long_date: '"-".join(f"{n:02}" for n in [year, month, day] if n)'
  soundtrack_source: 'f"{source}/" if locals().get("source", None) and source != album else ""'

item_fields:
  disc_and_track:
    f"{disc}.{track:0{max(2, len(str(tracktotal)))}}"
    if disctotal > 1
    else f"{track:0{max(2, len(str(tracktotal)))}}"
  albumartist_or_label: label if comp and label else albumartist
plugins: fetchart lastgenre missing duplicates inline replaygain

lastgenre:
  count: 1

replaygain:
  backend: gstreamer

Via setup.py, beets uses pyyaml. I'd like to suggest using ruamel.yaml instead, which is the only YAML 1.2 parser in Python that I know of and has been maintained for a long time. pyyaml has recently switched maintainers, but there are still many inherent problems with the distribution, most notably yaml.load being unsafe by default, and the fact it still only parses YAML1.1.
This is only a minor inconvenience as the current setup works, although it technically shouldn't be relied on by the YAML spec.

Setup

  • OS: Arch
  • Python version: 3.7
  • beets version: 1.4.8+ (branch referenced in #2988)
duplicate feature

Most helpful comment

With the current title this looks like a duplicate of #1702

All 7 comments

Hello! For what it's worth, !!omap is actually not necessary for your paths configuration. We have specifically customized pyyaml to parse _all_ mappings as Python OrderedDicts. We like this approach because it makes configurations easier to write by hand.

If you're curious, here's where the magic happens:
https://github.com/beetbox/beets/blob/f6ad98a4a51825ddb4ae31fff6ed8d3ce8530fdb/beets/util/confit.py#L616-L664

Thanks for pointing out ruamel.yaml, which does indeed look like it might be a better bet in the future. It's worth exploring!

For what it's worth, !!omap is actually not necessary for your paths configuration.

Thanks for the heads-up. This is mentioned in the documentation (although I had to look for quite a while earlier today until I found it) and I have been using this feature ever since I started using beets. I've also implemented preservation of dict order by myself in some pyyaml-based parser before.

By the way, another problem with pyyaml is that it encourages adding parsers or representers for certain tags or Python objects to the global default loaders. ruamel.yaml additionally has a different API where you only need to modify your instance of a loader/dumper instead. (It also still supports the old pyyaml version for compatibility, however.)

Cool! Thanks for the extra details. That (avoiding global modifications to the loaders) does sound very attractive鈥攊t would be worth investigating a switch. I'm changing the title of this issue to make it about that.

Actually, it looks like pyyaml is finally getting some traction: https://github.com/yaml/pyyaml/pull/257 (also related https://github.com/yaml/pyyaml/issues/193, CHANGES). Of course, if you were using safe_load already, this particular issue doesn't affect you.
The global loader modifications are still there, however.

Initially the issue wasn't about prompting a switch, although I find it weird that pyyaml by default doesn't load !!omap. Because for me it does, although not into a dict (?).

In [1]: import yaml

In [3]: yaml.safe_load("""
   ...: !!omap
   ...: - this: is a
   ...: - map: but
   ...: - with: all keys
   ...: - in: order
   ...: """)
Out[3]: [('this', 'is a'), ('map', 'but'), ('with', 'all keys'), ('in', 'order')]

In [4]: from ruamel.yaml import YAML

In [5]: yaml2 = YAML()

In [7]: yaml2.load("""
   ...: !!omap
   ...: - this: is a
   ...: - map: but
   ...: - with: all keys
   ...: - in: order
   ...: """)
Out[7]: CommentedOrderedMap([('this', 'is a'), ('map', 'but'), ('with', 'all keys'), ('in', 'order')])

Also note that roundtrip loading (which preserves formatting and comments) can be disabled if you don't intend on writing the data again.

Cool; good to know. The lack of global state was what convinced me this was worth looking into, but I'm also intrigued to learn about the round-trip capability. We currently apply a few hacks to fake round-trippability; it would be nice to rely on a real implementation in the library...

With the current title this looks like a duplicate of #1702

Thanks! It does indeed.

Was this page helpful?
0 / 5 - 0 ratings