Beets: Album merging/completion

Created on 28 Feb 2013 · 33 Comments · Source: beetbox/beets

This issue was automatically migrated from Google Code.
_Original author: [email protected] (April 29, 2012 18:26:17)_
_Original issue: https://github.com/google-code-export/beets/issues/380_

feature migrated

All 33 comments

+1
Probably the most important missing feature!

This feature could be included as a new option called "Merge" in the "duplicate album" prompt, alongside "Skip New", "Keep Both", and "Remove Old".

The merge can be done by setting the album_id for each of the tracks being imported to the id of the existing album, putting those tracks in the database, and not adding another album entry. One would have to provide a way to resolve conflicts such as duplicate tracks or conflicting mb_albumid values, but I think this is possible.

@sampsyo are there any obvious problems with this approach? If not I can start making this...:exclamation:

Yes, this sounds exactly right! Good plan, @udiboy1209. :sparkles:

Just to be clear: the combined album should probably be _re-tagged_ as a unit. That will (hopefully) resolve all the problems with conflicting information using the autotagger. The "as tracks" and "as albums" options show how this sort of thing can be done by creating a temporary pipeline.

This seems more complicated than I thought. Maybe because I am new to beets' internals.

Re-tagging the whole album altogether again is a good idea! But what would you do if autotagging is disabled?

Yep, it is a bit of a large project... let me know if you have questions about the architecture.

Fortunately, in non-autotagged mode, we don't do duplicate detection either, so this will never come up.

Sorry to be replying after such a long time! I was busy with something else!

So I have a few questions.

I added two functions to ImportTask class, one is

    def merge_with_duplicate(self, lib):
        # Merge into the first duplicate album found in the library.
        existing_album = self.find_duplicates(lib)[0]
        if self.merge_conflicts(existing_album):
            # Add existing album's items to current task to retag again
            # and remove the existing album
            self.items += existing_album.items()
            existing_album.remove()
        else:
            # No conflicts: attach the new items to the existing album
            # and import them as-is.
            for item in self.items:
                item.album_id = existing_album.id
            self.set_choice(action.ASIS)

The other is merge_conflicts, which returns true if the mb_albumid values conflict or if duplicate tracks exist.

The question is: is it safe to put existing_album.remove() here? Is it even the right way to do something like this?

This function will be called from a pipeline.stage function called merge_with_duplicate defined in importer.py.

Awesome! This definitely looks like the right direction.

Your suspicion was correct, though: we need to avoid doing the remove() call in this stage (presumably this is called as soon as the user decides to merge, right?). The goal is to have all the database manipulation stuff performed atomically in the apply_changes stage. This way, if something goes wrong, we don't end up with a database containing _zero_ of the parts of the albums: it should only be possible to end up in the "before" or "after" state, not in the middle.

As a template, you can see that the call to task.add() here:
https://github.com/sampsyo/beets/blob/master/beets/importer.py#L1039
atomically (inside a transaction) both removes old duplicate items, for re-imports, and adds the new items:
https://github.com/sampsyo/beets/blob/master/beets/importer.py#L536
I _think_ this new feature could also work by just making sure that remove_replaced finds all the old pieces of the album. Note that by removing all the items from the old album, we automatically remove the album itself.
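To illustrate the "before or after, never in the middle" property, here is a minimal sketch using Python's standard `sqlite3` module. It is not beets' importer code: the one-column `items` table and the helper names `replace_items_atomically` and `titles` are assumptions made up for this example. It only shows that a remove-and-add done inside one transaction rolls back completely when a step fails.

```python
import sqlite3


def replace_items_atomically(conn, old_ids, new_titles):
    """Remove old item rows and add new ones in a single transaction."""
    # Using the connection as a context manager commits on success and
    # rolls back if any statement inside the block raises.
    with conn:
        conn.executemany("DELETE FROM items WHERE id = ?",
                         [(i,) for i in old_ids])
        conn.executemany("INSERT INTO items (title) VALUES (?)",
                         [(t,) for t in new_titles])


def titles(conn):
    """Return all item titles in insertion order."""
    return [row[0] for row in
            conn.execute("SELECT title FROM items ORDER BY id")]


# Hypothetical stand-in schema, not the real beets library schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
with conn:
    conn.executemany("INSERT INTO items (title) VALUES (?)", [("A",), ("B",)])

# A failing merge (None violates NOT NULL) leaves the "before" state intact,
# even though the DELETE statements had already run.
try:
    replace_items_atomically(conn, [1, 2], ["A", None])
except sqlite3.IntegrityError:
    pass
before = titles(conn)

# A successful merge moves straight to the "after" state.
replace_items_atomically(conn, [1, 2], ["A", "B", "C"])
after = titles(conn)
```

The failed attempt leaves both original rows in place, which is the behaviour wanted from doing all database manipulation in one stage.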

Thanks again for looking into this, @udiboy1209!

Hmm.. Letting remove_replaced remove the old items might work just by adding the old items to the current task. The duplicate paths will be detected automatically. But will it also remove the old album entry in the library? How is this handled when re-importing albums?

Yeah, hopefully we can reuse this existing functionality!

Fortunately, the library takes care of cleaning up "empty" albums when the last constituent item is removed. Here: https://github.com/sampsyo/beets/blob/master/beets/library.py#L464

So this should mostly happen automatically. The only exception I can think of is this edge case:

  • The library contains an album with tracks A, B, and C.
  • The user imports items D and E and wants to merge the album.
  • The autotagger recognizes this as a different album, but track A is not present on the new matched album (a partial match). The album only has B, C, D, and E.
  • Then we will remove tracks B and C as duplicates and add back B, C, D, and E as a new album.
    This results in track A being left alone in its own album with the remaining tracks getting a new album alongside it.

I think this case is edge-y enough to ignore. In any case, it is probably a good idea to leave A in the library to prevent unintentional data loss!
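The edge case can be traced with plain set arithmetic. This is only a sketch of the bookkeeping described in the list above, with track letters standing in for items; `merge_by_replacement` is a hypothetical name, not a beets function.

```python
def merge_by_replacement(library_album, incoming, matched_album):
    """Model the merge as 'remove replaced items, then add the re-tagged task'."""
    # Old and new items are thrown together into one import task.
    task_items = library_album | incoming
    # Only tracks present on the matched release end up in the new album.
    new_album = task_items & matched_album
    # Old items that reappear in the new album are removed as duplicates.
    replaced = library_album & new_album
    # Old items that were never replaced stay where they were.
    left_behind = library_album - replaced
    return new_album, left_behind


new_album, left_behind = merge_by_replacement(
    library_album={"A", "B", "C"},
    incoming={"D", "E"},
    matched_album={"B", "C", "D", "E"},
)
```

Here `new_album` holds B, C, D, and E, while `left_behind` holds only A, matching the outcome described above.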

awesome software guys...greatly looking forward to this merge feature!

I run into this quite a bit! It would also be cool if the duplicates plugin offered a merge option as well.

+1 to this enhancement for beets

I have made quite a few mistakes using the ('Skip new', 'Keep both', 'Remove old') options at import time with mixed up albums and some files already in my library. It hasn't always been clear to me what is the best option to choose - perhaps just my stupidity or perhaps something other beets users have also found. In addition, I think I have deleted some files by mistake when doing import / clean up of existing files using "beet import -L" when using the 'Remove old' option.

I would like to see a new 'Details' option added. I think this could sit alongside the merge functions being described above? So the options would now be

('Skip new', 'Keep both', 'Remove old', 'Details')
Pushing 'D' for details would show the following output.

Skip will

  • ignore the 7 tracks in folder /Users/someuser/Downloads/Awesomesingers/First\ Album\ 320kbs
  • retain the 4 tracks in folder /Users/someuser/beetMUSIC/Awesomesinger/DebutAlbum

and so on for 'Keep Both' and 'Remove Old'.

The user query would then be repeated at the end but would this time have a 'More Details' option which would give the user the option of repeating the above schema but this time with each individual /path/track listed.

Sorry if that seems like overkill, but it would be good for importing and cleaning up some fragmented albums that I have following recovery of part of a hard disk.

:thumbsup: In the meantime, is there any workaround to allow us to import a song to an existing album?

You can always remove and re-import!

Cool, Thanks.

This feature would be awesome. Hope it ends up in beets at some point.

+1000. I think this is absolutely needed for the move configuration to be of any use.

Just came across this issue. This is addressed in 3be593693d3f7f60eb07e8445ec669fe9ce84c68, at the duplicates plugin level (so, not within the importer). Might still be of use, though.

From what I'm reading I can imagine the `edit` plugin might come in handy in the process of (optionally?) applying some final tidying edits after the merge (if chosen), since this might have undesired consequences as far as the consistency of tags is concerned.

If that sounds useful to you guys it should probably become a new issue, but for now it's only a thought.

Any news on this?

I'd like to see this feature in beets. This is a must-have kind of feature, I think.
For example, I have this partial album with tracks 1, 3, 7 and 8, but the rest are missing.
And let's say I have another partial album in a different directory with tracks 1, 2, 3, 4, 9, 10, 11.
Whilst importing, if beets were to pick up the missing tracks (and maybe check for quality and keep the better one), then I'd end up with 1, 2, 3, 4, 7, 8, 9, 10, 11.
Now tracks 5 and 6 would still be missing, but that would be OK. This is still better than having less.

Currently, if I have even 1 track, beets skips the import, saying I already have the album. So maybe adding an Import Missing tracks feature would be wise as well.
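As a rough sketch of what such an "import missing tracks" step would compute for the example above (plain Python, not beets code): `plan_missing_import` is a made-up name, and the album size of 11 is an assumption inferred from the track numbers mentioned.

```python
def plan_missing_import(existing, incoming, album_size):
    """Work out which incoming tracks fill gaps and which gaps remain."""
    have = set(existing)
    new = set(incoming)
    to_import = sorted(new - have)        # incoming tracks not yet in the library
    merged = sorted(have | new)           # the album after the merge
    still_missing = sorted(set(range(1, album_size + 1)) - (have | new))
    return to_import, merged, still_missing


to_import, merged, still_missing = plan_missing_import(
    existing=[1, 3, 7, 8],
    incoming=[1, 2, 3, 4, 9, 10, 11],
    album_size=11,  # assumption: the release has 11 tracks
)
```

Tracks 1 and 3 appear in both sets, so they are the ones where a quality comparison would be needed before deciding which copy to keep.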

If you're interested in a feature, please consider helping implement it!

Also, with regard to @nomadturk's comment:

Let's say that an album is supposed to have 6 tracks.
If I have 1, 3, and 5 currently, and I import 4, 5, and 6, then a resolution needs to do two things.

First, it has to merge the tracks that only exist in one or the other sets, so 1, 3, 4, and 6, I DEFINITELY want to keep.

After that, though, I'd want to compare 5. Are they different bit rates? Is one a better musicbrainz match? Different lengths?

So I'd want merge resolution to be two stage: union the album as a whole, and then deal with duplicate tracks individually.
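A minimal sketch of that two-stage idea, assuming tracks can be keyed by track number: the `Track` class, `split_merge`, and the bitrate values are all hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    number: int
    bitrate: int  # kbps, illustrative values only
    source: str   # "library" or "import"


def split_merge(existing, incoming):
    """Stage 1: union by track number. Stage 2: report per-track conflicts."""
    by_number = {}
    for track in list(existing) + list(incoming):
        by_number.setdefault(track.number, []).append(track)
    # Tracks that exist in only one set are kept unconditionally.
    keep = {n: c[0] for n, c in by_number.items() if len(c) == 1}
    # Tracks present in both sets need an individual decision.
    conflicts = {n: c for n, c in by_number.items() if len(c) > 1}
    return keep, conflicts


existing = [Track(1, 192, "library"), Track(3, 192, "library"),
            Track(5, 128, "library")]
incoming = [Track(4, 320, "import"), Track(5, 320, "import"),
            Track(6, 320, "import")]
keep, conflicts = split_merge(existing, incoming)
```

With these inputs, tracks 1, 3, 4, and 6 land in `keep`, and `conflicts` holds both copies of track 5 so they can be compared on bitrate, match quality, or length.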

The way this would work according to the current proposal is to reuse the existing importer logic. That means both sets of tracks would get thrown together into one big basket, and that would then be re-tagged as the same album.

If you have duplicate tracks, then the "best" one will be picked according to match similarity. Interactively resolving track duplicates would be the purview of something like #154.

Personally, I'd rather NOT automate the "best" pick...

And editing, à la #154, is not the goal... Being able to make a track-by-track, non-editing comparison and choose one, the other, or both is preferable.

Right. I'm just saying that manually choosing resolutions like this is actually orthogonal: it can come up in ordinary imports too if you happen to have too many tracks in a folder.

We actually have the "edit option" for editing metadata. (See the edit plugin.) The reason I bring up #154 is because the original intention there was changing the order (i.e., the matching) of tracks, which would also require dealing with unmatched/duplicate tracks.

Gotcha, ok.

Hi, has any progress been made on this feature? I would be willing to help implement it, but I don't want to duplicate work or step on any toes. Thanks in advance.

I had tried a few things, but none were particularly useful. Also, it was three years ago and the codebase may have changed a lot since then... So you can start afresh on this feature.

Incredible! Looks like I can get back to using beets (something I'm very glad to be able to).

I'm wondering: how about release groups? They also appear in MusicBrainz and group albums that are basically reissues of one another (stuff like Harmonia Mundi Gold). Could it be possible to extend what has just been implemented with releases to release groups? In many cases you do not want to have a duplicate of the music just because one is the original version of 1987 and the other one is the reissue of 2010 because now it is a legendary recording or because one wants to make money again with it (I'm throwing numbers at random, but there are countless examples). Some people might prefer to keep them separate for the sake of completeness, but I believe this is a minority. So far what I did was to use beet dup and set the mb_releasegroupid as a key but it is not very satisfying because the album gets sometimes cut into two complementary folders. I don't know if it is a problem many people experience, but it seems to be quite similar to merging albums.

And thanks for the awesome work!

With “merge all”, it appears that if you attempt to import a song that is already in the album you are merging, beets will rename the song to a totally different song randomly and incorrectly. You will think you have a song when in fact it is a duplicate of another song.

For example:

Step 1: Import Track 01 and 02 of Album A (which has 5 songs total). Beets imports the two songs into one album correctly.
Step 2: Import Track 03 of Album A. Beets offers you the “merge all” option to combine with the 2 songs imported in Step 1. Beets correctly imports Track 03 and combines it into Album A created in Step 1. There are now 3 of 5 songs of Album A in the library.
Step 3: Import Tracks 03 and 04 (note 03 is already in the library and 04 is new). Beets offers you the “merge all” option. You select “merge all” and beets renames the Track 03 you are importing to Track 05 incorrectly and imports track 04 correctly. Now beets thinks you have a complete Album of all 5 songs in Album A, when in fact songs 03 and 05 are the same song.

Indeed; the current feature combines the tracks; it does not try to deduplicate them.
