Openlibrary: Creating Open Library records should trigger "already exist" checks

Created on 5 Jun 2018 · 7 Comments · Source: internetarchive/openlibrary

Right now there are several ways to POST / create things on OpenLibrary.

ImportBot, Add Book, and POSTing to /edit may each produce different output and run different (or no) checks to see whether Authors, Works, and Editions already exist.

This issue proposes requiring that every method of record creation be routed through the same checks (a single point of truth), and that a record not be created if it is presumed to be a duplicate (i.e. return {"warning": "already exists", "match": "OL...W"}).

With this change to the API, Add Book, ImportBot, and any other author, edition, or work POST / create must run and obey these checks.

This change may provide an override param to either create or update an Author, Work, Edition, etc., if the creator is reasonably sure what they want to do. If the initial request results in (e.g.) a warning, one might add override=update to force an update or override=create to force creation of a new record.
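A minimal sketch of how a single entry point with an override param might behave. It assumes an in-memory store and a naive title match; the names `CATALOG`, `find_match`, and `create_record` are hypothetical and are not Open Library's actual API.

```python
from typing import Dict, Optional

# Hypothetical in-memory catalog (key -> record), standing in for the real datastore.
CATALOG: Dict[str, dict] = {}

# Key suffixes by record type (OL...W works, OL...M editions, OL...A authors).
SUFFIX = {"work": "W", "edition": "M", "author": "A"}


def find_match(record: dict) -> Optional[str]:
    """Return the key of an existing record with the same type and title, if any.

    Placeholder logic for illustration only; real matching would be more involved.
    """
    for key, existing in CATALOG.items():
        same_type = existing["type"] == record["type"]
        same_title = existing["title"].lower() == record["title"].lower()
        if same_type and same_title:
            return key
    return None


def create_record(record: dict, override: Optional[str] = None) -> dict:
    """Single creation path: always run the duplicate check before writing."""
    if override not in (None, "create", "update"):
        raise ValueError("override must be 'create' or 'update'")

    match = find_match(record)
    if match is not None and override is None:
        # Presumed duplicate: refuse to write and report the match.
        return {"warning": "already exists", "match": match}
    if match is not None and override == "update":
        CATALOG[match].update(record)
        return {"updated": match}

    # No match found, or the caller explicitly asked for a new record.
    key = f"OL{len(CATALOG) + 1}{SUFFIX[record['type']]}"
    CATALOG[key] = dict(record)
    return {"created": key}
```

In this sketch, a second request for the same title returns the warning, and the caller retries with override="update" or override="create" to state what they intended.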

Librarians Import 1 Bug

All 7 comments

Additionally, a ~simulate~ (let's call it preview, per the suggestion below) param would be nice, to test the results of a request without actually creating anything.
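One way such a preview param could work, as a rough sketch: run the same check and report the outcome, but skip the write. The function `submit`, its `preview` flag, the response fields, and the exact-title check are all assumptions made for illustration.

```python
from typing import Dict


def submit(record: dict, store: Dict[str, dict], preview: bool = False) -> dict:
    """Report what a request would do; only write to the store when preview is False."""
    # Placeholder duplicate check: exact title match (hypothetical).
    match = next(
        (key for key, existing in store.items()
         if existing.get("title") == record.get("title")),
        None,
    )
    if match is not None:
        return {"warning": "already exists", "match": match, "preview": preview}

    key = f"OL{len(store) + 1}W"
    if not preview:
        store[key] = dict(record)  # the only line that mutates the store
    return {"action": "create", "key": key, "preview": preview}
```

Keeping the write behind a single guarded line means the preview and the real request share every check, so a preview cannot drift from what the real call would do.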

Similar issue for authors: #756

Rather than "simulate", users will be more used to "preview", but yes, there's much to be gained by doing this.

@hornc is this fixed / can this be closed?

Current import APIs use catalog.add_book https://github.com/internetarchive/openlibrary/blob/master/openlibrary/catalog/add_book/__init__.py

which checks whether an edition is already in the system, and will add it, along with a Work and Author, if they do not already exist.

Currently the UI add book uses a different path. I thought we had an issue to deal with this but I can't find it right now. The closest I found is #1163

@hornc I added the backlogged and refactor labels to this issue based on the thread. Let me know your thoughts.

As in the comment above, the current import APIs use catalog.add_book https://github.com/internetarchive/openlibrary/blob/master/openlibrary/catalog/add_book/__init__.py

which checks whether an edition is already in the system, and will add it, along with a Work and Author, if they do not already exist.

The add book UI also checks for existing editions:
https://github.com/internetarchive/openlibrary/blob/1aa081278c389cc9dce1c5a83c1eba62c98e013a/openlibrary/plugins/upstream/addbook.py#L261

At some point these should be combined as part of a greater refactor, but the checks exist.
