Pkp-lib: Upgrade to 3.0.0: pkey violation on review_rounds

Created on 25 Oct 2016  路  21Comments  路  Source: pkp/pkp-lib

Hello,

I am trying to upgrade an ojs 2.4.8 installation to 3.0.0. The upgrade process produces the following output:

[pre-install]
[load: upgrade.xml]
[version: 3.0.0.0]
[code: Installer Installer::syncSeriesAssocType]
[data: dbscripts/xml/upgrade/2.4.0_preupdate_usageStatistics.xml (skipped)]
[data: dbscripts/xml/upgrade/2.4.0_preupdate_usageStatistics2.xml]
[data: dbscripts/xml/upgrade/3.0.0_preupdate.xml]
[data: dbscripts/xml/upgrade/3.0.0_adaptTimedViews.xml (skipped)]
[data: dbscripts/xml/upgrade/3.0.0_adaptReferrals.xml]
[data: dbscripts/xml/upgrade/3.0.0_adaptBooksForReview.xml]
[data: dbscripts/xml/upgrade/remove_timed_views_bots.xml (skipped)]
[code: Installer Installer::migrateCounterPluginUsageStatistics (skipped)]
[code: Installer Installer::migrateTimedViewsUsageStatistics (skipped)]
[code: Installer Installer::migrateDefaultUsageStatistics (skipped)]
[data: dbscripts/xml/upgrade/2.4.0_postUsageStatsMigration.xml (skipped)]
[data: dbscripts/xml/upgrade/2.4.0_postUsageStatsMigration2.xml]
[data: dbscripts/xml/upgrade/2.4.0_postCounterMigration.xml (skipped)]
[data: dbscripts/xml/upgrade/2.4.0_postTimedViewsMigration.xml (skipped)]
[schema: lib/pkp/xml/schema/common.xml]
[schema: lib/pkp/xml/schema/log.xml]
[schema: lib/pkp/xml/schema/announcements.xml]
[schema: lib/pkp/xml/schema/scheduledTasks.xml]
[schema: lib/pkp/xml/schema/temporaryFiles.xml]
[schema: lib/pkp/xml/schema/metadata.xml]
[schema: lib/pkp/xml/schema/reviews.xml]
[schema: lib/pkp/xml/schema/reviewForms.xml]
[schema: lib/pkp/xml/schema/controlledVocab.xml]
[schema: lib/pkp/xml/schema/submissions.xml]
[schema: lib/pkp/xml/schema/submissionFiles.xml]
[schema: lib/pkp/xml/schema/notes.xml]
[schema: lib/pkp/xml/schema/views.xml]
[schema: lib/pkp/xml/schema/genres.xml]
[schema: lib/pkp/xml/schema/gifts.xml]
[schema: lib/pkp/xml/schema/mutex.xml]
[schema: lib/pkp/xml/schema/tombstone.xml]
[schema: lib/pkp/xml/schema/rolesAndUserGroups.xml]
[schema: lib/pkp/xml/schema/metrics.xml]
[schema: lib/pkp/xml/schema/views.xml]
[schema: lib/pkp/xml/schema/libraryFiles.xml]
[schema: dbscripts/xml/ojs_schema.xml]
[data: dbscripts/xml/indexes.xml]
[data: dbscripts/xml/upgrade/3.0.0_change_assoc_type.xml]
[data: dbscripts/xml/upgrade/3.0.0_change_assoc_type_metrics.xml]
[data: dbscripts/xml/upgrade/3.0.0_settings.xml]
[code: Installer Installer::migrateArticleMetadata]
[data: dbscripts/xml/upgrade/3.0.0_update.xml]
ERROR: Upgrade failed: DB: Duplicate entry '4-3-1' for key 'review_rounds_submission_id_stage_id_round_pkey'

The problem is that the entries are not unique at all regarding submission_id, stage_id and round.

# mysql ojs -e 'SELECT submission_id FROM review_rounds;' | sort | uniq | wc -l
1769
# mysql ojs -e 'SELECT submission_id,stage_id,round FROM review_rounds;' | sort | uniq  | wc -l
1787
# mysql ojs  -e 'SELECT submission_id FROM review_rounds;' | wc -l
35228

Am I right that the last output reveals some severe misshape in my review_rounds table? What went wrong and how can I fix this?

Thanks in advance,
Patrik

Most helpful comment

Hi all, for what it's worth we ran into this with a journal that we're upgrading at PKP as well. The issue was the same - duplicate entries in the review_rounds table. We bypassed this by deleting all but the oldest duplicate entries:

CREATE TABLE review_rounds_old SELECT * FROM review_rounds;
DELETE FROM review_rounds WHERE review_round_id NOT IN (SELECT MIN(review_round_id) FROM review_rounds_old GROUP BY submission_id, round);
DROP TABLE review_rounds_old;

We think this may have resulted from the original journal host using a different database system than MySQL. MySQL should have enforced unique entries in the review_rounds table, but the different DB engine may not have. Hope this helps.

All 21 comments

I fixed this by dropping all duplicates. The upgrade ran through but I am not sure if my fix is harmless or not.

Perhaps the duplicates were generated by a faulty migration script on our side. If you can rule out misbehaviour of ojs htis can be closed.

@pkel, I suspect it's the difference between 1769 and 1787 that's causing the problem. Could you take a look at those 18 records?

@asmecher, I have some details now.
First thing to observe is that there are 1769 submissions with review in our system. Few submissions had multiple review rounds. The number of second/third/... rounds is 18, which sounds valid. This explains the output of the two first commands.

Problematic is that we have >30000 entries differing only by review_round_id and thus containing not much information. They were probably added by a faulty script on migration from a different system to ojs 2.4.5.

An obvious question raises - how can there be thousands of duplicates in submission_id,stage_id,round while there is a unique index on exactly these columns?

stage_id was NULL for all entries and the upgrade to ojs3 sets it to 3. While multiple entries x,NULL,x are considered unique enough to not violate the index, mutliple entries of x,3,x are not possible. I consider this weird behaviour of mariadb, but probably there is a reason for that and some index configuration statement to avoid it.

Hmm, >30k entries differing only by review_round_id definitely sounds like a data problem to me. We haven't tested migrations on a particularly large scale (probably 10-20 with a decently wide variety of data histories), but I haven't encountered that yet. I suspect you're running into something particular; I'll definitely keep an eye out for it to appear elsewhere in the ecosystem.

Hi all, for what it's worth we ran into this with a journal that we're upgrading at PKP as well. The issue was the same - duplicate entries in the review_rounds table. We bypassed this by deleting all but the oldest duplicate entries:

CREATE TABLE review_rounds_old SELECT * FROM review_rounds;
DELETE FROM review_rounds WHERE review_round_id NOT IN (SELECT MIN(review_round_id) FROM review_rounds_old GROUP BY submission_id, round);
DROP TABLE review_rounds_old;

We think this may have resulted from the original journal host using a different database system than MySQL. MySQL should have enforced unique entries in the review_rounds table, but the different DB engine may not have. Hope this helps.

I had the same problem @jmacgreg. It is more common that we think. Maybe it is necessary to put in migration.xml file this rules to help others on migrations. I'm migration from version 2.4.5 to 3.0.2

@glaucomunsberg, thanks for the feedback -- I've scheduled this against the next OJS release (currently 3.1).

Same problem . I delete duplicate and pass.

Just reporting one more case... in two quite old jornals migrating from ojs 2.4.5 to 3.1.0-1.

This query returns the duplicated rows that need to be deleted:

    SELECT rr.* FROM review_rounds as rr WHERE rr.review_round_id 
        NOT IN (SELECT MIN(rrr.review_round_id) FROM review_rounds as rrr 
                GROUP BY rrr.submission_id, rrr.round);

Sorry @bozana and @asmecher to anyone again with this but I clean the duplicates in review_rounds running @jmacgreg queries before migration, and the upgrade still complains about duplicated entries in the review_rounds table.

I enabled debug and this is how upgrade.php exits:

-----<hr>
(mysqli): INSERT INTO review_round_files (submission_id, review_round_id, stage_id, file_id, revision) SELECT afm.article_id, rr.review_round_id, 3, afm.file_id, afm.revision FROM article_files_migration afm, articles_migration am, review_rounds rr WHERE am.revised_file_id = afm.file_id AND rr.submission_id = afm.article_id AND rr.round = afm.round 

-----<hr>
Query: INSERT INTO review_round_files (submission_id, review_round_id, stage_id, file_id, revision) SELECT afm.article_id, rr.review_round_id, 3, afm.file_id, afm.revision FROM article_files_migration afm, articles_migration am, review_rounds rr WHERE am.revised_file_id = afm.file_id AND rr.submission_id = afm.article_id AND rr.round = afm.round failed. Duplicate entry '373-48-452-1' for key 'review_round_files_pkey'
1062: Duplicate entry '373-48-452-1' for key 'review_round_files_pkey'
                                                        ADOConnection._Execute(INSERT INTO review_round_files (submission_id, review_round_id, stage_id, file_id, revision) SELECT afm.article_id, rr.review_ro..., false)% line 1051, file: /var/www/html/lib/pkp/lib/adodb/adodb.inc.php
                                                ADOConnection.Execute(INSERT INTO review_round_files (submission_id, review_round_id, stage_id, file_id, revision) SELECT afm.article_id, rr.review_ro...)% line  440, file: /var/www/html/lib/pkp/classes/install/Installer.inc.php
                                        Installer.executeSQL(INSERT INTO review_round_files (submission_id, review_round_id, stage_id, file_id, revision) SELECT afm.article_id, rr.review_ro...)% line  435, file: /var/www/html/lib/pkp/classes/install/Installer.inc.php
                                Installer.executeSQL(Array[111])% line  396, file: /var/www/html/lib/pkp/classes/install/Installer.inc.php
                        Installer.executeAction(Array[3])% line  265, file: /var/www/html/lib/pkp/classes/install/Installer.inc.php
ERROR: Upgrade failed: DB: Duplicate entry '373-48-452-1' for key 'review_round_files_pkey'

I double check this before migrate:

  1. Set journal in maintenance mode.
  2. Migrate from the original 2.4.5 DB from backup.
  3. Files are reachable: set permissions for public and private folders.
  4. Code is OJS 3.1.0-1

To be honest, I can put my hand on fire to ensure the DB is a clean 2.4.5... but I checked and it has the same number of tables than in a clean 2.4.x. Please, let me know if there is a way to be sure the DB is not half migrated.

I have been digging this during the last tree weeks and I can't find the way to migrate the journal.
If somebody has a clue, it will be really appreciated.

@marcbria, would it be possible for you to send me (bozana dot bokan at posteo dot net) the anonymized DB dump of your 2.4.x installation that you would like to migrate? -- I could then investigate...

Sure. Thanks a lot.

@marcbria, I found the problem here :-) It is a different one than that above: for some reason the revised_file_id and review_file_id were the same for the article 373. You can see it with this SQL query:
SELECT * FROM articles WHERE revised_file_id = review_file_id
I do not know how this could happen, maybe you can investigate that file with the id 452 -- is it an author revision file or a file that was send to the reviewer. You can then either fix it via the UI -- remove the file(s) and upload the correct ones, or fix it in the DB (e.g. I put NULL for revised_file_id in this case), but note, that the file is also in the files folder and in a appropriate sub-folder.
Because of that, because those two DB fields were the same, there was the problem when executing these two code lines in the upgrade script: https://github.com/pkp/ojs/blob/ojs-3_1_0-1/dbscripts/xml/upgrade/3.0.0_update.xml#L196-L197.
I did not get any further DB problem when testing the upgrade, but because I do not have the files folder, I cannot say if everything is OK with the upgrade in total -- because some upgrade functions work with the files folder and files movement that also has consequences in the DB tables... If you wish/needed, you can also send me your files folder and I can take a further look...

@marcbria, I took a look in the article_files table for that submission = 373 and it seems like there is no revision file there -- I think the revision files contain ...-ED in the file_name -- so maybe it is safe to set NULL for that field revised_file_id in the articles table...

Thanks a LOOOT bozana. :-)

This is one of the oldest DB in our sevice, with more than 10 years old so lots of things happen.
With your indications I can follow the thread and fix the issue.

I owe you a beer... I hope to meet you in Heidelberg. :-D

Yes, the old ladies (OJS installations) need special attention and care! :blush:
I am looking forward to meeting you in Heidelberg! :tada: :clap: :raised_hands: :blush:
And I am looking forward to having a beer with you there! :blush:

Yes, the old ladies (OJS installations) need special attention and care! blush

And they deserve it. ;-)

I'm always curious about the file OJS file syntax... is there an explanation somewhere?
For instance "373-48-452-1" means:

  • article ID: 373
  • file ID: 452
  • 48??
  • 1??

In the other hand, "review_file_id" is the file the reviewer send to editors/authors...
while "revised_file_id" is the file the author send with the suggested revisions isn't it?

maybe it is safe to set NULL for that field revised_file_id in the articles table

As Bozana suggested, setting revised_file_id to NULL fixed the issue and upgrade script finished it's job.
Reported as [solved] in pkp's forum.
Thanks a lot!

This issue is flagged against OJS 3.3 but the 2.x to 3.x upgrade process has been removed, so those still doing 2.x migrations will need to upgrade first to the newest OJS 3.2.x. I'm going to close this as a WONTFIX as it doesn't affect all users and there are workarounds documented here.

Was this page helpful?
0 / 5 - 0 ratings