Pkp-lib: [OJS] Versioning for published articles

Created on 8 Dec 2016  ·  89 Comments  ·  Source: pkp/pkp-lib

Hi @asmecher, @NateWr, @bozana and everybody interested :)

I am currently finishing the work my colleague @felixgruender did to enable versioning of published articles and files in OJS. Because this is a big topic with many aspects to consider and a lot of code, I separated the task into three steps and wrote descriptions on my GitHub wiki.

  1. file versioning (display file versions at article page) description
  2. article metadata versioning (allow user to create versions of the metadata) description
  3. connection of article metadata versions with file versions. (work in progress)

There could be more steps in the future. For example, to make the versioning work with galley DOIs, I am thinking about adding versioning for galleys (the file versioning would then be replaced). But this should have no effect on my ideas for the last step, and I would like to discuss these with you.

The original solution adds the versioning at the metadata tab and connects files with the article metadata version during the file upload. But the concept in OJS changed: there are now at least three places that need to be changed to display version information (metadata and identifiers, schedule for publication, galley files) and there is no file metadata page anymore.

To simplify the GUI I would like to add a central overview of the existing versions and put all versioning actions in one place. This could be a grid that lists all versions of the article. The following information could be displayed per version:

  • (optional) information about the version (change log, creator, etc.)
  • article metadata (done), identifiers (todo) and publication (publication date, etc.) (done)
  • galley files (todo: grid would need update to display only files of this version)

img_0250

The original places of article metadata and identifiers, schedule for publication and galley files could display only the info of the current version. A link could lead to the versions grid to get information about old versions.

One big question is where to put the grid in the backend. I see the following options (none of them perfect):

  1. at the top level beside metadata (does not really fit with metadata)
  2. as a new workflow step after production (is versioning logically a new step?)
  3. at the workflow step production above galleys (maybe too much info on the page)

Any other ideas?

Conclusion:

Major Feature

Most helpful comment

Hi everybody, I created pull requests to add the basic versioning functionality to the master branch of OJS: https://github.com/pkp/ojs/pull/2103 and https://github.com/pkp/pkp-lib/pull/4007.

There are still some bugs (see below) but I would suggest merging the current state to prevent me from rebasing for the hundredth time. :weary: :wink: I will work on the fixes next week, and would be very happy if somebody else could have a look at some as well. :angel:

This is working now:

  • changing the database structure for versioning via tools/upgrade.php upgrade
  • you can create new versions via button at the backend (production stage)
  • authors and galleys are automatically copied to the new version (displayed in a new tab)
  • you can change the metadata of the new version (not the old ones) and publish the new version when ready
  • display of the versions in the frontend

These bugs need to be defeated:

  • [x] new submission wizard seems to have some problem with a version field in a where clause
  • [x] authors strangely lose their first and last names when new versions are created
  • [x] adding and deleting galleys is not yet working
  • [ ] versions are displayed at the submissions grid but there should only be articles
  • [x] locale ##submission.production.notPublished## does not resolve

All 89 comments

Hi @lilients, thanks for posting about this.

I think you're on the right track, raising concerns about the way the GUI for this data is split up into three places. I think this causes some unique problems and I'd like to explore that a bit more to see if we can find a more robust way of guiding the user through this process.

My main concern is that we're asking the user to make a decision about versioning in a series of separate save operations. Go to metadata, make changes, decide whether they should be versioned and then click save. Go to galleys, make changes, decide whether they should be versioned and then click save. And so on.

I think this is an approach that's going to be prone to human error. And it will also inject additional cognitive overload to the initial publication process, which may be used for a minority of articles.

I'd like to see a UX approach in which the editor is automatically versioning by default, and has to go out of their way _not_ to version changes. Here's my first take on what that could look like.

  1. The journal opts into versioning under Settings > Distribution > Versioning. Once they do so, the following workflow steps occur by default.

  2. An editor schedules a publication normally. When that issue is published, OJS automatically creates a version of that article.

  3. From the date of publication, a new alert message appears at the top of the submission workflow advising the editor that they are editing a new version of the article. If they wish to make changes to a previous version of the article, they must go to a separate UI.

  4. Any changes the editor makes after publication are saved to a new version automatically. These changes are not reflected on the frontend until the editor publishes an updated version.

  5. To publish an updated version, the editor clicks a button under the Publication stage, "Publish New Version". This launches a UI which shows them what has changed and advises them they will be publishing a new public version. They click Publish New Version to confirm their decision.
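The five steps above can be read as a small state model: edits made after publication accumulate in an unpublished draft, and only an explicit publish action turns that draft into a new public version. The following is a minimal sketch of that idea, not OJS code; the class `ArticleVersions` and its methods are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ArticleVersions:
    """Hypothetical in-memory model of the proposed workflow (not OJS code)."""

    published: List[Dict[str, str]] = field(default_factory=list)
    draft: Optional[Dict[str, str]] = None

    def edit(self, **changes: str) -> None:
        # Step 4: changes never touch a published version; they are
        # collected in a draft that starts as a copy of the latest one.
        if self.draft is None:
            base = self.published[-1] if self.published else {}
            self.draft = dict(base)
        self.draft.update(changes)

    def publish(self) -> int:
        # Steps 2 and 5: publishing freezes the draft as a new version
        # and returns its version number (1-based).
        if self.draft is None:
            raise ValueError("nothing to publish")
        self.published.append(self.draft)
        self.draft = None
        return len(self.published)

    def current(self) -> Dict[str, str]:
        # The frontend only ever shows the latest published version.
        return self.published[-1]


article = ArticleVersions()
article.edit(title="Original title")
first = article.publish()

article.edit(title="Corrected title")
visible_before = article.current()["title"]  # draft is not public yet
second = article.publish()
visible_after = article.current()["title"]
```

In this sketch the editor has to go out of their way to change a published version, because `edit` has no path that writes to `published` directly.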

If an editor wishes to modify a published version without actually creating a new version, they must click on a new button in the Publication stage: "Edit Published Versions". This will launch the UI you've sketched out @lilients.

The editor can click on a prior version to load that version in the Submissions workflow. When this version is being edited, a large banner notification will appear at the top which informs the editor they are editing a previously published version and that any changes they make will be displayed immediately.

Does this sound like an approach that isn't a nightmare to handle on the backend? Is the workflow clear or would anyone benefit from mocked-up screenshots?


To address some of your other specifics:

URLs

I like the URL approach. I think the "current" version should _also_ be available at .../article/version/[articleId]/[version] so that people who want can make permanent links to a version. And canonical link tags should be used to point to the current version at all times.

Article Page

I'd like to see a small note beside the Published date, like:

Published
Jan 21, 2017
_This article has been revised. View all versions._

That could be just an anchor link that jumps to a section below, as you outlined, but showing the date rather than the title:

image of Version History

I think only the currently-viewed version's galleys should be displayed at any one time.

Finally, when any version other than the current version is being viewed, a banner notification should indicate that an out-of-date version is being viewed. Something like "This article has been updated. View the latest version".


Hope this is helpful and not wildly out-of-scope. I know there's been some discussion about revisions within the team, and it's a really tough issue.

I'd like to see a UX approach in which the editor is automatically versioning by default, and has to go out of their way not to version changes.

I support this and all of your specific comments at the end. This is how I would imagine versioning of metadata and galleys to look like, both from a reader's and from an editor's point of view.

Hi @NateWr,

thank you for your reply and your concrete feedback. I also think creating a new version by default is a good idea. But I found it hard to explain to my colleagues and I fear it is confusing for users who only want to have a look at the current state and do not want to create a new version.

I understand your sketched workflow and think it is a very good approach - I especially like the idea of being able to edit a version and making it available later. This simplifies the connection of metadata and files a lot.

If an editor wishes to modify a published version without actually creating a new version, they must click on a new button in the Publication stage: "Edit Published Versions". This will launch the UI you've sketched out @lilients.
The editor can click on a prior version to load that version in the Submissions workflow.

I fear that users will not understand this - different information at the same place (production stage with information about different versions). I now also see a similar problem with the grid I sketched. The same information will be displayed at different places (e.g. galleys at the production tab vs. galleys in the grid and metadata at the top level and metadata at the grid). And there is always the metadata tab that is kind of detached from the version info.

The solution I see now would be similar to the review rounds at the workflow stage "review". Each version would be displayed at its own tab. Files and metadata can be accessed and changed at one place. The metadata link at the top level could be hidden or it could display the metadata of the current version with a link to the old ones. The discussion could be used for the whole stage as in the review stage.

img_0257

Currently only the recent version can be edited. Do you think that is ok, or would the users want to change all versions?

Thank you for your feedback on the URL structure as well. This might get tricky once there will be files connected with the article version. But we can discuss that later on. :)

I'd like to see a small note beside the Published date

Thank you very much for this idea. I think it fits much better.

I think only the currently-viewed version's galleys should be displayed at any one time.

Yes, the file versioning might be a lost cause. ;)

Hope this is helpful and not wildly out-of-scope. I know there's been some discussion about revisions within the team, and it's a really tough issue.

Yes, thank you very much for your thoughts and ideas! It helped a lot. We did have a lot of discussion. But I think it is needed and is getting better every time. :)

Conclusion

  • [ ] Would a tab view similar to review round be a solution to display versions at the production stage?
  • [ ] Should we allow to edit all old versions or only the recent one?

Hi @lilients, I fear that your approach will lead to the wrong behaviour - if I can change files and metadata (of the most recent version) easily but have to press a button to create a "new version", people might just update their most recent version of that article instead of creating a new version. But in most cases creating a new version is the correct approach - this is why adding versioning to OJS/OMP is so extremely important. The software should make it as easy as possible to do the right thing. Changing already published articles is not an option that should be encouraged by the software.

We should not fear confronting users with truths like these. Publishing is not a game, and journal editors need to know what they are doing. So I'd rather see an approach that redirects efforts to edit files and/or metadata of published versions to the "new version" option. It should be hard, not easy, to change something after publication date. (@asmecher, @stranack : this will also need good documentation).

(Maybe it would be a good idea to also introduce a standardized "corrections" feature for minor corrections that change something in the article without creating a new version, like a correction notice for an image or reference that someone forgot to include during production.)

Maybe we could support a MediaWiki-style "this is a minor revision" distinction that would lead to the revision being tracked, but receiving lesser consideration? I'm thinking about e.g. whitespace removals, typo fixes, etc. that are perhaps important to track but definitely shouldn't receive full "revision" status.

Hi Alec, but most journals adhering to more formal publication ethics will not consider these changes "minor" - you will want to avoid a situation where two people linking to the same article (version), using the same DOI etc., are not talking about the same work. This is why I think versioning is so important, so that journals can update/correct articles without acting against publication standards. But the most important thing is to educate users so that errors are dealt with before publishing the first version. We might not always be happy with the way things are at the moment, but there is a difference between scholarly publishing and less formal ways of publishing on the web. And frankly, I definitely want to be able to link to stable versions of articles that are not edited after publication date.

Adding something like "Corrections" might be a good compromise for minor editing/minor versions. But even then we would need something standardized for displaying the corrections. This would IMHO not include changes to metadata or galleys, the corrections would be something like an add-on to the published galleys.

How about if we were to track all revisions (including minor ones), thus linking would be stable, and expose all revisions of published content -- but allow for expand/collapse of changes that are flagged "minor"?

Hi everybody,

I would not differentiate between different types of corrections. I concur with @mtub that _every_ change should be a new version. Even a typo could be quoted and would lead to wrong quotes if corrected.

I think we could add the default feature to my last proposal. After the publication of a version a new tab will be created by default, so that users will edit a new version. To view/edit old versions they would need to click on the previous tabs.

@stranack, your input on this would also be valuable!

Great discussion. It looks like we've got two things which feel technically similar but may have different conceptual bases or use-cases. I'll try to outline those to see if we can spark some further thinking along these lines.

  1. Journals want to make a significant revision to an article, one which should be prominently displayed to end users and may include substantial changes to key article content and metadata. (Thinking about OMP, this seems even more conventional for monograph editions.)

  2. Journals want to make minor revisions to an article, which might include typographical errors or updates to metadata (like keywords). These are consequential and should be tracked. But journals may not wish to make a big public notice about them.

The workflow I sketched out was definitely directed at the first use-case. The idea was that the second use-case would be filled by editing versions, and these could or could not be tracked.

I wonder if it makes sense to consider different technical solutions for each of these cases. To me, the first use-case sounds like we want a separate copy of the article in the database. Something so that we can present different article representations on the frontend and in the workflow. But the second use-case sounds like we just want to diff changes.

Imagine how amazing it would be if we had a full textual representation of an article version, and we could just diff it for minor changes? We could then store an audit log of minor changes with minimal impact on the database, and we'd be able to use existing libraries for working with diffs.

Of course, we're not yet at full textual representation. But how close is our XML exporter to being able to produce something resembling an XML representation of an article version that could be diffed?

That wouldn't pick up differences between article galley files. But it could register new file paths, thus recording the fact that a change occurred. If this were combined with preserving the former files and preventing overwrites, it could be an effective audit log.
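To make the diff idea concrete, here is a minimal sketch using Python's standard `difflib`. It assumes two textual (XML-like) representations of an article version already exist; the sample XML and the function name `diff_versions` are made up for illustration and do not reflect the actual OJS export format.

```python
import difflib
from typing import List


def diff_versions(old_xml: str, new_xml: str) -> List[str]:
    """Return a unified diff between two textual article representations."""
    return list(
        difflib.unified_diff(
            old_xml.splitlines(),
            new_xml.splitlines(),
            fromfile="version-1",
            tofile="version-2",
            lineterm="",
        )
    )


# Hypothetical, heavily simplified article representations.
old = "\n".join([
    "<article>",
    "  <title>Versoining in OJS</title>",
    "  <keyword>publishing</keyword>",
    "</article>",
])
new = old.replace("Versoining", "Versioning")

changes = diff_versions(old, new)
for line in changes:
    print(line)
```

Only the changed lines (plus a little context) would need to be stored, which is what would keep an audit log of minor changes small.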

Thank you @NateWr. Yes, I think the two use cases make sense.

  1. The article metadata versioning should be a solution for the first use case. Metadata like the title, authors, abstract, etc. will be stored for each version. Each article metadata version and each attached file should also have a different DOI (I will implement that soon). (I am currently working on the connection of the metadata version with the file. The sketched workflow should solve this problem.)

  2. Minor changes could be allowed at the recent version and would mean the editor overwrites data without creating a version - without creating new DOIs. This is very problematic and there should be warnings. But I think it should be allowed for (the recently) published version/s.

I like the idea of displaying diff changes and marking significant revisions of an article but I think those are additional features to the use cases you described. Unfortunately my timeframe is limited and I need to finish the versioning as well as possible as quickly as possible. But I think we could add both features later on:

  • A first step to the diff display might be the display of the file history I already implemented (A file versioning is already implemented in OJS - all uploaded files and versions are stored at the server. The addition simply displays previous versions of files at the article page). Later on a new feature could compare those file versions and display diffs. I think the file history could be displayed in addition to the article versioning - Maybe @NateWr has some ideas how.

  • The article metadata versions could be given tags like "major" or "minor". This way OJS could handle the versions differently. For example major versions could be displayed at the home page of the journal. But I think both should get their own version of article metadata - and especially DOIs. So this would be an addition to the use case 1 and not another implementation.

I will try to describe the planned concept of versioning in OJS and store the decisions of this discussion here:
https://github.com/lilients/ojs/wiki/Versioning-in-OJS

For now I would concentrate on getting the first use case finished. I need to display those article metadata versions at the backend and connect them with the file/galley versions. Any feedback on the tab view of versions at the production stage? :)

I totally get that you've got a deadline on this. We'll probably have to go through a slower development process to make sure we've got things right before rolling it out to everyone. Are you building this as a plugin for now?

Any feedback on the tab view of versions at the production stage? :)

I'm a little wary of the sub-tab approach, mostly because of the problem whereby title and metadata live outside of these tabs. But I think this is the result of a more general problem we have where we don't really have a clear presentation of a published object.

When published, an article just kind of sits in its workflow. And what constitutes the published object is pulled together from various things scattered around the page (metadata, galleys, participants, issue assignment).

I think that a UI for versioning would probably become a lot clearer to us if we had a view of a published object. We've talked about this as a final publishing step before: presenting a kind of confirmation view of all article core data, metadata, galleys, authors, DOI, and issue assignment with a big button that says, "yes, publish all of this I'm looking at right now".

This is probably out-of-scope for what you need to do within the time you've got. But it might be something for us to consider as a target for versioning down the road.

In the meantime, I don't really have a better idea than the tabs you present. No matter where the versioning actions are placed, they're going to be alienated from some of the customary tools for managing them.

I think it is a very good approach to think things through before implementing. ;) And I appreciate your feedback a lot!

Felix started the versioning at the core - not as a plugin. You can have a look at the code here: https://github.com/lilients/ojs (most relevant for the article metadata versioning is this commit https://github.com/lilients/ojs/commit/dd0757060a3ee0703e7410fc6a07cf7690e0c831). I am always rebasing to prevent divergence. I hope to get to a point soon where I can create a pull request.

I came to the same conclusion as you: There is no ideal way of presenting versions because of the scattered elements of a published article. But I hope we can find a solution that is usable and can easily be ported to future concepts.

The idea of presenting a published article with all published data in one place sounds interesting. Would that be part of the workflow or outside?

I think the versioning should be part of the workflow. It might be useful to be able to use some workflow elements - like storage of files and discussion - for new versions.

I tried to put all metadata belonging to a version inside the version tab. So this would be a step into the direction of presenting all data of published articles in one place. And maybe it could be ported to a future concept...

Another way to implement versioning could be a new workflow step after production. But the problem with the metadata at the top level would remain and there would also be duplication with the galleys.

So for now I would go with the tab solution inside the production stage.

Hi @NateWr and everybody interested,

I implemented the tab view at the workflow tab production. It looks like this:

img_0257

Now I have to add the functionality to create a new article version. I see different approaches at the GUI and would like your advice.

Open question 1: create new version by default or with a button?

Option 1: create new version by default when scheduled for publication

This is what you see before scheduling for publication:

Afterwards there will be a new unpublished version.

Might be confusing for users - especially if they do not want to create a second version and there will never be a second version, then there will always be two tabs - one unused.

Option 2: button "new version"

img_0260

This way only the current version is displayed. To create a new version users would need to click the button "new version".

Open question 2: allow editing old versions or only the current one or none at all?

Option 1: Allow changes at published versions but add an alert note when user wants to do so.

Option 2: Disable changes when a version has been published. This way users would be forced to create a new version instead of editing a published one.

I would prefer the button solution with no option of changing published versions. I think that would be less confusing and still preventing the user from changing already published stuff.

What do you think? Do you see other options?

Looking forward to your feedback and ideas.

Best,
Svantje

Hi Svantje,

if published versions cannot be edited (Question 2 - Option 2), the button solution for creating new versions (Question 1 - Option 2) is fine, I think.

Hi @asmecher, @bozana, @NateWr , @mtub and everybody interested,

as I mentioned before I think we need to add a versioning for galleys as well. Mainly because there are DOIs attached to galleys and statistics uses them. I would like to check my approach with you before implementing.

Currently there are versions of submission metadata stored in submission_settings like this:
submission_id, submission_revision, .... Functions that access this table have been adapted, as well as export and import functions. On the todo list are DOIs at article level.

Currently files are connected with galleys (1-1) and do have DOIs assigned to them. The galleys are now attached to submission versions. As of now, a new galley is created automatically for each submission version. But this means that there is no connection between the galleys. This enables different DOIs for each version but seems to be a problem for the statistics. So I think it would be convenient to add versioning for galleys in parallel with the versioning at submission level. This would look like this:

  • The tables galleys and galley_settings would get a new column "galley_revision" and the primary key would change to "galley_id" + "galley_revision"
  • DOIs and publisher ids would be stored for galley versions - maybe the galley version could be added as variable to the DOI syntax
  • Statistics would cumulate the access to galley versions based on the galley id
  • Export and import need to be updated to cope with galley versions
  • Files would be automatically copied into the new galley version and are getting a new file id (so no connection between "file version" here - this means the file versioning wouldn't be useful at the article page anymore, but maybe we do not need it ... if we do, it could be reimplemented at galley level in future)
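The statistics point in the list above (cumulating access across galley versions by galley id) could be sketched as follows. This is only an illustration under assumed inputs: the tuple layout `(galley_id, galley_revision, views)` and the function name `cumulate_views` are hypothetical and not part of the OJS statistics code.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def cumulate_views(hits: Iterable[Tuple[int, int, int]]) -> Dict[int, int]:
    """Sum view counts per galley_id, ignoring the galley revision."""
    totals: Counter = Counter()
    for galley_id, _galley_revision, views in hits:
        totals[galley_id] += views
    return dict(totals)


# Hypothetical per-revision counts: (galley_id, galley_revision, views)
hits = [(10, 1, 120), (10, 2, 30), (11, 1, 7)]
totals = cumulate_views(hits)
print(totals)
```

Because the galley id stays stable across revisions in this proposal, the per-revision numbers remain available while the totals still add up per galley.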

Is this ok with you? Do you see problems or aspects I missed?

Thanks for your thoughts!

Best,
Svantje

Hi everybody,

I refined the database structure analogous to Felix's versioning of submissions (storing info that needs to be versioned in settings):

submission_galleys (all information that will not be versioned):

  • galley_id, submission_id, locale, label, seq, is_approved (leftover from OMP?) - moved: file_id, remote_url

submission_galley_settings (information that will be versioned):

  • galley_id, version (new), locale (seems to be double - do we need it?), setting_name (doi, publisher_id, new: remote_url), setting_value, setting_type

submission_galley_files (new table to connect galley versions with files; it does not fit in submission_galley_settings and cannot stay in submission_galleys because it needs to be versioned):

  • galley_id, version, file_id
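As a rough illustration of the three tables listed above, the sketch below creates them in an in-memory SQLite database and looks up the file attached to a given galley version. The column types, the primary keys, and the helper `galley_file` are assumptions made for this example; the real OJS schema may differ.

```python
import sqlite3
from typing import Optional

SCHEMA = """
CREATE TABLE submission_galleys (          -- not versioned
    galley_id     INTEGER PRIMARY KEY,
    submission_id INTEGER NOT NULL,
    locale        TEXT,
    label         TEXT,
    seq           REAL,
    is_approved   INTEGER
);
CREATE TABLE submission_galley_settings (  -- versioned settings
    galley_id     INTEGER NOT NULL,
    version       INTEGER NOT NULL,
    locale        TEXT NOT NULL DEFAULT '',
    setting_name  TEXT NOT NULL,           -- e.g. doi, publisher_id, remote_url
    setting_value TEXT,
    setting_type  TEXT NOT NULL,
    PRIMARY KEY (galley_id, version, locale, setting_name)
);
CREATE TABLE submission_galley_files (     -- galley version -> file
    galley_id INTEGER NOT NULL,
    version   INTEGER NOT NULL,
    file_id   INTEGER NOT NULL,
    PRIMARY KEY (galley_id, version)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical sample data: one galley with a different file per version.
conn.execute("INSERT INTO submission_galleys VALUES (1, 42, 'en_US', 'PDF', 0, 1)")
conn.executemany(
    "INSERT INTO submission_galley_files VALUES (?, ?, ?)",
    [(1, 1, 500), (1, 2, 501)],
)


def galley_file(db: sqlite3.Connection, galley_id: int, version: int) -> Optional[int]:
    """Return the file_id attached to a galley version, or None if absent."""
    row = db.execute(
        "SELECT file_id FROM submission_galley_files"
        " WHERE galley_id = ? AND version = ?",
        (galley_id, version),
    ).fetchone()
    return row[0] if row else None


print(galley_file(conn, 1, 1), galley_file(conn, 1, 2))
```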

The version number of the galley is the same as submission_revision. This means there is no real galley versioning but the galleys are connected to an article version. This way the URL structure can be simplified and looks like this:

  • current article: article/view/[articleId]
  • current galley: article/view/[articleId]/[galleyId]
  • old article: article/version/view/[articleId]/[version]
  • old galley: article/version/view/[articleId]/[version]/[galleyId]

  • [ ] Would users want to change labels of galleys between versions?

Feedback and questions are welcome.

Best,
Svantje

Hi @lilients,

Sorry for the lack of feedback on my part. Your screenshot looks good, with the tabbed versions. I agree with @mtub that using a tab named "New Version" is better than just naming a new version. A couple thoughts:

  • On all version pages except the first, can we change "Schedule for Publication" to "Publish Updated Version" or something like that?
  • We have two areas where "Metadata" exists: above the submission title and in the version tab. Is there anything in the metadata panel linked above the submission title that is not in the metadata panel linked in the version tabs? If not, then I would suggest we remove the Metadata link above the submission title and just keep it under Production. That may be confusing for users at first, but I think it will reduce confusion in the long-term by keeping it under the Production sphere of interest.
  • How hard would it be to make the versioning UI an opt-in checkbox under settings? I know the team is keen to get versioning merged into core as soon as possible. I'm a little bit nervous about exposing the additional complexity to all users. An opt-in setting would allow us to ship this soon while giving us some more time to tinker with the UI before enabling it for everyone. cc @asmecher on this, in case you think that's a terrible idea.

Hi @NateWr ,

thanks for your feedback. Opt-in is already implemented.
I think I can rename the button "Schedule for Publication" if versioning is enabled.
Currently the upper metadata window displays the metadata of the most recent version. The metadata of old versions can be viewed at the production tab. Moving the metadata completely to the production stage is a big decision I guess. I think we could change that later on.

@ everybody Any feedback on the database stuff? :)

Best,
Svantje

Hi @asmecher , @NateWr , @bozana and everybody else,

I added the galley versioning as described above and made some GUI changes (Description). Works fine. :-D

Cleaning up I found another problem. For the article versioning the date_published has been moved from published_submissions to submission_settings. But there are some functions that use the date in published_submissions for ORDER BY, MAX and MIN . I see these options to solve this problem:

  1. remove ORDER BY and MAX and MIN from SQL and implement it only if necessary in PHP with the latest/first date from submission_settings
  2. keep date_published in published_submissions and update it with date of the first version or the date of the latest version

I would try to go with option 1 because I think it would be cleaner. Otherwise the publication date would be double in the database. Is it really necessary to order the published article by date? Would it be sufficient to order only by sequence?

What do you prefer? Do you see other options?

Best,
Svantje

I think that for this kind of ordering we could consider the date_published of the first article version -- if I see it correctly those are functions like "get all published articles of a journal" -- I think that for these cases the date when the article was first published is what is 'asked'/'important', and not the dates of the later article versions i.e. changes. Thus I would try to consider that in the SQL if possible i.e. to take and order by the setting_value from submission_settings where submission_id = ?, version = 1 and setting_name = 'date_published'. Would that be possible?
The sequence attribute/column is relative to the section and issue, which is then not relevant if we want to get all published articles for a journal, for example.
Also, the published_submission_id could not be chronological (e.g. if a journal imports back/older issues after it has published some newer issues), so we cannot rely on that either.
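The suggestion above, ordering by the date_published of version 1 stored in submission_settings, can be tried out with a small SQLite sketch. The table layout is reduced to the columns named in this thread, the sample rows are invented, and the query is an assumption about how such an ORDER BY might look; a real query would presumably also join published_submissions and filter by journal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE submission_settings (
        submission_id INTEGER NOT NULL,
        version       INTEGER NOT NULL,
        setting_name  TEXT NOT NULL,
        setting_value TEXT
    );
    """
)

# Hypothetical data: submission 1 has a second version published later.
rows = [
    (1, 1, "date_published", "2016-03-01"),
    (1, 2, "date_published", "2017-01-15"),
    (2, 1, "date_published", "2016-05-20"),
    (3, 1, "date_published", "2015-11-02"),
]
conn.executemany("INSERT INTO submission_settings VALUES (?, ?, ?, ?)", rows)

# Order submissions by the date of their FIRST version only, newest first.
QUERY = """
SELECT submission_id, setting_value AS first_published
FROM submission_settings
WHERE version = 1 AND setting_name = 'date_published'
ORDER BY setting_value DESC
"""
ordered = [row[0] for row in conn.execute(QUERY)]
print(ordered)
```

Sorting on the string value works here only because the sample dates are in ISO format (YYYY-MM-DD); other date formats would need a conversion first.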

I added the galley versioning as described above and made some GUI changes (Description). Works fine. :-D

The frontend display of versions looks good. What about adding the version name to the date? I'm thinking about cases where someone might release two versions on the same day. It would be good to have:

Version 3 — 2016-02-03
Version 2 — 2016-02-03
Version 1 — 2016-01-02

Thanks for your brainpower!
@bozana: brilliant idea! Seems to work as well. 💃
@NateWr: also a very good idea. :) I will add the version number to the date.

Done that https://github.com/lilients/ojs/commit/42b5fcea62969f9c7521a00407d0939d367fbcab and https://github.com/lilients/pkp-lib/commit/ee16e4ef07fb0294ddb5722849584a465dddc858 ... next topic to be discussed: URL structure.

We already discussed the principles via mail:

  • URLs shouldn't be ambiguous (i.e. we shouldn't have to guess at whether a URL part is a revision number, custom file ID, etc)
  • Old URLs should continue to work
  • If possible, we should avoid introducing more separators etc. into URL parts

I would propose (and already implemented ;)) this structure:

  • current article: article/view/[articleId]
  • current galley: article/view/[articleId]/[galleyId]
  • old article: article/version/[articleId]/[version]
  • old galley: article/version/[articleId]/[version]/[galleyId]

This way the view argument gets replaced by version for old versions (to prevent ambiguous structures) and in the ArticleHandler the function version() redirects to the function view(). Do you think we could live with that? Or is the view argument crucial?
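The proposed structure can be sketched as a small, unambiguous path parser. This is only an illustration in Python, not the ArticleHandler code; the function name `parse_article_path` and the returned dictionary keys are assumptions.

```python
def parse_article_path(path):
    """Split an article URL path into article id, version and galley id.

    'view' paths always mean the current version (version is None);
    'version' paths always carry an explicit version number, so no URL
    part has to be guessed.
    """
    parts = [p for p in path.strip("/").split("/") if p]
    if len(parts) < 3 or parts[0] != "article":
        raise ValueError("not an article path: %r" % path)

    op, rest = parts[1], parts[2:]
    if op == "view":
        # article/view/[articleId] or article/view/[articleId]/[galleyId]
        if len(rest) > 2:
            raise ValueError("too many arguments for view: %r" % path)
        galley_id = rest[1] if len(rest) == 2 else None
        return {"article_id": rest[0], "version": None, "galley_id": galley_id}
    if op == "version":
        # article/version/[articleId]/[version] with an optional [galleyId]
        if len(rest) < 2 or len(rest) > 3:
            raise ValueError("version paths need an article id and a version")
        galley_id = rest[2] if len(rest) == 3 else None
        return {
            "article_id": rest[0],
            "version": int(rest[1]),
            "galley_id": galley_id,
        }
    raise ValueError("unknown operation: %r" % op)
```

Because the operation name decides how the remaining parts are read, old `view` URLs keep working unchanged and a galley id can never be mistaken for a version number.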

I like having the version in the URL. Is the current version always available at /article/version/[articleId]/[current-version-number] as well? I think that would be good, though we should add a canonical tag to that page that points to the article/view/[articleId] location.

The benefit of having that is that, if I want, I can generate a link to the current version of an article that I can trust will always point to that version, even if it's later updated.

I would agree with Nate... Maybe it would be good to always have the version in the URL: if a user reads an article and then cites it with the URL article/view/[articleId], the URL could be wrong a few months later, when the new article version appears under that URL, i.e. the URL of that article changes. Also for DOIs/URNs -- the URL should not change with the version change, I think...

Yes, the current version is available at the "default" URL and also at the URL with the version number.
I also think that the DOIs/URNs should be registered with the URL with the version number, so they do not need to be changed later. The "default" URL is the same as before in OJS (I think this is important for backward compatibility) and will always lead to the most recent version - this is the same concept Wikipedia uses. If you want to cite a specific version you would need to use the URL with the version number. We need to think of a good place to display this URL. Currently the "version history" is only shown when there is more than one version. Maybe it would be enough to change the URL in "How to cite" to the version URL?

No, I think it's good the way you've got it @lilients. If someone really wants to surface the version URL before there are multiple versions, they can do that themselves.

:+1:

This looks so good now… @lilients, this is amazing stuff. And look at how well documented this discussion is, the whiteboard drawings are my favourite. Given that this is a huge step forward for OJS and that it brings a few changes to the Look&Feel, I really appreciate how openly the way it has developed is documented here. Congratulations!

Minor suggestion, @lilients: In journal settings, maybe be more clear about what the versioning option does. Rephrase to "Enable optional article and galley versioning for this journal."

And maybe add another .description string with more information on what this option does. @NateWr might have UI ideas ;)

Thanks for the demo of the versioning yesterday @lilients. I had one brief note on language I forgot to mention: instead of "Edit metadata for this version" can we say "Edit Version Metadata"? I think that will fit the button on one line -- at least in English, no promises for German. :)

Also, just a note before this issue is closed out: we need to make sure versioning is displayed properly in the Bootstrap3 theme and the Manuscript child theme.

@mtub and @NateWr: thanks for all your input and ideas! 🎈

I improved the versioning settings and added an info text for the journal (see description).

@lilients, could you maybe list and manage the list of PRs here, in a comment, so that we and the users can quickly access them?

Yup, yup, great! :-)))

@lilients, I went roughly through everything and commented/asked...
I would still have to take a closer look at the PublishedArticleDAO changes -- how the functions there are used elsewhere in the system and whether we should always return the latest submission revision there. At the moment I think that we should do that, but... let's take a closer look...
Else, also two general questions:
-- when I delete a galley in the latest version, it will be deleted in the older versions as well. Is this intended to be so?
-- the metadata of old versions cannot be changed, but the metadata behind the button "Publish this Version" can be changed for old versions. Is this intended to be so?
When this work is totally finished we will see to implementing the very first versioning in OMP too, so that these changes do not break OMP. (However, the application-specific classes and functions (e.g. ArticleGalleys, PublishedArticles,...) should not be used in the pkp-lib.)
Thanks a lot for all the great work and contribution!!!

@bozana Thank you for your feedback! 🎉

-- when I delete a galley in the latest version, it will be deleted in the older versions as well. Is this intended to be so?

No, I will change that. I think we should not allow changes at old versions at all, so no deletions as well.

-- the metadata of old versions cannot be changed, but the metadata behind the button "Publish this Version" can be changed for old versions. Is this intended to be so?

Well, I do not think it is very critical. The date is set automatically with the button. But we can change that later on, if you want.

Hi @bozana @asmecher @NateWr @mtub

I added the versioning to the article and galley DOIs. 🐙 My current approach is to always add the version number to the articleId (even if versioning is not enabled). The default setting would lead to the following DOI structure:

DOIs for articles:
prefix/%journalname. v %volume i %issue . %articleId . version
example: 10.1234/test.v0i1.60.1

DOIs for galleys:
prefix/%journalname. v %volume i %issue . %articleId . version . g %galleyId
example: 10.1234/test.v0i1.60.1.g115
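
As an illustration of the two patterns above, a minimal sketch in Python that reproduces both examples. The function name `build_doi` and its parameters are hypothetical; this is not the DOI plugin code.

```python
def build_doi(prefix, journal, volume, issue, article_id, version, galley_id=None):
    """Build a versioned DOI following the proposed default pattern.

    Article: prefix/journal.v<volume>i<issue>.<articleId>.<version>
    Galley:  same, with .g<galleyId> appended
    """
    suffix = "%s.v%si%s.%s.%s" % (journal, volume, issue, article_id, version)
    if galley_id is not None:
        # Galley DOIs extend the article DOI of the same version.
        suffix += ".g%s" % galley_id
    return "%s/%s" % (prefix, suffix)
```

For example, `build_doi("10.1234", "test", 0, 1, 60, 1)` gives `10.1234/test.v0i1.60.1`, and passing `galley_id=115` gives `10.1234/test.v0i1.60.1.g115`.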

Is that ok with you? Do you have other ideas? Any objections?

Hi,

There has been some talk about DOIs above, but has anyone discussed this with, for example, CrossRef? I mean, what do they think of having DOIs for different article versions? Do they support such a feature? This has to be resolved at least in the CrossRef plugin: which DOI to use and how to handle a new version of metadata.

@ajnyga, I wanted above all to take a deeper look at the CrossMark for that, and then to see with @jmacgreg what would be best to support/provide... but I haven't managed this yet :-( I will do this very soon and report here... then we could maybe also ask Crossref...

@lilients, thanks a lot! It sounds good for the moment. I will take a deeper look what Crossref supports and what would be best for us to provide...

Sounds good @bozana I failed to see that when reading this, sorry; and great job with this feature @lilients! Together with the continuous publishing mode it opens brand new forms of publishing.

@ajnyga Thanks for pointing out the DOI plugins and the metadata for the DOI registration. 👍

I found these metadata options for CrossRef and DataCite:

I hope that's sufficient. I will try to change the plugins accordingly (maybe @bozana can help 😇) .

@bozana The CrossMark service needs to be paid for, if I got it correctly. But it might be a nice plugin...

CrossMark plugin is one thing, but the other thing is just to see how it works, in order to maybe understand how Crossref versioning works (behind just the metadata) and what is the best solution... I would say... :-) -- Those Crossref services seem to be connected to and thus influence each other i.e. how we assign the DOIs to the different versions and how we register them could affect something else, e.g. CrossMark...

I found this:

Crossmark “updates” should only be deposited for changes that are likely to effect “the interpretation or crediting of the work.” In other words, updates should only be deposited for editorially significant changes. Updates should not be deposited for minor changes such as spelling corrections, formatting changes, etc.

There are 12 defined types of accepted “update” within Crossmark. The values for these are:

  • addendum
  • clarification
  • correction
  • corrigendum
  • erratum
  • expression_of_concern
  • new_edition
  • new_version
  • partial_retraction
  • removal
  • retraction
  • withdrawal

If an update does not fall into one of these categories it should instead be placed in the "more information" section of the pop-up box by being deposited as an assertion.

https://support.crossref.org/hc/en-us/articles/115000108983-Getting-started

Maybe that helps ...
Maybe we need to add the type of change to the versioning as well (as we discussed earlier) so the metadata can be created correctly.... 🤔

@ajnyga, just to let you know what I heard from Crossref: "If your article versions are distinct enough to have different metadata (different title and publication dates, for example) a new DOI should be assigned. You can connect the DOIs using what we call 'relationships': https://support.crossref.org/hc/en-us/articles/214357426"

Hi @bozana , @ajnyga ,

the Crossref answer seems a little vague to me. New versions of an article, even if they are quite large/important, e.g. a correction of a misinterpretation etc., probably don't get a new title, so if I am reading this correctly, they expect journals to not assign a new DOI. So we'd be stuck with other means, especially non-machine-readable means, to identify versions.

Hi,

I was unaware that CrossRef actually has some solution for this, nice to know that they have. Do they actually limit the option to cases where the article gets a new title, or was that just an example? I mean, it makes sense that large changes to the article may happen even though the title does not change.

If this new feature would always create a new DOI, and CrossRef has clear rules when a new DOI is possible, then how easy would it be to make sure that journals use revisions correctly?

Hi @ajnyga , I fear that what Crossref is saying is "do not create a new doi"... maybe @bozana can help us out with the interpretation...

Well, I actually first thought that Crossref means that a new DOI can be assigned if there is a significant change, but now, after @mtub doubted, I am a little bit confused :-) I will double check!
When CrossMark is used, a new DOI can be assigned also when the full text changes, so I thought that it is the same without CrossMark... Also, the new article version would have a new date published (a metadata change) so I suppose we can assign a new DOI... but let me double check... :-)

Let me be clear: I think that Crossref is in favor of assigning DOIs for significantly changed works, but that they are not in favor of assigning DOIs to any slightly modified version. So there might be a conflict when OJS has new versions as a default (which I still think is a good idea) and when we don't want to include a manual option where the journal editor has to decide on a single case basis if this new version is big enough to get a new DOI (like with commits (= versions) and releases (= doi) on GitHub).

And: The requirement ("new title") seemed a little too much to me as a requirement for a new DOI.

I think "new title" was only an example. They also mentioned a new date as possible metadata, so this should be ok. But true, as I read it, they only want DOIs if there is one of the 12 update types (https://github.com/pkp/pkp-lib/issues/2072#issuecomment-290460923). But I think this is a question of policy. The CrossRef Plugin should inform users of OJS that they should only create new versions if the change is one of those types. Later on we could also add support for this by adding the update types to the versioning features, like a dropdown with the update types at the "publish this version" form.

Those types are tightly connected with CrossMark, so when we implement the CrossMark support, we will have to provide them for the users to choose. There is the type "new_version", that could mean everything :-)
They all (e.g. also the German National Library) say that only some major changes count (and for example not the missing comma or so)... which again leads to my earlier assumption/'being afraid' that we will have to consider the case when a new file is uploaded but a new version, and thus a DOI, is not created... But let's keep it as simple as possible for now... and continue as we decided... :-)
For the normal DOI registration we can then use "isVersionOf" relationship, I think...

@NateWr I want to add inline help on the article versioning at the production stage. Is it ok to add info on features that need to be enabled first? Or is there a way to change the info depending on the settings of the journal?

Hi @lilients, sorry for the delay (I'm just back from a holiday). Yes, go ahead and add information about versioning to the slide-in help panel. Just be sure to indicate that it may not appear, with text like:

If you have enabled versioning for this journal, ...

And then somewhere in the section be sure to indicate how to enable versioning:

To enable versioning for this journal, browse to the Settings > Workflow > Versioning page and...

(Or wherever the setting is.)

@asmecher, here are the PRs:
pkp/ojs#1277

2307

Hi everybody,

I am starting again to adapt the versioning code to the latest OJS version (so that it can be merged in the master branch before the 3.2). I have some conceptual changes I would like to suggest:

1. remove option to hide versioning in settings

  • would make the feature itself more visible
  • users would be encouraged (but not forced) to use the versioning feature for changes in files
  • versioning view at production stage is very discreet (if you do not want to use it, it's no big distraction)
  • new versions can only be created if the first version has been published (users can not create new versions "by accident")

2. remove static page and advise user to enter text into "about the journal"

  • we do not need the settings page anymore :+1:

3. allow users to name versions

  • could be used to publish preprints

4. allow users to write a description of the version

  • has been requested a lot

Looking forward to your feedback! :sun_with_face:

Best,
Svantje

Hi @lilients

I've already forgotten what was the static page for :-( ?
How would the version name be used?
Could we consider the Crossref Crossmark values for version description, see https://www.crossref.org/get-started/crossmark i.e.:

  • addendum
  • clarification
  • correction
  • corrigendum
  • erratum
  • expression_of_concern
  • new_edition
  • new_version
  • partial_retraction
  • removal
  • retraction
  • withdrawal

plus "other" with free text?
Or what do you mean with description?

Thanks a lot!!!

Hi @bozana

the static page is described here: https://github.com/lilients/ojs/wiki/Article-versioning-in-OJS
It adds a versioning policy as a new page under "about the journal".

The names would be displayed at the article page in the place of "Version 1" etc. See: https://raw.githubusercontent.com/lilients/img/master/versioning_howToCite1.png
I would implement the name as a free text.

I wanted to use a simple free text for the description as well. I thought maybe the editor wants to describe the changes that have been made. Maybe later this could be used to display an automated change log.

I like the idea of using a vocabulary. But I am not sure where to use it ... Maybe a third optional input, something like "type of version"?

Thanks!

:+1: to the idea of using a vocabulary for version type _and_ a text field for a preferred name. (Ideally, a default name would be filled out when a type is selected.)
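
A rough sketch of how a controlled version type could pre-fill a default name while still allowing a preferred free-text name. It is written in Python for illustration only; the function `default_version_name` and the way the default label is derived are assumptions, and the vocabulary is the list of Crossmark update types quoted earlier in this thread.

```python
# Update types as quoted from the Crossmark documentation in this thread.
CROSSMARK_UPDATE_TYPES = (
    "addendum",
    "clarification",
    "correction",
    "corrigendum",
    "erratum",
    "expression_of_concern",
    "new_edition",
    "new_version",
    "partial_retraction",
    "removal",
    "retraction",
    "withdrawal",
)


def default_version_name(version_type, preferred_name=None):
    """Return the name to display for a version.

    The type must come from the controlled vocabulary. A non-empty
    preferred name wins; otherwise a readable default is derived
    from the type (an assumed convention, not an agreed one).
    """
    if version_type not in CROSSMARK_UPDATE_TYPES:
        raise ValueError("unknown version type: %r" % version_type)
    if preferred_name and preferred_name.strip():
        return preferred_name.strip()
    return version_type.replace("_", " ").capitalize()
```

Here `default_version_name("new_version")` yields "New version", while `default_version_name("erratum", "Corrected tables")` keeps the editor's own wording.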

I also agree about removing the static page.

Hi,

Using the vocabulary from crossref is a good idea!

Regarding preprints, I am not 100% sure if that should be covered here, but I think it is very close to the topic. This is how I see it.

At the moment OJS has galley files and these files always represent the _final article_ or a supplementary file of that final article. The versioning in these cases means that the final article is changed in some ways but all the versions were originally intended to be the final version of the article.

The preprint is not a final article but it too could have different versions.

So my view is that all galley files should have two types: preprint and final print (the latter is of course what galley files are at the moment in OJS and the default)

When you add a galley file, you can choose what type it is. Alternatively we could have a separate "preprints" grid in the submission stage that adds the preprint galley file type automatically to files you add there (https://forum.pkp.sfu.ca/uploads/default/original/2X/9/9f9d3a73fdc0215849e83654e091c9593b43d07e.png)

If the article is given a preprint type galley file, the system will show the preprint file and the article metadata connected to that version of the article publicly. This is basically the "preprint server" which could be both context specific and site wide. Basically browse tools to search for preprints and a handler for the preprint abstract page (I do not have a clear idea yet whether the url should be the same as for the final article, probably it should). On the abstract page you could have tools for public review etc. but these are of course separate features.

When the final article is published, the preprint is shown as an early version of the published article using the versioning features here.

With the versioning you could also give the preprint version of the article a DOI and then give the published article a new DOI and tie those two DOIs together using the possibilities Crossref has.

If we just name a galley version as a preprint (I mean just give it a text label), it would probably be hard to build preprint server functionalities to the front end?

I'd prefer to stick to preprints as a full version of the article, not just a version of a galley. It's best to capture a complete representation of the article, and a galley usually only captures part of that. In a future where we have integrated tools for treating all article data as an object (eg - Texture), the galley will be an insufficient record.

So I think that preprint should just be another type alongside the crossref vocab, if @bozana thinks that plays ok with crossref's plans.

You are right @NateWr 👍

The preprint should be a status of the whole article. With this approach the name label for the galley file would probably be enough to define which galley versions are the ones attached to the preprint.

Not sure though if preprint should be another type in the crossref vocab. Because you should still be able to define a new_version of a preprint as well, right?

Here are crossref guidelines for preprints https://support.crossref.org/hc/en-us/articles/213126346-Posted-content-includes-preprints

Thanks for the link to the crossref docs. It looks like they've introduced some specific types for preprints.

  • preprint
  • working_paper
  • letter
  • dissertation
  • report
  • other

I think that those are not types for preprints, but types for "posted content" and preprints is just one of those types. Types like dissertation or report are probably not needed in the context of OJS anyway? My view is that OJS would only use the preprint type?

But anyway, if we could publish articles with a preprint status in OJS, then posted content would be what should be used in the Crossref end for DOI registration.

Maybe we can leave this preprint/posted content discussion/solution for later, in another issue? -- for now we would consider everything as journal article, as till now... -- I have a feeling that we would need to consider a few other things with that too... and we already have enough to consider with versioning introduction and normal journal articles as they are/were till now (e.g. reports, statistics, exports...)... What do you think?
So I think that @lilients can now rebase the current versioning code and proceed as she suggested, if possible add name (input text field), type (controlled vocabulary) and eventually description (input text field)... so that we can further consider the current/normal (without those preprints etc.) versioning in all those other system parts for OJS/OMP 3.2...

Hi everybody, I created pull requests to add the basic versioning functionality to the master branch of OJS: https://github.com/pkp/ojs/pull/2103 and https://github.com/pkp/pkp-lib/pull/4007.

There are still some bugs (see below) but I would suggest merging the current state to prevent me from rebasing for the hundredth time. :weary: :wink: I will work on the fixes next week, and would be very happy if somebody else could have a look at some as well. :angel:

This is working now:

  • changing the database structure for versioning via tools/upgrade.php upgrade
  • you can create new versions via button at the backend (production stage)
  • authors and galleys are automatically copied to the new version (displayed in a new tab)
  • you can change the metadata of the new version (not the old ones) and publish the new version when ready
  • display of the versions in the frontend

These bugs need to be defeated:

  • [x] new submission wizard seems to have some problem with a version field in a where clause
  • [x] authors strangely lose their first and last names when new versions are created
  • [x] adding and deleting galleys is not yet working
  • [ ] versions are displayed at the submissions grid but there should only be articles
  • [x] locale ##submission.production.notPublished## does not resolve

Great work! Looks like one of the validation tests fell over, so I restarted that to see if it finishes. There is one consistent test failure though:

There was 1 error:
1) CcorinoSubmissionTest::testSubmission
Current URL: http://localhost/index.php/publicknowledge/submission/wizard/2?submissionId=1#step-2
Screenshot: http://localhost/lib/pkp/tests/results/7ef9c66fa48c7577d8251e58e2c81b61.png
WaitFor timeout. 
Last exception message: 
Failed command: waitForElementPresent('id=cancelButton')
Failed asserting that false is true.
Caused by
WaitFor timeout. 
Last exception message: 
Failed command: waitForElementPresent('id=cancelButton')
Failed asserting that false is true.
/home/travis/build/pkp/ojs/lib/pkp/tests/PKPContentBaseTestCase.inc.php:102
/home/travis/build/pkp/ojs/tests/data/60-content/CcorinoSubmissionTest.php:33
FAILURES!
Tests: 9, Assertions: 208, Errors: 1.

Would you be able to look into that and see if you can spot the issue?

Found it (the database column version was missing in author_settings), fixed it (via force push).

Oh, I just noticed that you'll need to do a couple things to kick off the OJS tests properly. To make sure the tests know where to look to find the right lib/pkp to check out, we add the submodule at the end, with a commit message like this:

git commit -m "Submodule update ##lilients/<branch-name>##"

So, I think you can run the following from the command line of the OJS repo to get the tests running:

git reset HEAD~1
git add <all-changes-except-lib/pkp>
git commit -m "pkp/pkp-lib#2072 Add support for versioning articles"
git add lib/pkp
git commit -m "Submodule update ##lilients/2072##"

If that looks good, you can force push. This should restart the tests in the PR, and the tests should check out the correct submodule commit for the testing to work.

Thanks, @NateWr, and huge thanks, @lilients and team! I agree that we should merge this as long as the tests pass and the system is basically usable. For what it's worth, @lilients, I think the next few weeks at least shouldn't present as many rebase challenges as they have lately. We had lots of broad-based changes to merge and I don't think we have any more coming up in the next while.

@NateWr I added the lib/pkp in a separate commit, but now GitHub displays a conflict at the ojs pull request, although the pkp-lib pull request is fine. :thinking:

Hi! The tests are passing now! :tada: Next week I will add two small fixes (some data is displayed twice: authors and submissions in the queryList). Also I am not sure if the database structure for authors is final. Maybe we should separate the author data from the connection between author and submission. I guess we should talk about this before merging. But if you want, you can have a look at the code and test and see if there are other problems I did not see. :rainbow:

Great to have this feature!

One question: if I understand correctly, the new feature will enable us to have different versions of the article metadata. One part of this metadata is of course the journal name, publisher name and ISSN. So what happens, if a journal changes their ISSN or journal name? This will effectively create a new version of the article metadata, but it will not be tracked with this feature?

I think OJS would really need a versioning feature also for the journal name, publisher and ISSN, because these can change. And this is probably something that needs to be considered also here in the article metadata versioning.

I don't share that view, @ajnyga - if a journal changes name and ISSN later on, the original article will still be published with the old journal name and ISSN. Metadata changes regarding the journal itself should be reflected in search engines, library systems etc. so that you can still find the old articles. But updating the journal name of an already published article would be very misleading in my opinion…

I agree @mtub and this was my original concern (although I probably was not too clear about it in the first message).

Currently, and after this feature is released, the situation is that when a journal changes their name in the OJS settings, it will affect the metadata of all published articles. The versioning feature here will not track this change, but for example OAI-PMH will show new metadata for all articles. Also journals with back issues published with an old name or ISSN or publisher have a problem that should be solved.

What we could do actually is to add the journal name, ISSN and publisher name to the article metadata (submission_settings table). So that when you publish an article, you can fill those fields for all individual articles as well. And for new articles those could also be automatically filled based on the journal settings when the article is first published. This would solve the problem with back issues with different names and would enable OJS journals to change their names, ISSNs and publisher names > the article metadata in old articles would not be affected in any way.

(This just came to my mind actually. Earlier I was thinking a system where you would have a function like getTitle('YEAR') and it would fetch a journal name that would correspond to that time period and in the settings you could fill a series of journal names with time limits. For example Journal of History 1999-2007 > Journal of World History 2008-)
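
The getTitle('YEAR') idea from the parenthesis above could look roughly like this. It is a sketch in Python; the `JOURNAL_NAMES` list, the function name `get_title` and the use of `None` for an open-ended period are assumptions for illustration, using the example names given above.

```python
# Each entry: (first year, last year or None for "still current", name).
JOURNAL_NAMES = [
    (1999, 2007, "Journal of History"),
    (2008, None, "Journal of World History"),
]


def get_title(year, periods=JOURNAL_NAMES):
    """Return the journal name that was valid in the given year."""
    for start, end, name in periods:
        if year >= start and (end is None or year <= end):
            return name
    raise LookupError("no journal name configured for %d" % year)
```

With this configuration, `get_title(2003)` returns "Journal of History" and `get_title(2020)` returns "Journal of World History". Storing the name on each article at first publication, as suggested in the previous paragraph, would avoid the lookup altogether.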

Hi everybody. Thanks for your support and thoughts! I am sadly running out of time and can only take care of the current state and not add more fancy features.
All tests are passing and the new features can be used (PDFs only without the PDF.js viewer plugin - that needs to be adapted). Two small bugs are still there (I will fix them in the last week of September). Maybe the code can be merged anyway. Have a look at the code and get back to me if you find major bugs or other problems.

  • [ ] versions are displayed at the submissions grid but there should only be articles
  • [ ] new versions are displayed at the issue toc before they are published

@lilients, I suspect that these changes have not been adapted to OMP, have they? (Unfortunately that's a blocker for merging immediately, as that would break the master branch of the OMP repo.)

@asmecher yes I feared something like that. :frowning_woman: OMP would need to be adapted

I don't see any way around it other than porting this to OMP as part of the merge. I'm too buried to take a look at it now, but I'll try to make it a priority later this month.

I thought about the adaptation of the versioning to OMP - I don't think it is a major problem. Most stuff can be taken over:

GUI

The GUI concept (tab views for each version at the backend, display of versions at book page with version history) can be the same in OMP. I created a mockup for the backend view:

img_versioning-omp

Database

The database changes for article versioning made for OJS (see https://github.com/pkp/pkp-lib/issues/2072#issuecomment-277698386) need to be made to the OMP structure as well (the one thing different in OMP is the fact, that you can assign more than one file to a publication format):

img_versioning-omp

Other changes (like moving the publication format from published_submissions to submission_settings) can be taken over

The rest will be diligence. ;)

Hi all. I am leaving here some thoughts on versioning for OMP.

Regarding Chapters:

  • the db table submission_chapter_settings: it should be extended to have a submission_version column.
  • The contributors and files section should refer only to the currently working-on version's contributors and files.

So, chapter_id=1 with title = Title 1 and version = 1 could have access to the files and contributors of submission_version=1 (etc)

Regarding the Representatives:

  • the db table representatives maybe should be provided with a new column submission_version (as, all the metadata of the representative is in that table)

Regarding sales_rights and other publication_format-specific data (markets, publication_dates):

  • the db table sales_rights maybe should be provided with a new column submission_version.

Regarding publication_formats:
The issue here is that the metadata of publication_formats are distributed across 2 tables:
publication_formats and publication_format_settings. Perhaps the data in the publication_formats table should remain not editable across versions?

Regarding Cover Image and Audience fields:
The data are stored at the published_submissions db table. Each version should have its own published_submissions row entry perhaps?

Hi @defstat thanks for your brain power! :)

I am not sure yet if all these tables will need the submission_version. I tried to add it to as few tables as possible. Chapters makes sense to me; what are the representatives for? The publication_format-specific data would not need to be versioned (see below).

Regarding publication_formats:
The issue here is that the metadata of publication_formats are distributed across 2 tables:
publication_formats and publication_format_settings. Perhaps the data in the publication_formats table should remain not editable across versions?

Yes, the solution for OJS is the same: the galley metadata will be the same for all versions. I think that makes sense as well - if we allow users to change the name of publication formats or galleys the reader can not connect different versions at the frontend.

Regarding Cover Image and Audience fields:
The data are stored at the published_submissions db table. Each version should have its own published_submissions row entry perhaps?

My solution for OJS was to move all data that will be relevant for versioning into the settings table. I would propose to do the same with the cover image and other data.

@lilients Is this implementation still in development? Or is there already an operative version to install in OJS? :)

Hi @neoyukito thanks for your interest. @defstat and I are still working on the adaptation to OMP.

Hello @lilients and @defstat, is this improvement by any chance available in the latest version of OJS?
I would like to launch a preprint journal :), but when I installed that version on my test server I could not find it.
I am unsure whether this improvement is in the main line (main branch) or still in development, because I see several repositories linked.

@borisalmonacid, this is not yet merged into master. Our current plan is to release it in OJS 3.2.0. It'll take some time to mature and will probably make master unstable/unpredictable for a while, so I'd definitely recommend waiting a while, even if you're comfortable with pre-release code.

Work on this is now being tracked in the versioning project.
