Dataverse: API Token and Data Curation Tool

Created on 20 Mar 2019 · 27 Comments · Source: IQSS/dataverse

Data Curation Tool (DCT) is an external tool for editing variable-level metadata. It needs an API token to update data. But when users create an account in Dataverse, or an account is created for them, an API token is not generated automagically. So when the API token is empty, users cannot edit metadata and have to be reminded to create one.

But there is a possibility to generate such a token automatically in ConfigureFragmentBean, in the getConfigurePopupToolHandler of the "Configure" button, so that when the token is empty and the user is authenticated, the token is generated automatically. In this case users would not lose the data that they started to update in DCT only to learn that they forgot to generate the token.

I would like to know what the Dataverse community thinks about it. Is it OK to generate the token in the Configure section?
This issue is related to #4174 and #4448.

All 27 comments

Hi @lubitchv, we had a previous issue for this, #4310, where at the time we were leaning towards having the user "opt in" to the token being created. This would of course require a UI around this opt-in capability. I think having it be automatically generated would be fine too, but there should probably be messaging so the user is aware of it happening (since an API token is effectively another password and the user needs to understand its security needs).

I'd have to refresh my knowledge of the code, but I think getConfigurePopupToolHandler is called just by displaying the button? If so, I'm not sure that's the best place for it, since a token would then be generated for every user who visits a dataset. Unless we're OK with that, but if so, we should probably just generate it at user creation time.

(I could be misrecalling when that code is called.)

My first thought is this:

  • Users who are created via the GUI do not automatically get an API token.
  • Users who are created via API do automatically get an API token.

This is inconsistent and was observed recently in 5f383da ("not all users have apiTokens").

Should we introduce a system-wide boolean setting called something like :GenerateApiTokenOnCreate? Installations of Dataverse that have security concerns with API tokens being created could set this to "false". Developers would set this to "true" because API tokens are used in integration tests. Scholars Portal and others could set it to "true" to make sure their users all have API tokens. (Perhaps an API endpoint could also be added to generate API tokens for existing users.)
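
For illustration, here is a minimal Java sketch of how such a setting could gate token creation when an account is made. Only the setting name comes from the comment above. The `AccountCreation` helper, the plain `Map` standing in for the settings store, the default of "false" when the setting is absent, and the use of a random UUID as the token value are all assumptions, not Dataverse's actual settings or token code.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

/** Hypothetical helper; not part of the Dataverse code base. */
final class AccountCreation {
    /** Setting name proposed in the comment above. */
    static final String SETTING = ":GenerateApiTokenOnCreate";

    private AccountCreation() {}

    /**
     * Returns a fresh token value if the installation opted in,
     * or an empty Optional otherwise. Treating a missing setting
     * as "false" is an assumption of this sketch.
     */
    static Optional<String> tokenForNewUser(Map<String, String> settings) {
        boolean enabled = Boolean.parseBoolean(settings.getOrDefault(SETTING, "false"));
        return enabled ? Optional.of(UUID.randomUUID().toString()) : Optional.empty();
    }
}
```

With this shape, GUI and API account creation could both call the same method, which would remove the inconsistency noted in the two bullets above.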

These are my first thoughts but I'd love to hear some discussion from the community about this! :smile:

@scolapasta not exactly. getConfigurePopupToolHandler checks whether fileId is null or toolHandler is not null. In that case it does not go on to retrieve the API token. This is also the case when a user just visits a dataset. I suggest adding the code for the case where retrieval of the API token returns null. It should only run if the user actually clicks on the external tool in the drop-down.

It is also probably possible to add the API token generation in ExternalToolHandler, in getQueryParam, instead.

@lubitchv If that's the case (that it is only when the user actually clicks on the tool), that's fine. I still think it could be good to add some messaging somewhere that a token will be generated, so that they are aware, but I'm not sure how/where and would want to see what @TaniaSchlatter and the design team think.

@scolapasta let's define one or two user scenarios that describe who needs this and in what situations.

FWIW - for QDR, we've added this in FileDownloadServiceBean.explore() (no PR at this point), handling both the case where a key doesn't exist and where it has expired. In our case, an API token is needed to be able to get the latest metadata when previewing files in draft versions (e.g. to pick up the correct file name) rather than wanting the external tool to make edits. Automating api token mgmt would make it easier for people to use the Previewers I mentioned in the community call.

My thought, w.r.t. alerting users/keeping the API key 'optional', is that this works OK if it is only used by a subset of users who are sophisticated enough to call the API directly. But if/as an API key works more as a way for tools, configured by an admin (who's making the choice to let external tools get the API key when configuring the tool), to integrate on the user's behalf, the key should be automatically created/updated.

@qqmyers @lubitchv thanks for your thoughts on API tokens. I thought it might be interesting for you and others to take a look at the "API Token Functional Requirements Doc" from 2014: https://docs.google.com/document/d/1PfQtblJxPv975U_cTOIA7OR71MT4CslP-s2n83eb1i8/edit?usp=sharing

I'd like to highlight that back then we had some concerns with our design, such as this:

"Currently, API tokens give you all the permissions of the user in question. There are no "read-only" API tokens, for example"

At the time, we were trying to get 4.0 out the door. :smile:

I'd also like to point out that the integration between Dataverse and WorldMap via Geoconnect makes use of special tokens that were added before API tokens were added. They're stored in the "worldmapauth_token" table: http://phoenix.dataverse.org/schemaspy/latest/tables/worldmapauth_token.html . I believe that in #5597 we're hoping to someday revisit how this integration works, to pull out some of these tables related to a single integration (WorldMap) that need deeper integration than what's currently offered by the external tool framework.

@pdurbin I just wanted to note that DCT will need a token with permission to edit the dataset (for those users who have such permissions), hence "read-only" API tokens would not be helpful in this case.

I've seen systems that try to manage multiple tokens per users where the tokens may have different permissions and/or expiry dates. It's always seemed a bit overkill/off-target though, since users aren't really in a good position to know what they should be trusting.

The approach I've preferred is to use one token but to then have things like the external tools API provide a dynamically generated key that binds the token and the specific API calls allowed together, e.g. the key sent to the tool is the hash of the allowed URI call and the 'real' API token. The API then checks the hash against the actual API call made plus the real API token and only allows the call if it matches. This allows very fine-grained control of what things like tools (or workflows, etc.) can do, controlled by the admin who configures them, without requiring the user to manage anything (except cancel/refresh of the master API token if they think there's been a compromise).
There are some potential downsides/decisions going this way. For example, you need to send the username in the call since there's no way to look the user up based on the key being sent. Also, I think one could tweak the logic to provide a key that can be used for more than one API call (e.g. one that is hashed with 'persistentId=https://doi:...' rather than the full URL, so it could be used with any API call about that ID, still limited by the user's overall permissions), but that can probably get complex fast.
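
To make the binding idea concrete, here is a minimal Java sketch. It swaps in HMAC-SHA256, keyed with the real API token, in place of the plain "hash of the URI and the token" described above; that substitution, the `ScopedKeys` class, and its method names are assumptions made for illustration and do not exist in Dataverse. The server is assumed to look up the user's real token from the username sent with the call, then recompute the key over the call actually being made.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper; a sketch of call-scoped keys, not Dataverse code. */
final class ScopedKeys {
    private static final String ALGORITHM = "HmacSHA256";

    private ScopedKeys() {}

    /** Derives the key handed to a tool: a MAC of the allowed call, keyed by the real token. */
    static String derive(String apiToken, String allowedCall) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(apiToken.getBytes(StandardCharsets.UTF_8), ALGORITHM));
            byte[] digest = mac.doFinal(allowedCall.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    /** Server side: recompute the key for the call actually made and compare in constant time. */
    static boolean permits(String apiToken, String actualCall, String presentedKey) {
        byte[] expected = derive(apiToken, actualCall).getBytes(StandardCharsets.UTF_8);
        byte[] presented = presentedKey.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, presented);
    }
}
```

A key derived for one call string fails `permits` for any other call string or any other token, which is the fine-grained restriction described above. Widening the scope, as in the persistentId example, would amount to choosing a shorter string to pass as `allowedCall` on both sides.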

@amberleahey - per the community call: Rather than a PR that doesn't have any of the user notification changes - here's a pointer to the 3-line if statement that checks for an api token in the 'explore' call and generates one if there is none or if it has expired... https://github.com/QualitativeDataRepository/dataverse/blob/develop/src/main/java/edu/harvard/iq/dataverse/FileDownloadServiceBean.java#L251
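
For readers who do not want to follow the link, the shape of that check can be sketched as below. This is not the QDR code: `TokenRecord`, `TokenStore`, the in-memory map, the one-year lifetime, and the UUID token value are all hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for a stored token; not Dataverse's ApiToken entity. */
final class TokenRecord {
    final String value;
    final Instant expiresAt;

    TokenRecord(String value, Instant expiresAt) {
        this.value = value;
        this.expiresAt = expiresAt;
    }

    /** A token counts as expired at or after its expiry instant. */
    boolean isExpired(Instant now) {
        return !now.isBefore(expiresAt);
    }
}

/** In-memory sketch of "generate a token if none exists or it has expired". */
final class TokenStore {
    private final Map<String, TokenRecord> byUser = new HashMap<>();

    TokenRecord ensureValidToken(String userId, Instant now) {
        TokenRecord existing = byUser.get(userId);
        if (existing == null || existing.isExpired(now)) {
            // Assumed lifetime of 365 days; a real installation would pick its own policy.
            TokenRecord fresh = new TokenRecord(
                    UUID.randomUUID().toString(), now.plus(365, ChronoUnit.DAYS));
            byUser.put(userId, fresh);
            return fresh;
        }
        return existing;
    }
}
```

Calling `ensureValidToken` just before launching a tool returns the existing token when it is still valid and replaces it otherwise, so the tool never receives an empty or stale key.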

@qqmyers Thank you, but we do not use explore for the DCT tool; we use configure. But I will add something similar to the configure code, so that when the user presses the button and does not have a token, one will be generated.

I missed the community call yesterday but I posted the notes at https://groups.google.com/d/msg/dataverse-community/X_F0ABmmEzo/1ynJK40UAwAJ

Afterward I chatted with @djbrooke @scolapasta and @TaniaSchlatter about this. I think we more or less agree that if you generate a token for a user on their behalf, an in-app notification and email should be sent to them with a bit of a warning that it should be treated like a password. Ordinary users can delete data with API tokens and I don't want to even list all the evil that can be done with a superuser token in the wrong hands.

I took a quick look at pull request #5755 and I'm glad to see that email and in-app notifications have been implemented. For example, it looks like the user will get an email saying, "API Token has been generated. Please keep it secure as you would do with a password." Sounds great. Thanks, @lubitchv

OK, I was trying to figure out what this did finally. I was pointed to this related ticket:
https://github.com/IQSS/dataverse/issues/5757
So it seems that when a user tries to access an external tool that requires a token (configure versus explore tool?) and the user does not have a token, the user will be presented with a popup informing them that a token will be generated and allowing them to continue or cancel. On continue, the token is generated, email and online notifications are sent about token generation, and the tool is launched.

@kcondon Yes, it does work like that.

configure and explore tools are launched through different code, so more work would be needed to make explore tools do the same thing (I think both could leverage the same mail code in #5755, etc.)

@lubitchv Thanks. I'm having trouble triggering the popup. I have enabled pseudo tools, both as explore and configure as well as dataExplorer from the guides but nothing causes the popup. How would you recommend I do that?

@kcondon Popup window worked before for Configure button, without API generation. I just added text there about API generation. Explore button does not have popup menu, so you should not get popup menu for dataExplorer. Did you run curl command with corresponding json for Configure?

@lubitchv Thanks, I'm seeing the popup and it is generating the token. Please note that while I do see an email notification, I do not see an online notification about token generation.

@kcondon There should not be an online notification; I did not implement it. I thought it would be overkill, since we already have a popup window warning about token generation and an email notification. This is what I thought @pdurbin meant. The online notification is only about the use of the external tool and the need to refresh the browser after using it. It existed before I implemented token generation. The reason is that the page caches file ids and the external tool would not work properly without refreshing, but it has nothing to do with token generation.

@lubitchv could you please add the in-app notification? While it might seem like overkill, it's how we currently handle notifications throughout the app (i.e., you get both an in-app notification and an email) and it would be good to keep this consistent. (I'm actually surprised the code allows you to do one and not the other; I would have preferred that the developer only have to call a method to "notify" and that it handle the mechanisms of how that notification happens. Something to look at if/when we clean up the notification subsystem.)
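
As a rough picture of that single "notify" entry point, here is a small Java sketch in which one call fans out to every configured channel, so a caller cannot send an email without the in-app notification or vice versa. All of the names (`NotificationChannel`, `RecordingChannel`, `Notifier`, `notifyUser`) are hypothetical and do not reflect the existing Dataverse notification code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical delivery mechanism, e.g. in-app or email. */
interface NotificationChannel {
    void deliver(String userId, String message);
}

/** Test-friendly channel that just records what it was asked to deliver. */
final class RecordingChannel implements NotificationChannel {
    final List<String> delivered = new ArrayList<>();

    @Override
    public void deliver(String userId, String message) {
        delivered.add(userId + ": " + message);
    }
}

/** Single entry point: callers say what to notify, not how. */
final class Notifier {
    private final List<NotificationChannel> channels;

    Notifier(List<? extends NotificationChannel> channels) {
        // Defensive copy so later changes to the caller's list do not affect delivery.
        this.channels = new ArrayList<>(channels);
    }

    void notifyUser(String userId, String message) {
        for (NotificationChannel channel : channels) {
            channel.deliver(userId, message);
        }
    }
}
```

In a design like this, the token-generation code would call `notifyUser` once and the set of channels would be decided in one place.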

I don't actually follow what you mean by "Online notification is only for the use of external tool and a need to refresh the browser, after using the external tool."

@scolapasta Sorry, it is my mistake. By online notification I thought that @kcondon meant the yellow message in the browser. Now I understand that these are different things. The yellow message appears after an external tool is used, warning the user to refresh the browser, but it is not an online notification. Online notification is different. I will implement online notification for API token generation as you asked.

@scolapasta @kcondon I added Online Notification for API Token generation.

Thanks @lubitchv, we'll take a look.

@lubitchv Hi, would you update from /develop since we have just released v4.13? I'll test and merge asap. Thanks!

@kcondon looks like @lubitchv merged the latest in 9645cec

@pdurbin @kcondon Yes, I did merge.
