Dataverse: 404, not found when providing `format=original` in Data Access API

Created on 25 Nov 2019 · 8 Comments · Source: IQSS/dataverse

A 404 Not Found error is returned when attempting to download a file in its original format.

Use cases

Data has been ingested

The dataset contains at least two files: the original file and the ingested version.
For example, when a .csv file is uploaded, Dataverse ingests it and converts it to .tab, keeping the .csv as the original.

When downloading the file, we can request either the ingested version:
GET http://$SERVER/api/access/datafile/$FileId

or

the original version:
GET http://$SERVER/api/access/datafile/$FileId?format=original

The above scenario works as expected.
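To make the two request forms concrete, here is a minimal Python sketch that builds both URLs. The function name `datafile_url`, the example server and the file id are assumptions for illustration; only the `/api/access/datafile/$FileId` path and the `format=original` parameter come from this issue.

```python
from urllib.parse import urlencode


def datafile_url(server: str, file_id: int, original: bool = False) -> str:
    """Build a Data Access API download URL for a single file.

    `server` is a base URL including the scheme (hypothetical example below).
    With original=True the query string ?format=original is appended.
    """
    url = f"{server.rstrip('/')}/api/access/datafile/{file_id}"
    if original:
        url += "?" + urlencode({"format": "original"})
    return url


# Ingested version (for example the .tab file)
print(datafile_url("https://demo.dataverse.org", 42))
# Original version (for example the uploaded .csv file)
print(datafile_url("https://demo.dataverse.org", 42, original=True))
```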

Data has not been ingested

The dataset contains at least one file: the one that was uploaded.

GET http://$SERVER/api/access/datafile/$FileId returns the file

GET http://$SERVER/api/access/datafile/$FileId?format=original returns 404

Questions:

  • Is the above expected behaviour?
  • Should the API not return the only file available when retrieving the original?

Notes:

The API guides already specify that the format query parameter is only available for tabular data.

All 8 comments

@tainguyenbui I think you're saying that you are unable to download a file that wasn't ingested. No, this is unexpected. Many, many files are not ingested (only certain tabular formats are) and they should all be downloadable. Can you reproduce this problem on https://demo.dataverse.org ?

Sorry @pdurbin, I might have explained it badly. I am able to download files. However, one behaviour does not seem right to me. If you have a file that cannot be ingested and keeps its original extension, then when you call
GET http://$SERVER/api/access/datafile/$FileId?format=original
it returns a 404. I don't think that is quite right, because if the file was not processed at all, I should still be able to download it when I request its original format.

Of course, the above request without the ?format=original does work.
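One possible client-side way to cope with this, sketched below, is to ask for `format=original` first and retry without the parameter when the server answers 404. This is only an illustration, not something the Dataverse API or any client provides: `download_original_or_only`, the `Fetcher` type and `fake_fetch` are hypothetical names, and the fetcher is injected so the sketch runs without a real server.

```python
from typing import Callable, Tuple

# Hypothetical fetcher type: takes a URL, returns (HTTP status, response body).
Fetcher = Callable[[str], Tuple[int, bytes]]


def download_original_or_only(fetch: Fetcher, server: str, file_id: int) -> bytes:
    """Try ?format=original first; on 404 fall back to the plain request."""
    base = f"{server.rstrip('/')}/api/access/datafile/{file_id}"
    status, body = fetch(base + "?format=original")
    if status == 404:
        # As described in this issue, a non-ingested file answers 404 here,
        # so retry without the query parameter.
        status, body = fetch(base)
    if status != 200:
        raise RuntimeError(f"download failed with HTTP {status}")
    return body


def fake_fetch(url: str) -> Tuple[int, bytes]:
    """Simulates a non-ingested file: 404 with format=original, 200 without."""
    if url.endswith("?format=original"):
        return 404, b""
    return 200, b"raw file bytes"


print(download_original_or_only(fake_fetch, "http://example.org", 7))
```

The cost of this approach is a second request for every non-ingested file, and a file id that really does not exist still ends in an error after the retry.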

@tainguyenbui ok, I was confused by this:

[Screenshot]

Is it really a 404 in the case above?

my bad @pdurbin, I corrected it

@tainguyenbui thanks! It's all much clearer now. 😄

To boil it down... are you saying you'd like format=original to always allow you to download the original file so you don't have to think about what kind of file it is?

If so, I feel like this idea has been discussed before but I don't have any issues handy to link to. 😄

In this case, I do not want to modify an existing behaviour unless it makes sense to everyone.

I would love to discuss and understand why it would not return the actual original format if the file has not been modified.

Thanks a lot for your lightning-fast responses, @pdurbin.

@tainguyenbui well, you'll never get consensus, of course. 😄

I guess we could add a brand new query parameter rather than changing the behavior of the old one.

Do you have a workaround? From your comment at https://github.com/IQSS/dataverse/issues/6385#issuecomment-558052834 I'm guessing that you might be checking for the presence of fields like originalFileFormat or originalFormatLabel to know if the file was ingested or not.

I'm curious if you're allowing users of https://github.com/IQSS/dataverse-client-javascript to not worry about these details. You could offer them a "download original file" feature that does some checking and gives them what they want without thinking hard about this stuff. 😄

@pdurbin you are very very close to the approach we have taken. We have two different endpoints in our own backend, one finishing with the path /original.

I have also updated the client with a flag, 'getOriginalFile', which is currently false by default; when it is false, the client does not append the query param ?format=original at the end.

Right now, we are looking at the optional properties original... to determine whether we should be asking for the original or not 😬

Of course, a lot of pain could be saved if this was handled at the very end 🤣
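The workaround discussed in this thread (checking the optional original-format fields before asking for the original) could look roughly like the Python sketch below. It is an illustration under assumptions, not the actual client code: the field names `originalFileFormat` and `originalFormatLabel` are the ones mentioned earlier in the thread, while `was_ingested`, `download_path`, and the shape of the `data_file` dictionary (including its `id` key) are hypothetical.

```python
from typing import Any, Mapping

# Field names mentioned in this thread as hints that a file was ingested.
ORIGINAL_MARKERS = ("originalFileFormat", "originalFormatLabel")


def was_ingested(data_file: Mapping[str, Any]) -> bool:
    """True if the (assumed) file metadata carries an original-format field."""
    return any(data_file.get(key) for key in ORIGINAL_MARKERS)


def download_path(data_file: Mapping[str, Any], want_original: bool = False) -> str:
    """Append ?format=original only when the file appears to have been ingested."""
    path = f"/api/access/datafile/{data_file['id']}"
    if want_original and was_ingested(data_file):
        path += "?format=original"
    return path


ingested = {"id": 10, "originalFileFormat": "text/csv"}  # hypothetical metadata
plain = {"id": 11}                                       # never ingested

print(download_path(ingested, want_original=True))
print(download_path(plain, want_original=True))
```

With this check the caller can always say "give me the original" and the parameter is simply left out for files that were never ingested, which avoids the 404.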
