Dataverse: Bug in DDI Codebook Export XSLT

Created on 4 Jun 2020 · 12 Comments · Source: IQSS/dataverse

https://github.com/IQSS/dataverse/blob/66ed398483679081cd723a807ad85f35c2221097/src/main/resources/edu/harvard/iq/dataverse/codebook2-0.xsl#L180

"ICPSR" is hardcoded at Line 180. From the document format, I think the distributor must be the intended information for that place, since it's paired with the DOI. It appears on the first line of the formatted Codebook output.

I have a pull request coming with a fix to the xsl. I'm afraid I may have mucked up the order of doing this, but I did my best! It's my first Dataverse pull request.

Metadata Bug

All 12 comments

@johnhuck wrote in the PR that:

Today I find myself still pondering whether Distributor is the right information to replace with here. Hard to know what the original intention behind the stylesheet was. But since "ICPSR" was hard-coded, it must represent something invariable, so author/creator seems less likely. If there were any doubts about this, though, the "ICPSR" string could simply be snipped out of the original stylesheet, leaving the DOI presented on its own in the brackets.

I agree. Reading the conversation about that HTML Codebook in https://github.com/IQSS/dataverse/issues/739, I wonder if what's in the parentheses in that first line follows any sort of convention, or if it would be better to just remove the hardcoded ICPSR and not replace it.

Could @amberleahey and @lubitchv share their thoughts?

I tend to doubt that it follows a convention, so it's perhaps more a question of what we think Dataverse users would find most useful.

I tend to think the Distributor (Dataverse name) would be useful information at the top of the formatted Codebook HTML, as a simple way to give some basic context, but I agree that gathering different thoughts and reaching consensus is the best way.

@jggautier @johnhuck This codebook2-0.xsl was taken from the DDI Alliance website as a sample (the ICPSR codebook stylesheet) for creating HTML. That is probably why ICPSR was hardcoded there. I guess nothing prevents us from removing it or replacing it with something more meaningful related to Dataverse.

Thanks @johnhuck and @lubitchv. In the spirit of doing things in small chunks*, could the ICPSR be removed and not replaced for now?

Then investigating how to improve the Codebook HTML could be its own issue?

*small chunks is also the name of @kcondon's dog 🐶

@jggautier @lubitchv I think this issue is small enough that it would be better to decide now and then move on, rather than opening another issue. I don't know of any other improvements needed for the Codebook HTML, although I'm sure if one went looking...

If the simplest way to consensus now is to remove ICPSR, I'm cool with that. I just thought I would bring a solution along with the bug :-)

I'm inclined to not repeat the Dataverse repository name at the top of the page (it's already in the Citation section), but I don't know in what context people usually see that Codebook HTML and if/how displaying that information more prominently helps. If others have experience with how the HTML Codebook is used and think having the repository name is helpful, I'm fine with that :)

Opening another issue would be to track the work of looking for ways to improve the Codebook HTML (including learning about how it's used). One hunch is that expressing PIDs as URLs and making them clickable links might be better (for all of the reasons that DataCite recommends doing that).
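As a rough illustration of that hunch (not code from Dataverse, and the real change would live in the XSLT stylesheet rather than in Python), here is a minimal sketch of turning a DOI-style PID into a clickable https://doi.org/ link. The function name `pid_to_link` and the sample DOI are made up for the example, and it assumes the PID is a DOI rather than a Handle or other scheme.

```python
from html import escape
from urllib.parse import quote


def pid_to_link(pid: str) -> str:
    """Render a DOI-style persistent identifier as a clickable HTML link.

    Hypothetical helper for illustration only; assumes the PID is a DOI.
    """
    # Accept "doi:10.5072/FK2/ABC123", a bare "10.5072/FK2/ABC123",
    # or a PID that is already written as a doi.org URL.
    value = pid.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
            break
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    url = "https://doi.org/" + quote(value, safe="/")
    # Escape for HTML so the URL is safe in both the attribute and the text.
    return f'<a href="{escape(url, quote=True)}">{escape(url)}</a>'


# Sample value only; not a real dataset DOI.
print(pid_to_link("doi:10.5072/FK2/ABC123"))
```

The link text is the full URL rather than the bare "doi:..." form, so the identifier stays copyable as a resolvable address even when the HTML is printed or pasted as plain text.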

@jggautier I'm happy to accept your two suggestions. If you'd like me to withdraw or modify my PR, I'd appreciate some guidance through the steps I need to take. As I say, I'm still learning the ropes.

You make a good point about the opportunity to improve the HTML Codebook display with links. I find it refreshing to see all the metadata/data in that format, and I have a hunch that it's a feature more people will come to discover and appreciate over time. So perhaps a new issue is warranted. Cheers!

I agree with @jggautier's suggestion to simply remove "ICPSR " for now, the part that I've highlighted in blue below.

[Image: Screen Shot 2020-06-05 at 2 55 26 PM]

As Julian said, we could always add something back in later.

Yes, that's easy to remove in the stylesheet. I'm just finding it confusing to figure out how to modify or delete my PR.

@johnhuck you are welcome to make a fresh PR. They're free. 😄

@pdurbin OK, I think I've got one ready. Do I submit it to the Develop branch or a different branch?

develop, please! Thanks!
