Arctos: Basis of Record vs. Cataloged_Item_Type

Created on 8 Jan 2020 · 25 Comments · Source: ArctosDB/arctos

I was recently made aware of the fact that fossil specimens in Arctos are not being properly translated to aggregators. If I search GBIF for UTEP Fossils (Arctos) with BasisOfRecord = "fossil specimen", I get nothing, yet this entire collection is fossils. This is going to be an issue as ALMNH:ES and NMMNH:Paleo go into GBIF. While we could take the easy way out and just send all ES collection types as "fossil specimen", I think we should be more precise as there are fossils in other collections as well. Also see #2094

I propose that we make better use of CATALOGED_ITEM_TYPE and use the categories suggested in GBIF for Basis of Record:

Observation
Machine observation
Human observation
Material sample
Literature
Preserved specimen
Fossil specimen
Living specimen
Unknown

This would also provide better choices for cultural collections.

Aggregator issues Function-CodeTables NeedsDocumentation Priority-High dwc terms

All 25 comments

A point of information, according to the Darwin Core standard,
"Observation", "Literature" and "unknown" are not valid values for
basisOfRecord.

> just send all ES collection types as "fossil specimen",

I think that's an overly-coarse split, at best - they contain lots of casts and such, along with the occasional gooey-bits (http://arctos.database.museum/guid/UAM:ES:4588) and who knows what else.

> GBIF

I'm definitely a fan of using existing vocabulary, but first glance suggests those are overly-arbitrary terms. Do they happen to come with definitions?

At some level, this seems like something we should be pulling from existing data, rather than expecting someone to update yet another field when this changes. Denormalization is bad....

In any case, https://arctos.database.museum/info/ctDocumentation.cfm?table=CTCATALOGED_ITEM_TYPE exists and is available in the UI.

"Observation", "Literature" and "unknown" are not valid values for basisOfRecord.

@tucotuco what ARE valid values - I couldn't find anything to save my life. There are "examples" in the DwC wiki, but no list of defined values.

> I think that's an overly-coarse split, at best - they contain lots of casts and such, along with the occasional gooey-bits (http://arctos.database.museum/guid/UAM:ES:4588) and who knows what else. I'm definitely a fan of using existing vocabulary, but first glance suggests those are overly-arbitrary terms. Do they happen to come with definitions?

No definitions that I could find - I didn't have the time yesterday to write any and yes, there are probably some terms that should be added.

> At some level, this seems like something we should be pulling from existing data, rather than expecting someone to update yet another field when this changes. Denormalization is bad....

We are already denormalized. ES collections contain stuff that isn't fossil and Inv contain fossils. Sometimes this can be figured out by the "(fossil)" added to a part name, but other times not. If you can show me how this can be pulled from existing data and have it be correct 95% of the time, I'd love that, but I'm pretty sure it won't work that way.

No matter what, we need to get something to make sure that fossil specimens are designated as such. The mammal curator at NMMNH just pulled a bunch of stuff from GBIF (he needs more than just stuff in Arctos) and ended up with a bunch of fossil mice from the UTEP collection. He knew this was a problem because he is familiar, but others probably wouldn't. The date of collection for that recent fossil stuff can be misleading and this will probably lead to bad science at some point.

> ES collections contain stuff that isn't fossil and Inv contain fossils.

That's not denormalization, that's just missing the pigeonholes we've created. Denormalization is saying the same thing multiple places - being 'required' (which won't happen) to update A when you update Z.

> show me

I think that depends on how we define 'fossil.' For the purposes of GBIF, 'cataloged in an ES collection' may be sufficient. Some users will find some casts and fail to find fossils cataloged in bird collections, but that's pretty normal and may be close enough to what they want (at least for the casts).

Ideally we'd make better use of something like part preservation - that should be sufficient for fossils, but won't necessarily distinguish eg human vs. machine observations.

"Observation", "Literature" and "unknown" are not valid values for basisOfRecord.

@tucotuco what ARE valid values - I couldn't find anything to save my life. There are "examples" in the DwC wiki, but no list of defined values.

"Recommended best practice is to use the standard label of one of the Darwin Core classes." The examples contain all the currently valid values, namely:
PreservedSpecimen, FossilSpecimen, LivingSpecimen, MaterialSample, Event, HumanObservation, MachineObservation, Taxon, Occurrence
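As a rough illustration (not part of Arctos or any DwC tooling), the labels from the original proposal can be checked against the nine values listed above. The helper name `to_class_label` and its normalization rule (capitalize each word, drop the spaces) are assumptions made for this sketch.

```python
# Values listed above as currently valid for basisOfRecord.
VALID_BASIS_OF_RECORD = frozenset({
    "PreservedSpecimen", "FossilSpecimen", "LivingSpecimen", "MaterialSample",
    "Event", "HumanObservation", "MachineObservation", "Taxon", "Occurrence",
})

# Labels from the original proposal at the top of this issue.
PROPOSED = [
    "Observation", "Machine observation", "Human observation",
    "Material sample", "Literature", "Preserved specimen",
    "Fossil specimen", "Living specimen", "Unknown",
]


def to_class_label(label: str) -> str:
    """Hypothetical normalization: 'Fossil specimen' -> 'FossilSpecimen'."""
    return "".join(word.capitalize() for word in label.split())


# Proposed labels that have no counterpart in the valid list.
rejected = [p for p in PROPOSED if to_class_label(p) not in VALID_BASIS_OF_RECORD]
print(rejected)  # ['Observation', 'Literature', 'Unknown']
```

The three rejected labels are the same three flagged earlier in the thread as not valid for basisOfRecord.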

@tucotuco are there definitions for these terms?

Yes, all of them. For example, https://dwc.tdwg.org/terms/#occurrence.
There are links to all of them in the menu on the right side of that
page.

I meant these terms:

PreservedSpecimen, FossilSpecimen, LivingSpecimen, MaterialSample, Event, HumanObservation, MachineObservation, Taxon, Occurrence

DOH! Thanks!

As an art collection, we would defer to the recommendations of the Getty Categories for the Description of Works of Art (CDWA): http://www.getty.edu/research/publications/electronic_publications/cdwa/1object.html#RTFToC2a

According to the CDWA, catalog level is an indication of the level of cataloging represented by the record, based on the physical form or intellectual content of the material. Examples include: item, volume, album, group, subgroup, collection, series, set, multiples, component, box, fond, portfolio, suite, complex, object grouping, performance and items. We would primarily use item, but in some cases another catalog level may be appropriate, such as series or group.

Would item be an appropriate term to add to your list of cataloged item types, or is it too generic? If it's too generic, we would probably still need to add a different term, as I'm not sure any of the proposed ones here would work for an art collection.

Also, I don't think I understand the implications of adding new cataloged item types. How would this change things for cataloging and searching?

I just noticed that our specimens on GBIF are coming up as Preserved specimen instead of Fossil specimen. We need to find a solution for this.

> I just noticed that our specimens on GBIF are coming up as Preserved specimen instead of Fossil specimen. We need to find a solution for this.

Same here - DMNS Marine Inverts. Where do you change BasisOfRecord?

Based on all of the discussion above, I think we still need the granularity of assigning basis of record by cataloged item and the way to do that should be through cataloged item type.

> Ideally we'd make better use of something like part preservation - that should be sufficient for fossils, but won't necessarily distinguish eg human vs. machine observations.

Using "fossil" in preservation puts our basis of record for fossil material one step away from the place we should already have it - Cataloged_Item_Type where it would easily translate to DarwinCore and also provide better documentation for us. I really think we are under-utilizing this field and I suggest that we add the following terms and definitions:

Term | Definition
-- | --
living specimen | A biological specimen that is alive.
preserved specimen | A biological specimen that has been preserved.
fossil specimen | A preserved biological specimen that is a fossil.
human observation | An output of a human observation process.
machine observation | An output of a machine observation process.
item | An individual cultural object or work.

We could link these to the definitions provided by DwC or Getty.

This might also impact https://github.com/ArctosDB/arctos/issues/3164 but it might also help provide a basis for differing displays of catalog item types.

This looks reasonable . . .

> I suggest that we add the following terms and definitions

I am in favor of this

> add the following terms and definitions:

I want to advocate for using the DWC terms, but in this case they're a little wonky for humans. If we just use "PreservedSpecimen" then the mapping to DWC will be straightforward, new values won't require rebuilding code, users won't have to guess how we've translated, etc. - but it'll say "PreservedSpecimen" on records in Arctos.

If we go with eg "preserved specimen" then we do have to translate - keep our local definitions synced up with DWC, run code like below for export, etc.

I have no strong feelings, but I think it's worth discussion before we change anything.

    -- fragment of a select list: translate the local type to a DwC class label
    case
      when CATALOGED_ITEM_TYPE='specimen' then 'PreservedSpecimen'
      when CATALOGED_ITEM_TYPE='observation' then 'HumanObservation'
      else null
    end basisOfRecord,
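For comparison, here is a minimal sketch of what that export translation might look like if the lowercase terms proposed above were adopted. It runs the `case` expression against an in-memory SQLite table purely so the example is self-contained. The table name `cataloged_item`, its columns, and the sample rows are hypothetical, not the Arctos schema. Mapping `item` to null is also an assumption, since no DwC class for it was settled in this thread.

```python
import sqlite3

# Extended version of the case expression above, covering the proposed
# human-readable terms plus the two legacy values from the snippet.
EXPORT_SQL = """
select
  cataloged_item_type,
  case cataloged_item_type
    when 'living specimen'     then 'LivingSpecimen'
    when 'preserved specimen'  then 'PreservedSpecimen'
    when 'fossil specimen'     then 'FossilSpecimen'
    when 'human observation'   then 'HumanObservation'
    when 'machine observation' then 'MachineObservation'
    when 'specimen'            then 'PreservedSpecimen'  -- legacy value
    when 'observation'         then 'HumanObservation'   -- legacy value
    else null                                            -- e.g. 'item' (assumption)
  end as basisOfRecord
from cataloged_item
order by id
"""

conn = sqlite3.connect(":memory:")
# Hypothetical demo table; not the real Arctos schema.
conn.execute(
    "create table cataloged_item (id integer primary key, cataloged_item_type text)"
)
conn.executemany(
    "insert into cataloged_item (cataloged_item_type) values (?)",
    [("fossil specimen",), ("item",), ("specimen",)],
)

rows = conn.execute(EXPORT_SQL).fetchall()
for local_type, basis in rows:
    print(local_type, "->", basis)
conn.close()
```

Every new local term needs a matching `when` branch in this approach. That is the maintenance cost described above for human-readable labels.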

By wonky, do you just mean the formatting of the terms, i.e. "PreservedSpecimen" vs "preserved specimen"?

Yes, just that, no functional implications.

Add default to manage collection; the default can be overridden per record by adding the field to the bulkloader or changing it during data entry.
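One way to read that request is sketched below, under stated assumptions: a value supplied on the record (via the bulkloader or data entry) wins, and the collection default applies otherwise. The function name, the defaults shown, and the `EXAMPLE:Inv` collection are hypothetical and not taken from Arctos.

```python
from typing import Optional

# Hypothetical per-collection defaults, as might be set in Manage Collection.
COLLECTION_DEFAULTS = {
    "NMMNH:Paleo": "fossil specimen",
    "EXAMPLE:Inv": "preserved specimen",
}


def resolve_cataloged_item_type(
    collection: str, record_value: Optional[str] = None
) -> Optional[str]:
    """Return the record-level value if one was supplied, else the collection default."""
    if record_value is not None and record_value.strip():
        return record_value.strip()
    return COLLECTION_DEFAULTS.get(collection)


# No value on the record: fall back to the collection default.
print(resolve_cataloged_item_type("NMMNH:Paleo"))
# A fossil shell cataloged in an invertebrate collection: the record value wins.
print(resolve_cataloged_item_type("EXAMPLE:Inv", "fossil specimen"))
```

This keeps the per-record granularity argued for earlier in the thread while sparing most records from needing an explicit value.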

Type search field on main search page should search these terms.

@campmlc @dustymc @ccicero @ebraker @DerekSikes @mkoo @Nicole-Ridgwell-NMMNHS Please feel free to visit and comment on my submission at https://github.com/tdwg/dwc/issues/314

Looks good. We should also suggest moving the following to Preserved Specimen:

A whole organism preserved in a collection.

> Add default to manage collection

done

> adding the field to the bulkloader

done

> Type search field on main search page should search these terms.

confirmed

I think the remainder of this is for the code table folks.

Also needs documentation and announcement in Arctos Newsletter.

This all sounds like a good idea to me. Phyllis would appreciate being able to tag our fossil shells as 'fossil specimen'.
