Arctos: collection "standardization"

Created on 27 Aug 2018 · 29 Comments · Source: ArctosDB/arctos

We "standardized" collection to make this thing...

[Screenshot, 2018-08-27]

more readable. Our "standardized" data aren't.

UAM@ARCTOS>  select distinct collection from collection order by collection;

COLLECTION
------------------------------------------------------------------------------------------------------------------------
Algae specimens (ALA)
Amphibian and reptile observations
Amphibian and reptile osteology specimens
Amphibian and reptile specimens
Anatomical preparations
Archeology
Art
Arthropod tissues
Bird Observations
Bird eggs
Bird eggs/nests
Bird observations
Bird specimens
Bird tissues
Cryptogam specimens (ALA)
Earth Science
Environmental samples
Ethnology and History artifacts
Ethnology and History observations
Fish observations
Fish specimens
Host (of parasite) specimens
Insect observations
Insect specimens
Invertebrate specimens
Lepidopteran specimens
Mammal observations
Mammal specimens
Mammal tissues
Marine invertebrate specimens
Mollusc specimens
PSU Mamm
Parasite specimens
Plant specimens
Plant specimens (ALA)
Reptile specimens
Teaching specimens
Vertebrate observations
Zooplankton specimens

39 rows selected.

Can we do better?

All 29 comments

Bumping priority. We're creating new collections with new (and creative!) vocabulary. We need to either drop this idea, or standardize the vocabulary as a code table which can be accessed from the new collections creation form.

@lkvoong

What needs to be done? I am all for making these standard, but we probably need a discussion about what those standards should be for each collection type. AWG tomorrow would be good for that. Add it to the agenda if you agree, @dustymc.

@dustymc Wow, this is an old issue! Let's prioritize.

https://github.com/ArctosDB/arctos/issues/772 - I'm not sure we NEED to remove 'specimen' from there (it's data, not UI) but this is a good opportunity if someone/everyone wants to.

We need to immediately

  1. clean and standardize, or
  2. decide we're going to accept whatever gets typed into the request form

Strongly suggest we don't accept additional collection creation requests until this is resolved.

arctosprod@arctos>> select distinct collection from collection order by collection;
                collection                 
-------------------------------------------
 Algae specimens
 Algae specimens (ALA)
 Amphibian and reptile observations
 Amphibian and reptile osteology specimens
 Amphibian and reptile specimens
 Amphibian and Reptile specimens
 Amphibian specimens
 Anatomical preparations
 Aquatic macroinvertebrate specimens
 Archaeology
 Archeology
 Art
 Arthropod tissues
 Bird eggs
 Bird eggs/nests
 Bird observations
 Bird Observations
 Bird specimens
 Bird tissues
 Bivalve specimens
 Cryptogam specimens (ALA)
 Earth Science
 Egg and nest specimens
 Egg specimens
 Entomology specimens
 Environmental samples
 Ethnology
 Ethnology and History artifacts
 Ethnology and History observations
 Fish observations
 Fish specimens
 Fossil specimens
 Geology specimens
 Herbarium
 Herbarium specimens
 History and Ethnology
 Host (of parasite) specimens
 Insect observations
 Insect specimens
 Invertebrate specimens
 Invertebrate Zoology
 Lepidopteran specimens
 Mammal observations
 Mammal specimens
 Mammal tissues
 Marine invertebrate specimens
 Mollusc specimens
 Paleontology specimens
 Parasite specimens
 Plant observations
 Plant specimens
 Plant specimens (ALA)
 PSU Mamm
 Reptile specimens
 Teaching and Education specimens
 Teaching specimens
 Vertebrate observations
 Zooplankton specimens

Could we create a drop-down with options (cleaned up to eliminate misspellings, different capitalizations, etc.), and a 'remarks' field if someone wants something different?

So if a mammal collection, choose "Mammal specimens" rather than type it in?
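
As a rough illustration of the cleanup step, here is a minimal Python sketch that flags values differing only by capitalization or stray whitespace. The sample values are copied from the query output above; the function name `case_variants` is made up for this example and is not part of Arctos.

```python
from collections import defaultdict

# A few values copied from the query output in this thread (not the full list).
collections = [
    "Amphibian and reptile specimens",
    "Amphibian and Reptile specimens",
    "Archaeology",
    "Archeology",
    "Bird observations",
    "Bird Observations",
    "Mammal specimens",
]


def case_variants(values):
    """Group values that differ only by capitalization or surrounding whitespace."""
    groups = defaultdict(set)
    for value in values:
        groups[value.strip().casefold()].add(value)
    # Keep only the normalized keys that map to more than one distinct spelling.
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


variants = case_variants(collections)
for key, spellings in variants.items():
    print(key, "->", spellings)
```

This only catches capitalization and whitespace differences. Spelling variants such as "Archaeology" and "Archeology" normalize to different keys, so they would still need a human decision.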

Can you put values into a Google doc so we can standardize? Doesn't look too difficult to do that. Maybe a top agenda item for our code table meeting on Thursday @Jegelewicz?

There's no relevant remarks field - we can standardize (eg, build a code table) or just take what comes.

Here's CSV.

temp_itsamess.csv.zip

I'm curious as to why this matters. Other than provide a super-brief description of the collection on the portal page, where else is it used? Does it really NEED to be standardized?

In any case, I have created a Google Doc for discussion tomorrow, but I'd really like to know who cares and why.

why this matters.

All linked above, I believe. No, it doesn't NEED to be standard, but trying (not very successfully) to standardize it has become a significant, time-consuming part of creating collections. "Just take whatever" is an acceptable solution to us. Standardization is acceptable. There are functional implications, as described in linked issues.

I think a standardized code table makes sense for this.

See updated Google Doc. Bring to AWG issues for approval.

@Jegelewicz AWG or the Issues Meeting? Probably AWG since it needs full approval?

Yes, full AWG. Thanks.

Everything is used - it was pulled directly from table collection.

UAM:ES is fossil and geology (and gems, and ....).

Isn't everything "education" at some level? Not sure that's necessary in "teaching."

I dislike the specimen/observation distinction, but we may be stuck with it anyway. Some collections catalog photos as "specimens", others catalog ear snips as "observations" - it's a mostly-arbitrary distinction as far as the contents of the collection are concerned, but it may have other meaning (less curation??).

Yes I'm sure entomology!=invert (maybe unless we're also smooshing birds into "verts"...). Big-picture I don't see much need to force anything, if two collections are slightly different there's probably no reason to dispute that.

If this is headed towards becoming a code table then it will need definitions - that might help flesh out the terminology.
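
If the code-table route were taken, the idea could be sketched roughly as below. This is only an illustration: the terms come from this thread, but the definitions are placeholders (no wording has been agreed), and `CODE_TABLE`, `validate_collection`, and `suggest_term` are hypothetical names, not existing Arctos objects.

```python
# Hypothetical code table: term -> definition.
# Terms are taken from this thread; definitions are placeholders only.
CODE_TABLE = {
    "Bird specimens": "placeholder definition",
    "Mammal specimens": "placeholder definition",
    "Mammal observations": "placeholder definition",
}


def validate_collection(value, code_table=CODE_TABLE):
    """Return value if it is an exact code-table term; otherwise raise ValueError."""
    if value not in code_table:
        raise ValueError(f"{value!r} is not in the code table")
    return value


def suggest_term(value, code_table=CODE_TABLE):
    """Return the code-table term matching value case-insensitively, or None."""
    wanted = value.strip().casefold()
    for term in code_table:
        if term.casefold() == wanted:
            return term
    return None


print(validate_collection("Mammal specimens"))
print(suggest_term("bird Specimens"))
```

Writing the definitions first would force the question of what separates, say, "Mammal specimens" from "Mammal observations" before any term is accepted.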

Here's a different view.

temp_gpc.csv.zip

Who is using the three that are 'tissues'?

Arthropod tissues
Bird tissues
Mammal tissues

We didn't see that on the portal page.

We have a good definition of 'observation' in specimen_event_type (bold is mine): Specimen was detected and not killed or removed from context; No biological samples were taken. Human sightings, camera traps, and GPS telemetry data are appropriate here.

If it's an occurrence record with no physical material collected and accessioned, then it's an observation.

Who is using

It's in the CSV - DGR at least

portal page.

Not all are public

'observation' in specimen_event_type

That's an entirely different thing; these are administrative entities, SpecEvent is a determination. We can't (and wouldn't if we could, collections really are arbitrary and administrative) force MVZ to recatalog https://arctos.database.museum/guid/MVZObs:Mamm:12, or disallow someone cataloging a photo in a "real" collection.

If it's an occurrence record with no physical material collected and accessioned, then it's an observation.

That does not dictate into what collection that event is recorded.

I thought DGR was obsolete? In any case, tissues are still specimens, so can, e.g., 'Bird tissues' be changed to 'Bird specimens', @campmlc?

Yep. Still using DGR as a cm tool. May need to revive it for real, actually. But ok with the language change.

Added a placeholder with the GitHub issue link and @Jegelewicz's Google Doc link to the agenda draft for next week.

OK, I changed Arthropod tissues, Bird tissues, and Mammal tissues to Arthropod specimens, Bird specimens, and Mammal specimens. @campmlc are all three of those DGR?

Possibly? We have DGR Ento, Bird, and Mamm.

I dislike the specimen/observation distinction, but we may be stuck with it anyway. Some collections catalog photos as "specimens", others catalog ear snips as "observations" - it's a mostly-arbitrary distinction as far as the contents of the collection are concerned, but it may have other meaning (less curation??).

Me too, but then some people catalog herps in a mammal collection, so the whole thing seems not-so-great?

This is pertinent?

https://www.tdwg.org/conferences/2020/working-sessions/#itg03:%20collections%20descriptions%20task%20group

Issues meeting:

  • no code table
  • pull existing values into pre_collection form

AWG to work on clean up with individual collections (capitalization etc.)

And we will create collections with whatever someone types in there.

There's a shined-up collection request form in test.

  • The documentation is a little harder to avoid.
  • Collection, institution, and institution_acronym can be selected from existing values
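
For anyone curious how "selected from existing values" might work, here is a minimal sketch, not the actual form code. It uses an in-memory SQLite database as a stand-in for the real one; the table and column names follow this thread, the sample rows are a few values from the query output above, and the helper `existing_values` is hypothetical.

```python
import sqlite3

# Columns the form lets people pick from, per the list above.
PICKABLE_COLUMNS = {"collection", "institution", "institution_acronym"}


def existing_values(conn, column):
    """Return the distinct non-null values of one column of table `collection`."""
    # Column names cannot be bound as parameters, so check against a whitelist.
    if column not in PICKABLE_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    rows = conn.execute(
        f"select distinct {column} from collection "
        f"where {column} is not null order by {column}"
    )
    return [row[0] for row in rows]


# Stand-in database (assumption: SQLite here, not the production database).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table collection "
    "(collection text, institution text, institution_acronym text)"
)
conn.executemany(
    "insert into collection (collection) values (?)",
    [
        ("Mammal specimens",),
        ("Bird specimens",),
        ("Mammal specimens",),
        ("Fish specimens",),
    ],
)

options = existing_values(conn, "collection")
print(options)
```

Offering existing values this way nudges new requests toward terms already in use while still allowing whatever someone types.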

Todo:

  • make sure right and left are in same order
  • onclick: highlight stuff to right when left is clicked

add something about grbio in pre-form check - @campmlc @mkoo need verbiage

Form is reordered and highlighting.

I still have no idea what grbio can do for us....
