Ref: https://github.com/ArctosDB/arctos/issues/1852#issuecomment-532857238
Add taxon concepts as an enhancement to, rather than a replacement for, the current identification-->taxon (by way of preferred classification) pathway.
For the initial purposes of this project, taxon concepts are defined as the intersection of names and publications (plus relationships between concepts).
There will be a new field in identifications, which can be ignored, that can reference taxon concepts, and some way to select concepts for it.
Is this fundamentally different from id_sensu (identification.publication_id FKEY-->publication)? Both involve publications and taxa, albeit less directly in the case of id_sensu. Perhaps id_sensu can eventually be merged into taxon_concepts. Build them in parallel to test the idea, and don't lose sight of this.
Failing a merge, we need to develop disambiguating documentation.
-- Specimens and identifications per collection that already use id_sensu (identification.publication_id)
select
  guid_prefix,
  count(distinct(identification.collection_object_id)) numberSpecimens,
  count(*) numberIdentifications
from
  collection,
  cataloged_item,
  identification
where
  collection.collection_id=cataloged_item.collection_id and
  cataloged_item.collection_object_id=identification.collection_object_id and
  identification.publication_id is not null
group by
  guid_prefix
order by
  guid_prefix
;
GUID_PREFIX  NUMBERSPECIMENS  NUMBERIDENTIFICATIONS
-----------  ---------------  ---------------------
ALMNH:ES                   4                      5
CHAS:Bird                  1                      1
CHAS:Ento                  4                      4
CHAS:Mamm                 17                     17
DGR:Bird                   6                      6
DGR:Mamm                   1                      1
DMNS:Bird                 17                     26
DMNS:Mamm                127                    130
KNWR:Ento                250                    283
KWP:Ento                  96                     96
MLZ:Bird                1134                   1231
MSB:Bird                  10                     10
MSB:Fish                   3                      3
MSB:Host                  24                     24
MSB:Mamm                4830                   6010
MSB:Para                 282                    297
MVZ:Bird                 308                    308
MVZ:Herp                3604                   3724
MVZ:Hild                   1                      1
MVZ:Mamm                3014                   3276
MVZObs:Herp                1                      1
UAM:Bird                 153                    156
UAM:ES                   749                    811
UAM:Ento                5153                   5154
UAM:Fish                   1                      1
UAM:Herb                 416                    416
UAM:Herp                   3                      3
UAM:Inv                   15                     15
UAM:Mamm                6027                   7217
UAMObs:Ento             1715                   1716
UAMObs:Mamm                7                      7
UAMb:Herb                  5                      5
UCM:Herp                   2                      2
UMNH:Mamm                171                    171
UTEP:ES                  856                    897
UTEP:Ento                 21                     21
UTEP:Herp                369                    464
UTEP:HerpOS                3                      3
UTEP:Mamm                 61                     62

39 rows selected.
Great summary @dustymc, thanks. Here is a sketch of my thoughts for the additional two tables and the link to ID. The asterisk indicates additions.
                                     =======================
                                     identification
                                     -----------------------
===================                  id (PK)
taxon_concept *                      agent_id (FK) ---->
-------------------                  name_id (FK) ---->
id (PK)        <============+==+---- taxon_concept_id * (FK)
publication_id (FK) ---->   |  |     =======================
name_id (FK) ---->          |  |
===================         |  |
                            |  |
                            ^  ^
=======================     |  |
taxon_concept_rel *         |  |
--------------------        |  |
id (PK)                     |  |
from_tc_id (FK)         ----+  |
to_tc_id (FK)           -------+
relationship
according_to_pub_id (FK) ---->
========================
The taxon_concept_rel.relationship field could be an enum that includes set relationships: ‘includes’, ‘is included in’, ‘is congruent with’, ‘overlaps’, ‘intersects’, ‘is disjunct with’. Another field could possibly be created with the more vague synonymy terms: ‘is synonym of’, ‘is pro parte synonym of’, ‘is heterotypic synonym of’, ‘is homotypic synonym of’, etc.
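To make the diagram and the relationship vocabulary concrete, here is a minimal sketch of the two starred tables. It assumes SQLite (through Python's `sqlite3`) rather than the Oracle or PG databases Arctos actually runs on. Column names follow the diagram; the CHECK constraint standing in for the enum, the helper names, and all sample ids are illustrative assumptions, not the Arctos implementation.

```python
import sqlite3

# Set-relationship terms listed in the comment above. A CHECK constraint
# stands in for the enum here (assumption: Arctos may use a code table instead).
SET_RELATIONSHIPS = (
    "includes", "is included in", "is congruent with",
    "overlaps", "intersects", "is disjunct with",
)

def build_schema(conn):
    """Create the two proposed tables in an SQLite connection."""
    terms = ", ".join(f"'{t}'" for t in SET_RELATIONSHIPS)
    conn.executescript(f"""
        CREATE TABLE taxon_concept (
            id             INTEGER PRIMARY KEY,
            publication_id INTEGER NOT NULL,  -- FK to publication (not modeled here)
            name_id        INTEGER NOT NULL   -- FK to taxon name (not modeled here)
        );
        CREATE TABLE taxon_concept_rel (
            id                  INTEGER PRIMARY KEY,
            from_tc_id          INTEGER NOT NULL REFERENCES taxon_concept(id),
            to_tc_id            INTEGER NOT NULL REFERENCES taxon_concept(id),
            relationship        TEXT NOT NULL CHECK (relationship IN ({terms})),
            according_to_pub_id INTEGER NOT NULL
        );
    """)

def rejects_unknown_term(conn):
    """Return True if a relationship outside the vocabulary is refused."""
    try:
        conn.execute(
            "INSERT INTO taxon_concept_rel VALUES (2, 2, 1, 'sort of like', 200)"
        )
    except sqlite3.IntegrityError:
        return True
    return False

conn = sqlite3.connect(":memory:")
build_schema(conn)
# Two hypothetical concepts for the same name, from different publications.
conn.executemany("INSERT INTO taxon_concept VALUES (?, ?, ?)",
                 [(1, 100, 7), (2, 200, 7)])
# Publication 200 asserts that concept 1 includes concept 2.
conn.execute("INSERT INTO taxon_concept_rel VALUES (1, 1, 2, 'includes', 200)")
```

Calling `rejects_unknown_term(conn)` shows that a term outside the list is refused, which is the behaviour an enum or code table would give. The vaguer synonymy terms could go in a second column with its own vocabulary.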
See here for ongoing discussion of these terms by the @tdwg Taxon Names and Concepts group. The group is creating a new taxonomic names and usages standard, based on the older Taxon Concept Transfer Schema.
See here for an example of the kind of data that could go in the taxon_concept and taxon_concept_rel tables.
@camwebb do you have existing data or is that something that'll be created in Arctos?
If you do have data could you pass it along? If possible, building this around real data will probably stop some problems before they come to exist. I can provide a transfer site if it can't be attached here or emailed.
The model seems right, and I think the relationships (possibly in conjunction with a 'local' publication) could be used to merge "mini-concepts" into more complex concepts ("these 50 pubs all agree....").
I will need definitions for the relationship terms. I think they'll all be in one code table, and it'll be up to the user to use them properly (eg, not use something vague when you know something specific). Reasonable?
@dustymc At first we'll be generating data locally in our DB, and importing it into Arctos (pre-aligned with the names in Arctos). The best test data would be from the example above. I've substituted numeric keys and excluded all the names that don't match 'Arctos plants names' and pasted the 4 tables as plain text here. Let me know if this will not work and I'll wrangle it into another format.
@dustymc any progress?
Yea I've been playing around a bit when I can't stand to look at PG triggers and such any longer - I think the basics are more or less there, but I need some arctosified data to really know. http://arctos.database.museum/name/Claytonia%20koliana should now have some new toys that you can play with.
@dustymc Great to see this test/demo!
On the Taxonomy Committee phone call just now I discovered that in Arctos there is no single, canonical occurrence of a name + author_string combination. That combination potentially lives in many places in the classification table, no? So there's not yet a name+author_string_id to link to, as per the diagram above.
It seems from the demo that you've dealt with this by using the name + original_publication as a 'place marker' in place of name + author_string and name + later_publication as place marker for subsequent taxon concepts dealing with the same original name. Seems like a good solution given the other tables, but... I'm not sure it'll do what we need it to do. The author strings in botany can get quite complex (e.g., Pulsatilla dahurica (Fisch. ex DC.) Spreng.) and can't be simply captured by a place marker publication (e.g., the Spreng. publication in this example). Also, the data won't come in this publication-centric form. We need a way to refer to the author string.
My suggestion is to form the concept out of a triplet: name_id, publication_id, "author_string". What do you think?
> lives in many places in the classification table
For not-us data, yes. For local data, author + infraspecific rank auth + name should all be rolled up in display_name (which is autogenerated and Code-aware) when those data are available. http://arctos.database.museum/name/Pulsatilla%20dahurica#ArctosPlants
Yes I added "concept label" as a band-aid to display what I think you want to display. It could be pulled (and maintained, I think) from the collection's preferred classification?
What exactly do you mean by "publication-centric form"?
This is essentially implemented as a triplet (with a pkey)
UAM@ARCTOS> desc taxon_concept;
 Name               Null?     Type
 -----------------  --------  -------------
 TAXON_CONCEPT_ID   NOT NULL  NUMBER
 TAXON_NAME_ID      NOT NULL  NUMBER
 PUBLICATION_ID     NOT NULL  NUMBER
 CONCEPT_LABEL      NOT NULL  VARCHAR2(255)
It's easy enough to relabel CONCEPT_LABEL->author_string (or change the table at this point) if that does what you're suggesting.
> What exactly do you mean by "publication-centric form"?
Moot point now, since I see what you are doing, but to be clear, I think the data will be coming in in this form:
1. Concept = tmp_concept_id (int) +
             name ( = arctos_name_id (int) + author_string (string) ) +
             publication (long string)
2. ConceptRelationship = from_tmp_concept_id (int) +
                         to_tmp_concept_id (int) +
                         relationship (string) +
                         according_to_publication (long string)
I like the solution you give above. Are you storing the complete name+author_string in CONCEPT_LABEL or just the author_string? CONCEPT_LABEL is a good choice of name.
The hard part of importing seems to be: 1) matching the incoming author_string to any existing display_name in Arctos plants (many will not match exactly), and 2) matching incoming publication strings to existing publication records in Arctos.
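One way to approach the first matching problem is approximate string matching. The sketch below uses Python's standard `difflib` and is only an illustration: the function names, the 0.85 cutoff, and the sample list are assumptions, and real author strings would likely need botanical-aware normalization before any match is trusted.

```python
import difflib

def normalize(label):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(label.lower().split())

def match_display_name(incoming, display_names, cutoff=0.85):
    """Return the closest existing display_name, or None if nothing is close enough."""
    by_norm = {normalize(d): d for d in display_names}
    hits = difflib.get_close_matches(
        normalize(incoming), list(by_norm), n=1, cutoff=cutoff
    )
    return by_norm[hits[0]] if hits else None

# Stand-in for display_name values pulled from Arctos plants (hypothetical list).
existing = [
    "Pulsatilla dahurica (Fisch. ex DC.) Spreng.",
    "Claytonia arctica Adams",
]

# An incoming string that differs only by a missing full stop still matches.
near_miss = match_display_name("Pulsatilla dahurica (Fisch. ex DC) Spreng.", existing)
# An unrelated name falls below the cutoff and is left for manual review.
no_match = match_display_name("Quercus alba L.", existing)
```

Anything returned as `None` would presumably go to a review queue rather than being created automatically.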
> CONCEPT_LABEL
It's just free-text for now. I can probably do more if it makes something easier.
> matching the incoming author_string to any existing display_name in Arctos plants (many will not match exactly),
Yes! It would be at least another proposal's work, but at some point we might consider hooking into Agents so we don't have to deal exclusively with strings.
> matching incoming publication strings
Yea that's going to be sort of a nuisance too. Recent (and increasingly not-so-recent) publications should have DOIs so we might find some magic in Crossref's API or similar. Older publications are probably going to be a little ugly, but I think that's OK too - the cleanup has to start somewhere.
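As a rough illustration of the DOI route, the sketch below pulls a DOI out of an incoming publication string and builds the Crossref REST API lookup URL for it, without making any network call. The regular expression is a common approximation for modern DOIs and will miss some legacy forms; the function names and the sample citation are hypothetical. How a Crossref record would then be reconciled with an Arctos publication is left open.

```python
import re
from urllib.parse import quote

# Approximate pattern for modern DOIs: "10.", a 4-9 digit registrant code,
# a slash, then a suffix without whitespace or quoting characters.
DOI_RE = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+', re.IGNORECASE)

def extract_doi(citation):
    """Return the first DOI-like token in a citation string, or None."""
    m = DOI_RE.search(citation)
    if not m:
        return None
    # Trailing punctuation usually belongs to the sentence, not the DOI.
    return m.group(0).rstrip(".,;)")

def crossref_url(doi):
    """Build the Crossref REST API URL for a single work."""
    return "https://api.crossref.org/works/" + quote(doi, safe="/")

# Hypothetical incoming publication strings.
recent = "Hypothetical Author 2019. A made-up title. doi:10.1234/example.5678."
older = "Yurtsev 1981"

doi = extract_doi(recent)
url = crossref_url(doi)
```

Strings like `older`, where `extract_doi` returns `None`, are the ones that would need manual or fuzzy matching.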
@ArctosDB/taxonomy asks: possible to relate concepts to classifications when they are different?
Huh? "Concepts" as currently defined in Arctos are the intersection of classifications and publications.
As discussed on the call today, Dusty's additions for Taxon Concepts are working well. See this example for Claytonia arctica (http://arctos.database.museum/name/Claytonia%20arctica) and scroll down to Concepts.
I asked Dusty to add the link from Identification to Concept (as a new field in Identification). Please weigh in here if you have any reservations.
Cam,
Nice!
But...
Claytonia arctica Adams sunsu Yurtsev 1981
What is sunsu? Shouldn't that be sensu?
-D
(In reply to Cam Webb's comment of Wed, Feb 19, 2020, 3:39 PM.)
--
Derek S. Sikes, Curator of Insects
Professor of Entomology
University of Alaska Museum
Oops - typo. But I guess this shows up a minor limitation. The concept label is an HTML string that the user creates. E.g., <i>Claytonia arctica</i> Adams <i>sensu</i> Porsild 1974. So prone to errors.
> Identification
If it looks like the core model is working I'll go ahead and add that to PG with the other ID changes I need to make. If necessary we can talk about patching it back in to production - hopefully we'll be in PG soon and won't have to....
> typo
I'm up for clever ideas, both on the name/label itself and how things get displayed/work/whatever. The most obvious generated "concept name" - <i>Claytonia arctica</i> Adams <i>sensu</i> Porsild 1974 - isn't necessarily unique, and I'm running under the vague idea that "labels" would be cleverly named by Curators. I don't know how that'll line up with reality.
Can we force taxon concepts to be entered into a template form with the <i> and sensu and spaces pre-filled? That will at least eliminate those potential typos. And then pull the name and pub from an Arctos drop-down as in data entry?
(In reply to dustymc's comment of Wed, Feb 19, 2020, 11:16 PM.)
> template form with the <i> and sensu and spaces pre-filled?
We can do WHATEVER, including not use that format, or not use any consistent format, or not use any label at all, or ....
> pull the name and pub
Those are data objects, they can't be a problem here.
I notice a lot of inconsistencies in the name "Adams".
> pull the name
> a lot of inconsistencies in the name
Err - maybe I'm lost. The taxon name is a data object and so easy/unambiguous/etc. The author name is a string. I could do magic with author_text, but...
> name "Adams".
It's a string. Inconsistency is what strings do. If that's undesirable, string is the wrong datatype. We certainly don't have the resources to do anything about that, other than temper our expectations....
Unless we create all authorities as agents . . .
(In reply to dustymc's comment of Thu, Feb 20, 2020, 10:03 AM.)
> authorities as agents
https://github.com/ArctosDB/arctos/issues/2267#issuecomment-542927279
I've been hoping to shame someone who's had the $$ (and taxonomic focus) into doing that for quite some time. Maybe that's the wrong outlook and we should write a grant to do something ourselves. I'm still not crazy about the idea of inadvertently becoming a "taxonomic authority!"
Back to concepts, I can't add the ID link (=change structure) without breaking my migration scripts, so I'll hold off until directed otherwise and hope we quickly find a way to get to PG.
> @campmlc: Can we force taxon concepts to be entered into a template form with the <i> and sensu and spaces pre-filled?
I agree here. The taxon name in the string could be auto-filled from the taxon name from the page, and the _sensu_ publication could be filled from the selected publication. This leaves only the author string 'free'. I think it's important to leave this free and _not_ import it from (and link it to) a classification, because there may be obscure authors that don't appear in any classification.
So the fields for a concept record would be, e.g., concept_id, taxon_name_id, author_string, publication_id, from which a generated label would be displayed: <i>taxon_name</i> author_string <i>sensu</i> publication
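A small sketch of how such a generated label might be assembled from the separate fields, so that the `<i>` tags, spaces, and the word sensu are never typed by hand. The function and parameter names are hypothetical, and `publication_short` assumes a short citation form (author plus year) is available for the selected publication.

```python
from html import escape

def concept_label(taxon_name, author_string, publication_short):
    """Build '<i>name</i> author <i>sensu</i> publication' from its parts."""
    return (
        f"<i>{escape(taxon_name)}</i> {escape(author_string)} "
        f"<i>sensu</i> {escape(publication_short)}"
    )

label = concept_label("Claytonia arctica", "Adams", "Porsild 1974")
# label == "<i>Claytonia arctica</i> Adams <i>sensu</i> Porsild 1974"
```

As noted earlier in the thread, a label built this way is not necessarily unique, so it would serve as a display string rather than as a key.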
> @dustymc: Back to concepts, I can't add the ID link (=change structure) without breaking my migration scripts, so I'll hold off until directed otherwise and hope we quickly find a way to get to PG.
Sounds good. No hurry.
I'm still not 100% comfortable with the idea that labels can be fully autogenerated - we can easily revisit that if I'm wrong - so I...
Hopefully that'll add up to two clicks and a publication pick most of the time.
Again, happy to revisit any or all of that if I'm being too paranoid.
Added taxon_concept_id references taxon_concept(taxon_concept_id) to identification
Rebuilt editidentification to use concepts
Added concepts to specimendetail
Added concepts to "previousidentifications"
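The structural change in the first item can be sketched as a nullable foreign key, which is what lets existing identifications ignore the new field. This again assumes SQLite via Python's `sqlite3` for the sake of a runnable example. The `taxon_concept` columns follow the `desc` output earlier in the thread; the trimmed-down `identification` columns and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taxon_concept (
        taxon_concept_id INTEGER PRIMARY KEY,
        taxon_name_id    INTEGER NOT NULL,
        publication_id   INTEGER NOT NULL,
        concept_label    TEXT NOT NULL
    );
    CREATE TABLE identification (
        identification_id INTEGER PRIMARY KEY,
        scientific_name   TEXT NOT NULL,
        -- Nullable: identifications without a concept simply leave it empty.
        taxon_concept_id  INTEGER REFERENCES taxon_concept(taxon_concept_id)
    );
    INSERT INTO taxon_concept VALUES
        (1, 7, 100, '<i>Claytonia arctica</i> Adams <i>sensu</i> Porsild 1974');
    INSERT INTO identification VALUES (10, 'Claytonia arctica', 1);
    INSERT INTO identification VALUES (11, 'Claytonia arctica', NULL);
""")

# LEFT JOIN keeps identifications that have no concept attached.
rows = conn.execute("""
    SELECT i.identification_id, c.concept_label
    FROM identification i
    LEFT JOIN taxon_concept c ON c.taxon_concept_id = i.taxon_concept_id
    ORDER BY i.identification_id
""").fetchall()
```

Here `rows` holds one identification with its concept label and one with `None`, which is how a specimen detail page could show concepts only where they exist.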
@dustymc I tried out the Taxon Concept Creator, and the Identify to Taxon Concept and they work well. Thank you! I think we now have full functionality to record any imported TCs and TCRels from our Flora of Alaska project, and to edit/manage them in Arctos.
Would you like me to edit the user manual to reflect this new functionality?
Please, and YAY!
@dustymc Finally made some edits to the Documentation wiki, but I seem to have lost write access to the Github repo. Could you please re-authorize me? Thanks.
@mkoo help?
@dustymc, @mkoo... ping :-)
@camwebb can you access the docs now?
@dustymc Yup. Changes pushed and appearing in handbook. Thanks.