I've normally been using the Paleobiology Database classification trees to edit my specimen classifications when they are available through Arctos. Most of the time there is an author associated with the name.

After cloning, however, the author is not automatically filled in on the "edit classification" screen.


Is there a way to fix this, @dustymc?
Terms carried over are limited to https://arctos.database.museum/info/ctDocumentation.cfm?table=CTTAXON_TERM (and you should be getting a big red warning to that effect).
I might be able to guess that "whatever they use" means "our equivalent" - there's no standardization of terms from GN - but in this case I don't think we have a local equivalent, and I don't think I can reliably extract what we need from strings like that, so I suspect this would lead to the introduction of malformed data.
I would still prefer to deprecate that form and push all updates through the hierarchical editor, which produces consistent data.
I'm still unclear on the difference between this form and the hierarchical
editor. Do we have documentation that explains this a bit more? Or should
we schedule a taxonomy meeting/training?
documentation
I don't think we have a comparison.
taxonomy meeting/training
Yes!
difference
In short,
Hierarchical data are inherently normalized. A term occurs once, and each term has exactly zero-or-one parents.
pick one
is the only possible organization. That won't work for all collections and users (some folks NEED the non-hierarchical data - or at least they said they did so we built the structure to support it!) - but when it does work it's easy to be consistent (or impossible to be inconsistent), which leads to users finding what they're looking for.
It's also easier to manage. When "pick one" becomes "pick another" you just update the parent of Neotoma and all children (subfamily, species, subspecies, etc. - ALL of them!) automagically follow along.
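A minimal sketch of that re-parenting behaviour, assuming a plain in-memory parent map rather than the actual Arctos tables. The terms and the move to a different subfamily are illustrative only; `parent_of` and `lineage` are hypothetical names.

```python
# Hypothetical model: every term appears once and points at zero-or-one parent.
parent_of = {
    "Cricetidae": None,
    "Neotominae": "Cricetidae",
    "Sigmodontinae": "Cricetidae",
    "Neotoma": "Neotominae",
    "Neotoma cinerea": "Neotoma",
    "Neotoma cinerea cinerea": "Neotoma cinerea",
}

def lineage(term):
    """Walk parent links from a term up to the root."""
    path = []
    while term is not None:
        path.append(term)
        term = parent_of[term]
    return path

before = lineage("Neotoma cinerea cinerea")
# "Pick one" becomes "pick another": a single edit to the parent of Neotoma.
parent_of["Neotoma"] = "Sigmodontinae"
after = lineage("Neotoma cinerea cinerea")
```

No child record is touched, yet every descendant of Neotoma now resolves through the new parent, which is why inconsistency is hard to introduce in this model.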
The hierarchical tool is NOT so easy to set up. Inconsistencies in the data - like those inevitably introduced by the single-record tool - make import difficult, it's (very purposefully) not possible to edit single records, etc. If all local edits went through the hierarchical tool, those would mostly be one-time issues. As long as we have the single-record tool, those inconsistencies will continue to reappear.
I guess my only question was if we could automate the non-classification terms. In my example the name string contains "(Morton 1842)". If that cannot be pulled, I would have to manually enter the author_text, correct?
automate the non-classification terms.
I see no local equivalents to "name string," so I'm not sure what you're asking.
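For illustration only, here is a rough sketch of what pulling an author out of a name string might look like, assuming the author always appears as a trailing "Author YYYY" token, optionally in parentheses. That assumption is exactly the fragile part: the pattern, the `split_author` helper, and the placeholder name "Aus bus" are all hypothetical, and strings that deviate from the pattern return nothing rather than a guess.

```python
import re

# Assumed shape: "<name> <Author ... YYYY>", author optionally parenthesized.
AUTHOR_PATTERN = re.compile(
    r"(?P<name>.+?)\s+(?P<author>\(?[A-Z][^()]*\b\d{4}\)?)"
)

def split_author(name_string):
    """Return (name, author_text), or None when the string does not fit."""
    match = AUTHOR_PATTERN.fullmatch(name_string.strip())
    if match is None:
        return None
    return match.group("name"), match.group("author")

parsed = split_author("Aus bus (Morton 1842)")
unparsed = split_author("Aus bus (Morton in Smith)")
```

Anything without a four-digit year, or with nested parentheses, falls through to `None`, so a real extraction would still leave a pile of names to fill in by hand.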
I think we should set this up as part of a larger discussion/training on
the pros/cons of the hierarchical editor vs. alternatives. I think we all
agree that there are usage problems with the current interface. Adding to
AWG agenda.
Thanks, AWG discussion seems very useful. Perhaps we should extend an invitation to anyone who wants it?
More at https://github.com/ArctosDB/arctos/issues/1609
Here's my take:
I see a lot of negatives and few real positives to keeping the single-record editor around. Being able to quickly/easily add data of limited utility doesn't seem terribly useful to me. I'm not looking at the world from the perspective of a collection manager either.
There are currently 2,450,310 names in Arctos. 11,252 of them have no local classification data whatsoever, 183,429 have no term ranked "kingdom," 170,968 have no class term, and 88,856 have no order term. I have no idea what percentage of classifications are inconsistent, but it's significant. I don't think I've ever tried to use the hierarchical tool without finding a few outliers.
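Counts like these could be produced along the following lines. This is a sketch over a hypothetical in-memory structure (name mapped to a list of rank/term pairs), not the query actually run against Arctos; whether names with no classification at all should also be counted under each missing rank is an assumption here (they are not).

```python
from collections import Counter

def missing_rank_counts(classifications, ranks=("kingdom", "class", "order")):
    """Count names with no classification and names lacking each listed rank."""
    counts = Counter()
    for name, terms in classifications.items():
        if not terms:
            counts["no classification"] += 1
            continue  # assumption: not also counted under each missing rank
        present = {rank for rank, _term in terms}
        for rank in ranks:
            if rank not in present:
                counts["no " + rank] += 1
    return counts

# Hypothetical sample data; "Aus bus" is a placeholder name.
sample = {
    "Microtus cautus": [("kingdom", "Animalia"), ("class", "Mammalia"), ("order", "Rodentia")],
    "Placenticeras": [("class", "Cephalopoda"), ("order", "Ammonitida")],
    "Aus bus": [],
}
result = missing_rank_counts(sample)
```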
The data from GlobalNames provides a path to most (??) of the specimens which would otherwise be obscured by our messy local data, which I think makes this much less critical than it would be otherwise, but that also only works from one search field ("Any taxon") - it won't help a user who searches for, eg, a genus (which will probably work) and a class (eg, to disambiguate homonyms) - there's a decent chance at least a couple species in the genus won't have consistent class data and so would be excluded from that search.
I have a (crazy?) suggestion. You can create taxa downloads from PBDB. Dusty is correct that classifications are not necessary to identify specimens, but perhaps an occasional list of names without classifications from UNM ES could be generated, then the classifications downloaded from PBDB and uploaded to Arctos. It is pretty easy to find the author text, so that could be uploaded along with the classification.
PBDB_test.txt
Actually, I don't think it would be too hard to arrange this download so that it would fit into the taxon bulkload tool....
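A rough sketch of that rearrangement, using one record from the PBDB download. The output column names (`scientific_name`, `author_text`, and the rank headers) are assumptions about what a bulkload file might want, not the documented Arctos loader format, and `to_bulkload_row` is a hypothetical helper.

```python
import csv
import io

# One header line and one record, as they come out of a PBDB taxa download.
PBDB_SAMPLE = (
    '"orig_no","taxon_no","record_type","flags","taxon_rank","taxon_name","taxon_attr","common_name","difference","accepted_no","accepted_rank","accepted_name","parent_no","ref_author","ref_pubyr","reference_no","is_extant","n_occs","phylum","class","order","family","genus","type_taxon"\n'
    '"15403","15403","txn","B","genus","Placenticeras","Meek 1876","","","15403","genus","Placenticeras","61786","Sepkoski","2002","6930","extinct","206","Mollusca","Cephalopoda","Ammonitida","Placenticeratidae","Placenticeras",""\n'
)

RANK_COLUMNS = ("phylum", "class", "order", "family", "genus")

def to_bulkload_row(pbdb_row):
    """Keep the name, the author text, and any non-empty ranked terms."""
    row = {
        "scientific_name": pbdb_row["taxon_name"],  # assumed column name
        "author_text": pbdb_row["taxon_attr"],
    }
    for rank in RANK_COLUMNS:
        if pbdb_row[rank]:
            row[rank] = pbdb_row[rank]
    return row

rows = [to_bulkload_row(r) for r in csv.DictReader(io.StringIO(PBDB_SAMPLE))]
```

The PBDB `taxon_attr` column carries the author string, so no parsing of the name is needed on this route.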
You can include the author text. One sample from the attached list is:
"orig_no","taxon_no","record_type","flags","taxon_rank","taxon_name","taxon_attr","common_name","difference","accepted_no","accepted_rank","accepted_name","parent_no","ref_author","ref_pubyr","reference_no","is_extant","n_occs","phylum","class","order","family","genus","type_taxon"
"15403","15403","txn","B","genus","Placenticeras","Meek 1876","","","15403","genus","Placenticeras","61786","Sepkoski","2002","6930","extinct","206","Mollusca","Cephalopoda","Ammonitida","Placenticeratidae","Placenticeras",""
Doesn't seem so crazy to me. I've always been a fan of scooping up anything that looks like taxonomy for Arctos - that makes things easier for the folks who'll need those names, helps keep them consistent with similar or related names, gets them involved in our updates (https://github.com/ArctosDB/arctos/issues/1761), gives us a chance to find the garbage (and tune our garbage-filters), lets us add relationships that help users find specimens, etc.
Why not just grab everything? Arctos has a name-loader too.
I say go for it, but I will let others weigh in...
I don't know what the motion is on which you'd like others to weigh in.
-Derek
--
+++++++++++++++++++++++++++++++++++
Derek S. Sikes, Curator of Insects
Professor of Entomology
University of Alaska Museum
1962 Yukon Drive
Fairbanks, AK 99775-6960
phone: 907-474-6278
FAX: 907-474-5469
University of Alaska Museum - search 400,276 digitized arthropod records
http://arctos.database.museum/uam_ento_all
http://www.uaf.edu/museum/collections/ento/
+++++++++++++++++++++++++++++++++++
Interested in Alaskan Entomology? Join the Alaska Entomological
Society and / or sign up for the email listserv "Alaska Entomological
Network" at
http://www.akentsoc.org/contact_us http://www.akentsoc.org/contact.php
Grabbing everything from Paleobiology Database to create taxonomy in Arctos.
sounds good to me
@dustymc what do we need to do to make this happen?
I like the idea of grabbing all the taxonomy from the PBDB and pulling it into Arctos. It is a very well vetted resource.
I did a test run and it looks very helpful for our collection.
Anyone can do this; Arctos provides tools. I can smash buttons if nobody else wants to, but this would be better done by someone who intends to use the data.
Are there instructions somewhere?
Are there instructions somewhere?
Not really, but there should be.
HMMMM, I know how to do all that, I need to figure out how to get it out of PBDB....without it being one HUGE file.
It looks like there is a “download” option on the PBDB website. Maybe you can tweak that.
Maybe I can help. Shall we discuss at Wednesday's meeting?
On the other hand, I see that the Paleobiology Database is already one of those that comes through Global Names. Are the names just not uploaded to Arctos yet? I did have to copy one into WoRMS (via Arctos) this morning.

That functionality only allows me to update them one-by-one AND the names need to get into Arctos first. It's nice, but VERY time consuming....
It looks like there is a “download” option on the PBDB website. Maybe you can tweak that.
Problem is you can't download "everything". You have to enter some parameters and I don't know where to start. Which higher taxon will get me the most bang for the download buck?
What is the majority of your collection? We could use the gastropods but there are 29,722 of them. Bivalves would be smaller with 19,979 if you need them.
Pretty sure @dperriguey could use both of those. I think @mbprondzinski and @Nicole-Ridgwell-NMMNHS will need vertebrate stuff - you guys have a higher taxon recommendation? I can make some guesses, but any help will be appreciated.
Arkansas State has a bivalve collection and MSB has a gastropod collection.
These are mostly fossil taxa...
Try sharks: Class Chondrichthyes (not sure if I'm following the right thread here...)
Check! @mbprondzinski will do
BTW @sharpphyl you will benefit from the names I load, but NOT the classifications as they will be in the Arctos taxonomy source. Just one more reason I don't like the splitting taxonomy....
Did you contact them? I'd certainly rather deal with packaging up a CSV instead of someone flailing around on Arctos trying to do something like this, they might feel the same way.
Otherwise, do they support wildcard or NOT - can you search for % (or _ or whatever they use) or "something" + "NOT something"?
@dustymc this is why I asked for your help! I actually think we can use them just like we do WoRMS if I am reading this https://paleobiodb.org/data1.2/ correctly.
Do you think that is true? If so, I'll try to figure out who we need to contact to get the ball rolling.
Are they a WoRMS-like resource - eg, could someone use them to manage a collection, and if so does someone want to use them to manage a collection? What's the goal here? API stuff aside, we could
I don't really see WoRMS-like functionality in their API docs, but maybe I'm just missing something.
I'm happy to assist however I can, but I'm not sure what we're trying to do.
I emailed them.
They use an identifier, but I don't think we can use it to maintain our classifications - we'll see what they say. No one wants to step out of the Arctos source to use PBDB that I am aware of, but maybe someone does. What I would REALLY like to do just to help all these paleo collections out is get all of the PBDB names and their related classifications that aren't in Arctos INTO Arctos so that we aren't having to manually create thousands of names and classifications as these new collections go in. I tried using the scientific name checker on the bivalves (19K names) and it timed out. I don't know what the largest list of names I can check is, but I know that even at 1,000 per run it will still take me forever to get the bivalves, gastropods and sharks into Arctos, because once the names are in I'll have to load the classifications at about 500 at a time.
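To get a feel for the number of runs involved, a long list can be cut into fixed-size batches with something like the sketch below. The batch sizes come from the comment above; the `batches` helper and the placeholder names are hypothetical.

```python
def batches(names, size):
    """Yield consecutive slices of at most `size` names."""
    for start in range(0, len(names), size):
        yield names[start:start + size]

# Placeholder list the same length as the PBDB bivalve download (19,979 names).
bivalves = ["name %d" % i for i in range(19979)]
name_runs = list(batches(bivalves, 1000))
classification_runs = list(batches(bivalves, 500))
```

For the bivalves alone that works out to 20 name-checker runs and 40 classification loads, before gastropods and sharks are added.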
timed out
Email it - and anything else that's not happy in the forms - to the box address and I'll run it.
I got it to download 378 in Conus but they don't give any classification hierarchy so you'd have to build that in a csv which might be as time consuming as adding just what you need through Arctos. I'll play more with it and we can discuss today at our meeting.
You can download the hierarchy.
Emailed the bivalve name list to Dusty yesterday.
Emailed
Let me know when you've done that - I don't get notifications from box, it just magically appears there...
UAM@ARCTOS> select status, count(*) from temp_PBDB_Bivalve_Names group by status;
STATUS                COUNT(*)
--------------------  --------
Invalid characters.       3367
valid                    11304
already got one           5308
Shall I create the 'valid'?
Check!, I'll work on creating the classification bulkload for those names.
@Jegelewicz meeting with Mark Uhen of PBDB on March 22.
Met with Mark. He says we can use PBDB in exactly the same way we use WoRMS.
Documentation for their API is at https://paleobiodb.org/data1.2/
The first step is to get all of their names and classifications into Arctos, along with the PBDB taxon_no
Bivalvia = https://paleobiodb.org/classic/checkTaxonInfo?taxon_no=16005
I suggest we get everything into Arctos and that for now we don't try to manage the automatic updates as we do with WoRMS. We can hold off until after we have made some decision about #1852
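As a starting point for pulling names by higher taxon, a download URL could be assembled roughly as follows. The `taxa/list.csv` endpoint and the `base_name` and `show` parameters are assumptions that should be checked against the API documentation linked above before use; the sketch only builds the URL and does not contact PBDB.

```python
from urllib.parse import urlencode

# Assumed endpoint under the documented data1.2 base URL.
PBDB_TAXA_LIST = "https://paleobiodb.org/data1.2/taxa/list.csv"

def taxa_list_url(base_name, show=("attr", "class")):
    """Build a URL requesting a taxon and everything classified beneath it."""
    query = urlencode(
        {"base_name": base_name, "show": ",".join(show)},
        safe=",",  # keep the comma-separated show list readable
    )
    return PBDB_TAXA_LIST + "?" + query

bivalve_url = taxa_list_url("Bivalvia")
```

If the parameters behave as assumed, `attr` would add the author column and `class` the phylum-through-genus columns, one higher taxon per request.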
@dustymc is there really no way for you to handle this? Do we have to go taxon-by-taxon and grab this stuff?
Yea, sure, I'll smash buttons.
I got the PBDB data and was in the process of unwinding it into the classification loader, but that's losing a lot of data. Random example:
species = Microtus cautus
genus = Microtus
tribe = Arvicolini
family = Arvicolinae
family = Cricetidae
superfamily = Muroidea
infraorder = Myodonta
phylorder = Rodentia
phylorder = Glires
unranked clade = Euarchontoglires
unranked clade = Placentalia
subclass = Eutheria
infraclass = Tribosphenida
class = Mammalia
unranked clade = Mammaliaformes
unranked clade = Mammaliamorpha
unranked clade = Probainognathia
infraorder = Eucynodontia
unranked clade = Epicynodontia
family = Cynodontia
superorder = Therapsida
subclass = Synapsida
unranked clade = Amniota
suborder = Cotylosauria
subclass = Batrachosauria
unranked clade = Anthracosauria
unranked clade = Reptiliomorpha
subphylum = Tetrapoda
unranked clade = Tetrapodomorpha
subclass = Dipnotetrapodomorpha
unranked clade = Sarcopterygii
class = Osteichthyes
superclass = Gnathostomata
subphylum = Vertebrata
phylum = Chordata
unranked clade = Deuterostomia
unranked clade = Nephrozoa
unranked clade = Triploblastica
unranked clade = Animalia
unranked clade = Opisthokonta
kingdom = Eukaryota
unranked clade = Life
That'll fit in the core taxonomy tables no problem, but it's not very compatible with existing data (and I suspect that would preclude much possibility of review before loading).
If we find a way to allow unranked terms in the "Arctos" classification, it's still going to be extremely inconsistent with existing data - people looking for any of that stuff that's not in eg http://arctos.database.museum/name/Microtus are going to find stuff from PBDB and not existing stuff, kingdom=Animalia isn't going to find anything using PBDB data, etc.
I'm not sure why we disallowed unranked terms in local classification (someone asked for it, probably) but they would also be lost in the classification bulkloader (which uses ranks as headers, because people like spreadsheets and struggle with hierarchies).
The "best" approach might be to add all of that stuff to our existing data, but that would be a tremendous amount of work (and I'm not sure I'm ready to admit that I'm a fish quite yet).
Note that there are multiple terms with the same rank - that's in almost everything, and would do weird stuff to FLAT. (More evidence that we shouldn't try to smash complicated data into simple forms?)
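The loss is easy to see in a sketch that squeezes part of the lineage above into one column per rank. The "first term wins" rule and the `flatten` helper are assumptions for illustration, not how the classification bulkloader is actually written.

```python
# A subset of the Microtus cautus lineage from PBDB, most specific term first.
lineage = [
    ("species", "Microtus cautus"),
    ("genus", "Microtus"),
    ("tribe", "Arvicolini"),
    ("family", "Arvicolinae"),
    ("family", "Cricetidae"),
    ("superfamily", "Muroidea"),
    ("unranked clade", "Euarchontoglires"),
    ("class", "Mammalia"),
    ("class", "Osteichthyes"),
    ("phylum", "Chordata"),
    ("kingdom", "Eukaryota"),
]

def flatten(terms):
    """Build a rank-headed row; report every term that has no column to go in."""
    row, dropped = {}, []
    for rank, term in terms:
        if rank == "unranked clade" or rank in row:
            dropped.append((rank, term))  # no header, or header already used
        else:
            row[rank] = term  # assumption: first term seen for a rank wins
    return row, dropped

row, dropped = flatten(lineage)
```

Even in this short subset three terms fall out, including Cricetidae, which is the family most existing records would expect to find.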
I made that name and pulled from GN - http://arctos.database.museum/name/Microtus%20cautus#ThePaleobiologyDatabase. Looks like the PBDB data will all magic in if I just create the names.
I suppose my inclination is to just ignore everything that doesn't fit nicely in a local classification, and create Arctos classifications when there isn't one for the names in PBDB. I'm not very enthusiastic about that, and those of you working on making consistent data are probably less enthusiastic about cleaning up ~300K new messes.
Where should I go with this?
I'm gonna let this percolate a bit to see if something pops into my brain.
@dperriguey @Nicole-Ridgwell-NMMNHS @mbprondzinski @sharpphyl @KatherineLAnderson may all have better ideas.
/remind @Jegelewicz to check back on this by 04/05/19
I have a giant mess of data hanging around - please don't let this cook for too long.
Actually nevermind, let it cook as long as you need to, I'll just delete my mess and pull it again if necessary.
I added 290861 'valid' names so GN can start chewing on them, and they should be available for cataloging now.
FYI - the counts aren't perfect because there are some duplicates but fairly close.
UAM@ARCTOS> select name_status, count(*) from temp_pdbd group by name_status;
NAME_STATUS                     COUNT(*)
------------------------------  --------
Double spaces detected                 2
valid                             293117
already_got_one                    65817
Invalid characters.                24726
"sp" is not a valid name-part          1
5 rows selected.
/remind @Jegelewicz discuss with paleo people 04/16/2019
I made that name and pulled from GN - http://arctos.database.museum/name/Microtus%20cautus#ThePaleobiologyDatabase. Looks like the PBDB data will all magic in if I just create the names.
I suppose my inclination is to just ignore everything that doesn't fit nicely in a local classification, and create Arctos classifications when there isn't one for the names in PBDB. I'm not very enthusiastic about that, and those of you working on making consistent data are probably less enthusiastic about cleaning up ~300K new messes.
OK, so I have experienced this first hand as I added a bunch of classifications for OWU this week. The PBDB stuff isn't compatible with the usual Kingdom, Phylum... order of the taxonomic hierarchy and you are correct that it will make inconsistent classifications and probably hide specimens. Having the names in Arctos is a start - one less thing that needs to be done, but I don't know about those classifications. I'll put this back on the Taxonomy Committee agenda for discussion.
Also, @dperriguey @Nicole-Ridgwell-NMMNHS @KatherineLAnderson @mbprondzinski may have comments or suggestions.
What about the idea of "add all of that stuff to our existing data"? This could be done for vertebrates, anyway, at least for some of the taxonomic ranks. Not a solution, but maybe make things a bit more discoverable across time scales and collection types.
@campmlc I think it would be useful, but compare eg
species = Microtus cautus
genus = Microtus
tribe = Arvicolini
family = Arvicolinae
family = Cricetidae
superfamily = Muroidea
infraorder = Myodonta
phylorder = Rodentia
phylorder = Glires
unranked clade = Euarchontoglires
unranked clade = Placentalia
subclass = Eutheria
infraclass = Tribosphenida
class = Mammalia
unranked clade = Mammaliaformes
unranked clade = Mammaliamorpha
unranked clade = Probainognathia
infraorder = Eucynodontia
unranked clade = Epicynodontia
family = Cynodontia
superorder = Therapsida
subclass = Synapsida
unranked clade = Amniota
suborder = Cotylosauria
subclass = Batrachosauria
unranked clade = Anthracosauria
unranked clade = Reptiliomorpha
subphylum = Tetrapoda
unranked clade = Tetrapodomorpha
subclass = Dipnotetrapodomorpha
unranked clade = Sarcopterygii
class = Osteichthyes
superclass = Gnathostomata
subphylum = Vertebrata
phylum = Chordata
unranked clade = Deuterostomia
unranked clade = Nephrozoa
unranked clade = Triploblastica
unranked clade = Animalia
unranked clade = Opisthokonta
kingdom = Eukaryota
unranked clade = Life
and https://arctos.database.museum/name/Microtus#Arctos
I think a person would have to go through the existing ~3 million records, make sure they're not adding Arvicolinae to Microtus-the-treefern, etc.
And there's the whole "not sure I'm ready to admit that I'm a fish" thing - do we really want searches for "Osteichthyes" turning up mice? Maybe we should, but it would still overwhelm some users, make it difficult to discover what most think of as fish, etc.
We are all fish...
I say we close this - there doesn't seem to be an easy way to bring PBDB classifications into Arctos.
Can we revisit this? Now that we can have multiple sources can we add PBDB as a new, externally maintained, automatically updated source? I think this would really help the taxonomy for our collection.
revisit
Sure - there's no longer any reason to try to make it consistent with anything else.
automatically updated
So is this just a request to allow Sources from GlobalNames to be preferred by collections?
So is this just a request to allow Sources from GlobalNames to be preferred by collections?
Yes, I suppose so.
So is this just a request to allow Sources from GlobalNames to be preferred by collections?
This would be a good demo of that.