Arctos: Locality Attributes (II)

Created on 13 Feb 2020  ·  143 Comments  ·  Source: ArctosDB/arctos

This is meant to serve as a distraction-free implementation issue of Locality Attributes.

Very general (from https://github.com/ArctosDB/arctos/issues/2274): does anyone see a reason NOT to build locality_attributes much like all other Attributes?

First-pass would be home for the not-geology stuff we've been stuffing in geology_attributes.

It would save a ton of work if we could generalize geology to locality_attributes in the same pass. I think the biggest potential "problem" with that is summarized in https://github.com/ArctosDB/arctos/issues/2479: the hierarchical code tables seem to cause more problems than they solve, can we drop that idea entirely, make simple assertions ("formation=Prince Creek"), and (eventually) use something like the webservices behind https://macrostrat.org/sift/#/strat_name_concept/11635 to pull the metadata?

This would make https://github.com/ArctosDB/arctos/issues/2243 irrelevant.

Eventually-maybe: consider moving some non-spatial stuff out of locality to attributes. https://github.com/ArctosDB/arctos/issues/2274#issuecomment-534721038. That does NOT need discussed here, but shouldn't be forgotten either.

Semi-related: https://github.com/ArctosDB/arctos/issues/2296

Function-CodeTables Function-LocalitEvenGeoreferencing Priority-Critical


It would save a ton of work if we could generalize geology to locality_attributes in the same pass. I think the biggest potential "problem" with that is summarized in #2479: the hierarchical code tables seem to cause more problems than they solve, can we drop that idea entirely, make simple assertions ("formation=Prince Creek"), and (eventually) use something like the webservices behind https://macrostrat.org/sift/#/strat_name_concept/11635 to pull the metadata?

Here is a brief response.

Chronostratigraphy - these are actually dated and in the form of a hierarchy. It would be really nice if these terms included their date ranges and we made use of that in Arctos. If we could do that, we wouldn't need a hierarchy, we could let the dates do the "traverse hierarchy" work for us.

Lithostratigraphy - I defer to your opinion here. Because these strata are not in a single hierarchy, we should include any lithostrata necessary to facilitate search. Before we dismantle the current "hierarchy", we should ensure that the terms are applied to all localities using any term below other terms in a current hierarchy. That being said, application of MacroStrat webservices would be awesome!

Biostratigraphy - I'm not sure how this hierarchy should be handled. Needs input from @Nicole-Ridgwell-NMMNHS

Eventually-maybe: consider moving some non-spatial stuff out of locality to attributes. #2274 (comment). That does NOT need discussed here, but shouldn't be forgotten either.

So Higher Geography, Specific Locality, Remarks, Elevation, Depth? Is that what you are talking about? Just want to make sure I understand...

Finally - suggest that UTM should be moved from a type of coordinate (that doesn't do anything) in Specimen Event to locality attribute.

Chronostratigraphy - these are actually dated

I think the fundamental problem with the hierarchy is that we're mixing data (formation etc.) and metadata derived from that data (Era etc.). It turns out that's not as clear as the model was originally designed for - rock units span (or are otherwise ambiguously linked to) time units, etc. I'm hesitant to reintroduce any of that in any form.

That said I don't REALLY know how this works - if "Some Time Thing" is statically defined as "T1-T2" or something then there's no confounding and time could be embedded in the terminology.

If not, I'd prefer to see that done in two ways.

  • Some Rock Thing (agent, date, method)
  • Some Time Thing (agent, date, reference-or-whatever)
  • Some Numeric Time Thing (agent, date, reference-or-whatever)

and/or

  • Some Rock Thing

    • various explicitly non-asserted data gathered up from webservices or whatever

facilitate search

That's where Macrostrat comes in. Rather than (sporadically, when someone correctly maintains the authority, and perhaps while failing to adequately distinguish data and metadata) saying "Bla Formation is a child of Blah Group" (which seems to be how the "taxonomy" is designed, but not necessarily how the "identifications" are used) we'd just say "Bla Formation" and pull group (along with chronological units, time, etc.) from some webservice, which can accommodate whatever weird reality exists.

Higher Geography

That'll stay where it is.

Specific Locality

Interesting idea! Maybe, if you can fend off the Curators with pitchforks....

Remarks

Definitely.

Elevation

That's spatial (assuming we're all more or less referencing MSL and not the local surface). We might consider moving min/max/units to Attributes and converting locality to meters though.

Depth

I believe so - that's a reference to a local surface, not spatial data.

UTM

It's in collecting_event with all other verbatim data, including coordinates.

that doesn't do anything

Hopefully PG will fix that.

I don't see any reason we can't continue to use a hierarchy structure for chronostratigraphy. I'd rather use the hierarchy structure than the dates because the dates can change slightly (new, better dating method for a GSSP and the date changes by .1 million years). The hierarchy structure of the chronostratigraphic units is more stable than the dates.

Pulling from the Macrostrat API for 'hierarchy' information does seem a better solution than maintaining our own hierarchy.

Agree with @Nicole-Ridgwell-NMMNHS for chronostratigraphy a hierarchy makes sense and that if someone else can maintain the hierarchy, even better. For the other geology stuff I think treating it like all other attributes makes sense.

ref: https://github.com/ArctosDB/PG-migration-testing/issues/85#issue-602033792

make sure type/value search suggestions are linked when controlled data is involved.

I'd like to make a start at pruning the geology attributes table.

Step one - all Site Found Date don't need to be in the table as they are. Site Found Date should be a locality attribute with the format YYYY_MM_DD

Step two - all Site Found By don't need to be in the code table as they are. Site Found By values should be a locality attribute tied to Agents.

@dustymc Wanna start with these?

Sure, but I need the core structure first.

Core structure for locality attributes in general or for these two in particular? Any reason the structure should be different from the way we currently structure geology attributes?

does anyone see a reason NOT to build locality_attributes much like all other Attributes?

I don't.

The only weird thing going forward should be chronostratigraphy. All the other stuff can lose the hierarchy structure BUT as I said above, before we dismantle the current lithostratigraphy hierarchy, we should ensure that the terms are applied to all localities using any term below other terms in a current hierarchy.

I could help with this, but since I cannot see the current table, I can't go looking for stuff that needs a term added. Heck, I can't even give you an example of what needs to be done!

This needs priority - it is going to start causing problems when a single term needs to be added - never mind that those of us managing code tables can't even view the code table....

Those of you using these attributes, please comment. @Nicole-Ridgwell-NMMNHS @sharpphyl @dperriguey also @ArctosDB/arctos-code-table-administrators please respond.

In general.

Sameish structure, better names, much more dynamic supporting structure - there will be a datatype-defining code-table-code-table of some sort.

Yes I think the hierarchy can remain for now (or indefinitely) - we should talk before we invest too much in display and etc (maybe we'll dump it and pull from webservices, or exploit PG's capacity to deal with complex data objects, or something) but I don't think it's a big-picture stopper at this time; we can find something that'll work, even if it's not ideal.

Slightly bigger picture, the only important things from the data is that they can be munged into one of

  • free-text
  • number+unit
  • categorical

We've talked about adding is_ISO8601() or FKEY-->someTable as data at various times; that'd be cool, but it's also unexplored territory and would take some (perhaps significant) time to design - assuming there is a workable path - and I don't think we have the time to deal with that now. I'd suggest we put it off to a 2.0 implementation, but there are certainly efficiencies in tackling it all at once. There are probably-tolerable workarounds for all of that - eg, free-text with the DATE in determined_date or agent_id in determined_by.
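A minimal sketch of how the three datatypes could be checked, assuming a hypothetical per-type configuration. `ATTRIBUTE_TYPES`, `LocalityAttribute`, `validate_attribute` and the "depth below surface" type are invented for illustration and are not Arctos structures; only `access` with the value `private` comes from this thread.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-type configuration; a real implementation would read this
# from the datatype-defining code table. Type names and values are illustrative.
ATTRIBUTE_TYPES = {
    "Site Identifier": {"datatype": "free-text"},
    "depth below surface": {"datatype": "number+unit", "units": {"m", "ft"}},
    "access": {"datatype": "categorical", "values": {"private"}},
}


@dataclass
class LocalityAttribute:
    attribute_type: str
    value: str
    units: Optional[str] = None
    determined_by: Optional[str] = None    # agent; workaround home for "found by"
    determined_date: Optional[str] = None  # ISO 8601 string; workaround home for dates


def validate_attribute(attr: LocalityAttribute) -> list:
    """Return a list of problems; an empty list means the attribute is acceptable."""
    config = ATTRIBUTE_TYPES.get(attr.attribute_type)
    if config is None:
        return [f"unknown attribute type: {attr.attribute_type}"]
    problems = []
    datatype = config["datatype"]
    if datatype == "free-text":
        if not attr.value.strip():
            problems.append("free-text value is empty")
    elif datatype == "number+unit":
        try:
            float(attr.value)
        except ValueError:
            problems.append(f"value is not numeric: {attr.value}")
        if attr.units not in config["units"]:
            problems.append(f"units not in units code table: {attr.units}")
    elif datatype == "categorical":
        if attr.value not in config["values"]:
            problems.append(f"value not in value code table: {attr.value}")
    return problems
```

With this shape, the date and agent workarounds need no new datatype: they ride along in `determined_date` and `determined_by` on an otherwise ordinary free-text or categorical attribute.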

If that all sounds reasonable then I think this is just priority, which is in https://github.com/ArctosDB/arctos/issues/2706.

This is absolutely going to mean a month-or-so where I can't easily patch production - are we stable enough for that yet?

BUT as I said above, before we dismantle the current lithostratigraphy hierarchy, we should ensure that the terms are applied to all localities using any term below other terms in a current hierarchy.

Is there a way we can automate or batch run this?
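A rough sketch of what a batch pass could look like, assuming the current hierarchy can be exported as child-to-parent pairs. `PARENT_OF`, `ancestors` and `missing_assertions` are made-up names, the terms are placeholders, and the output is only a report for review rather than anything written back to localities.

```python
# Hypothetical child -> parent links, as they might be exported from the
# current lithostratigraphy hierarchy. All term names are placeholders.
PARENT_OF = {
    "Some Member": "Bla Formation",
    "Bla Formation": "Blah Group",
}


def ancestors(term, parent_of):
    """Return every term above `term`, nearest first."""
    found = []
    seen = {term}
    current = parent_of.get(term)
    while current is not None and current not in seen:  # guard against cycles
        found.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return found


def missing_assertions(locality_terms, parent_of):
    """Map locality id -> ancestor terms implied by the hierarchy but not asserted."""
    report = {}
    for locality_id, terms in locality_terms.items():
        implied = {a for t in terms for a in ancestors(t, parent_of)}
        missing = sorted(implied - set(terms))
        if missing:
            report[locality_id] = missing
    return report


localities = {
    101: {"Some Member"},
    102: {"Bla Formation", "Blah Group"},
}
print(missing_assertions(localities, PARENT_OF))
# {101: ['Bla Formation', 'Blah Group']}
```

Keeping this as a report leaves the decision to add each implied term with whoever owns the locality.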

Going forward without the hierarchy structure, it is intuitive to add both formation and member, but people may not add group name. I say this because our curators routinely do not give me group information even though there is a field for it on our forms. There should perhaps be a warning on lithostratigraphy search that a hierarchy is not enforced.

Yes I think the hierarchy can remain for now (or indefinitely) - we should talk before we invest too much in display and etc (maybe we'll dump it and pull from webservices, or exploit PG's capacity to deal with complex data objects, or something)

Agree. More conversation is definitely needed.

This needs priority - it is going to start causing problems when a single term needs to be added

Yes, I need to enter new specimen records and cannot because I can't add new localities with attributes not in the table.

This is absolutely going to mean a month-or-so where I can't easily patch production - are we stable enough for that yet?

If not, can we at least get the current table fixed so we can add new attributes?

automate

Probably not, but that's mostly a social thing - UserA strongly disagrees with your hypothesis, finds it in their data anyway, then yells about Arctos...

batch run

Sure - as long as I can blame you, not Arctos!

get the current table fixed

It's not the table, it's - something, probably server-guts, that I haven't been very successful in tuning nor finding someone who might help.

@Nicole-Ridgwell-NMMNHS Dusty added your terms this morning, let me know if they still aren't working.

Ah, I missed that, thank you!

Also - with regard to date found and site found by.

Suggest we merge these into "Site Discovery" (or better term if available) where
value = "site found"
determined date = found date
determiner = Agent who found the site.

The main issue with this presentation is discoverability - finding all sites discovered by Agent X isn't possible at this point because you can only search geology attributes by value - can we facilitate that somehow? At least in Locality search - probably not on the main search page.
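A minimal sketch of the determiner search described above, using an in-memory SQLite table as a stand-in. The column names, agent names and dates are assumptions for illustration; only the attribute name "Site Discovery", the value "site found" and the formation example come from this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE locality_attributes (
        locality_id     INTEGER NOT NULL,
        attribute_type  TEXT NOT NULL,
        attribute_value TEXT NOT NULL,
        determiner      TEXT,   -- agent name here; a real table would hold an agent key
        determined_date TEXT    -- ISO 8601 date string
    )
    """
)
conn.executemany(
    "INSERT INTO locality_attributes VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Site Discovery", "site found", "Agent X", "1998-06-14"),
        (2, "Site Discovery", "site found", "Agent Y", "2003-07-02"),
        (3, "Site Discovery", "site found", "Agent X", "2011-05-30"),
        (3, "formation", "Prince Creek", "Agent Z", "2011-06-01"),
    ],
)


def sites_found_by(connection, agent):
    """Return (locality_id, determined_date) for sites the agent found, oldest first."""
    rows = connection.execute(
        """
        SELECT locality_id, determined_date
        FROM locality_attributes
        WHERE attribute_type = 'Site Discovery'
          AND attribute_value = 'site found'
          AND determiner = ?
        ORDER BY determined_date
        """,
        (agent,),
    )
    return rows.fetchall()
```

Here `sites_found_by(conn, "Agent X")` returns localities 1 and 3, which is the "all sites discovered by Agent X" search that value-only searching cannot do.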

merge these into "Site Discovery"

Yes!

can we facilitate

This issue will do that.

Suggest we merge these into "Site Discovery" (or better term if available) where
value = "site found"
determined date = found date
determiner = Agent who found the site.

I agree that this would be ideal if we can solve the discoverability issue.

That would also allow value = site re-found which would be useful for legacy localities rediscovered through old photos, journals, etc. Not talking about digital georeferencing here which aids the process, but like actual field re-discovery.

Sounds like a plan?

We need code table values for "Site Discovery"

site discovery - Information about who originally found a collecting site and when it was found.
site rediscovery - Information about who found a previously discovered collecting site and when the rediscovery occurred.

please comment on terms/definitions

site discovery
site rediscovery

Seems overly complicated, both hard to manage and hard to access. One happened before the other(s) - same as multiple sex determinations, identifications, etc. If that's not too crazy, maybe something less-discovery-like - "site description" perhaps? (I assume a fair number of these were "discovered" by some Native kid 500 years ago and you're just recording the entrance into the *ologist community/literature, no?)

Of course none of that's functional, you're completely welcome to do whatever you want, but looking at this from a larger or less-specialized perspective might be useful.

Yeah, the "discovery" thing bugs me too - but paleontologists use the "site found by" and "site found date" terms...

what about

site documentation - Information about who documented a collecting site and when the documentation occurred.

I am conflicted about the "re-discovery" thing. On the one hand, the succession of dates in the date determined field would tell you the order, on the other, a person searching may not easily interpret that, which could lead to an inaccurate reference in a publication.

I'm not at all tied to having "re-discovery" as a value, was just thinking out loud. It would make just as much sense to add another site discovery/found value if that is relevant.

AWG Issues Meeting: Go now.

@Jegelewicz @Nicole-Ridgwell-NMMNHS current attributes below. They all need categorized

  • free text
  • number + units (these need a units code table identified/created)
  • categorical (these need a value code table identified/created)

Tentatively suggest we move bed, formation etc. into individual one-value code tables; that seems more reflective of the data.

Time can stay in geology_attribute_hierarchy for now, I think.

Can TRS be merged into some sort of coherent structure? Given the structure, the current won't survive additional opinions, is difficult to access and read, takes a lot of room, etc.

arctosprod@arctos>> select geology_attribute from geology_attributes group by geology_attribute order by geology_attribute;
      geology_attribute      
-----------------------------
 access
 bed
 biozone
 Eon/Eonothem
 Erathem/Era
 formation
 group
 Land Mammal Age
 Land Mammal Age Subinterval
 Land Vertebrate Faunachron
 member
 Series/Epoch
 Site Collector Number
 Site Field Number
 Site Found By
 Site Found Date
 Site Identifier
 Site Land Status
 Stage/Age
 Substage/Subage
 suite
 System/Period
 TRS aliquot
 TRS range
 TRS section
 TRS township
(26 rows)

access
bed
formation
group
member
Site Land Status
suite
Land Mammal Age
Land Mammal Age Subinterval
Land Vertebrate Faunachron
biozone

All of the above will be categorical, with the acceptable values equal to whatever currently is categorized as those terms. (I envision Locality Attribute Types equal to the terms above where anything in the current table with (member) following it will be an accepted value for the locality attribute "member" and so on.)
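A sketch of how the existing terms could be sorted into per-type value lists, assuming the legacy values really do look like "Some Name (member)". That display form, the pattern, and the function names are assumptions; anything that does not fit is set aside for a human rather than guessed at.

```python
import re
from collections import defaultdict

# Assumed legacy display form: "<value> (<attribute type>)".
LEGACY_PATTERN = re.compile(r"^(?P<value>.+?)\s*\((?P<type>[^()]+)\)\s*$")


def split_legacy_term(text):
    """Return (attribute_type, value), or None if the text does not fit the pattern."""
    match = LEGACY_PATTERN.match(text)
    if match is None:
        return None
    return match.group("type").strip(), match.group("value").strip()


def build_code_tables(legacy_terms):
    """Group legacy terms into one set of accepted values per attribute type."""
    tables = defaultdict(set)
    unparsed = []
    for text in legacy_terms:
        parsed = split_legacy_term(text)
        if parsed is None:
            unparsed.append(text)
        else:
            attribute_type, value = parsed
            tables[attribute_type].add(value)
    return dict(tables), unparsed
```

For example, `split_legacy_term("Prince Creek (formation)")` gives `("formation", "Prince Creek")`, so "Prince Creek" becomes an accepted value for the locality attribute "formation".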

Site Collector Number
Site Field Number
Site Identifier

All of the above will be free text

Site Found By
Site Found Date

Will become the determiner and determined date for a new attribute = Site Documentation, categorical. The only value at this point would be "site documented" - determiner is the Agent who documented a collecting site and the determined date is the date when the documentation occurred.

TRS aliquot
TRS range
TRS section
TRS township

TRS Structure would ideally allow for selecting four categorical values from the TRS aliquot, TRS range, TRS section, and TRS township code tables with a single determiner and date. Possible? Some other way of getting the information to be consistent? @Nicole-Ridgwell-NMMNHS

Eon/Eonothem
Erathem/Era
Series/Epoch
Stage/Age
Substage/Subage
System/Period

All of the above will be categorical, with the acceptable values equal to whatever currently is categorized as those terms BUT these are the ones we want to maintain the hierarchy for. We can continue to use the method we have now for this, or we could assign begin and end dates to each term - I don't really care how we do it, but searching across these is important.

I'm beat - sorry if this is clear as mud....

Will become the determiner and determined date for a new attribute = Site Documentation, categorical. The only value at this point would be "site documented"

Can we have the value be "site found"?

TRS Structure would ideally allow for selecting four categorical values from the TRS aliquot, TRS range, TRS section, and TRS township code tables with a single determiner and date.

That would be ideal.

these are the ones we want to maintain the hierarchy for.

Yes

Also, can we separate into separate attributes the standardized international chronostratigraphy and the regional/old chronostratigraphy? They are already separated in the current geology attributes.

Can we have the value be "site found"?

This should be moved to a new Issue - it doesn't affect the structure.

What sort of data do you want? "Site Documentation" would probably encourage more rare/thorough documentation, "site found" seems like something I should use as a "guestbook." ("It's still there, just like it was 32 minutes ago!")

TRS

I thought I'd recommended JSON objects and questioned usability up there somewhere, not sure if that was some other issue or I didn't click save or what. In any case that may be important to structure - there's no link between attributes in this model, so if you enter two TRS determinations (as 8 attributes) there'd be no obvious way to put them back together.
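A small sketch of the JSON-object idea, assuming one attribute value holds all four TRS parts so each determination stays together. The key names, function names and sample values are illustrative only.

```python
import json

TRS_KEYS = ("township", "range", "section", "aliquot")


def trs_to_value(township, range_, section, aliquot):
    """Serialize one TRS determination as a single JSON attribute value."""
    parts = dict(zip(TRS_KEYS, (township, range_, section, aliquot)))
    return json.dumps(parts, sort_keys=True)


def value_to_trs(value):
    """Parse a stored value back into its four components, rejecting other shapes."""
    parts = json.loads(value)
    if not isinstance(parts, dict) or set(parts) != set(TRS_KEYS):
        raise ValueError(f"not a TRS object: {value}")
    return parts


# Two determinations for one locality: two attribute rows instead of eight.
first = trs_to_value("T12N", "R4E", "23", "NE")
second = trs_to_value("T13N", "R4E", "2", "SW")
```

Each determination then carries its own determiner and date, and there is nothing to reassemble. The usability question stands: a JSON string is harder to type and to control than four dropdowns.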

separate attributes the standardized international chronostratigraphy and the regional/old chronostratigraphy

I don't think that would affect the model, but I thought we'd resolved to pull all of that from macrostrat?

Can we simplify the code tables and move any hierarchies/relationships to the definition?

That would make the data temporarily slightly less searchable - a search for "SomeParentTerm" would find only assertions of "SomeParentTerm" and would not find assertions of "SomeChildTerm." The pull and cache from macrostrat, which would revive the ability to search indirect assertions (without forcing us to manage it, or users to guess at our idiosyncrasies), could then be prioritized to (more than, I think - see below) compensate.

So instead of the mess of a weird table (which requires a fair bit of weird code in a bunch of places, and looks different to different users in all of those places, and currently melts the UI), we'd have "normal" code tables like....

ctgeology_group
SomeParentTerm--->def="bla bla whatever..."

ctformation
SomeChildTerm-->def="parent is SomeParentTerm Group, bla bla whatever...."

ctbed
SomeChildOfChildTerm-->def="parent is SomeChildTerm Formation-->SomeParentTerm Group, bla bla whatever...."

The "pull from macrostrat" could also be opened up to UI. Instead of structuring that as....

SomeLocalTerm-----SomeContextDataPulledFromMacrostrat

... it could be

term--moredata--source

which would allow...

SomeLocalTerm-----SomeContextData-----Macrostrat
SomeLocalTerm-----MoreContextData-----SomeUser
SomeLocalTerm-----EvenMoreContextData-----AnotherUser

and then an "include context" search for SomeContextData, MoreContextData, or EvenMoreContextData would all find SomeLocalTerm, and any or all of the context could display wherever SomeLocalTerm is displayed.

which I believe would effectively preserve all current functionality, along with allowing the integration of webservice data.
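A minimal sketch of the term--moredata--source idea with an "include context" search, using in-memory SQLite as a stand-in. The table names `locality_attributes` (reduced to two columns here) and `geology_context_cache`, and the function name, are assumptions; the rows are the placeholder values from above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE locality_attributes (
        locality_id INTEGER NOT NULL,
        term        TEXT NOT NULL
    );
    -- term--moredata--source, as laid out above
    CREATE TABLE geology_context_cache (
        term     TEXT NOT NULL,
        moredata TEXT NOT NULL,
        source   TEXT NOT NULL
    );
    INSERT INTO locality_attributes VALUES (1, 'SomeLocalTerm'), (2, 'OtherLocalTerm');
    INSERT INTO geology_context_cache VALUES
        ('SomeLocalTerm', 'SomeContextData', 'Macrostrat'),
        ('SomeLocalTerm', 'MoreContextData', 'SomeUser'),
        ('SomeLocalTerm', 'EvenMoreContextData', 'AnotherUser');
    """
)


def search_localities(connection, text, include_context=False):
    """Find localities by asserted term, optionally also by cached context."""
    sql = "SELECT DISTINCT a.locality_id FROM locality_attributes a WHERE a.term = :q"
    if include_context:
        sql += (
            " OR a.term IN"
            " (SELECT c.term FROM geology_context_cache c WHERE c.moredata = :q)"
        )
    sql += " ORDER BY a.locality_id"
    return [row[0] for row in connection.execute(sql, {"q": text})]
```

A plain search for "SomeContextData" finds nothing because nobody asserted it, while the same search with `include_context=True` finds locality 1 through the cache, whichever source supplied the context.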

Please comment here ASAP if you have some reason to believe that this won't work, otherwise I'm going to proceed under the idea that I'll be talking to "normal" code tables.

@dustymc We don't want to use a hierarchy for lithostratigraphy (suite, group, formation, member, bed) because those things are not in the same hierarchy everywhere in the world.

We only want to apply the hierarchy to chronostratigraphy terms (Eon/Erathem, and such).

@Jegelewicz sorry about the cruddy examples, but I think the idea remains.

First, simple code tables, one for every categorical THING (eon, bed, whatever), same as always:

ctsomething
value=a_single_value; definition=define this, maybe mention some value in ctsomething_else

ctsomething_else
value=another_single_value; definition=define this, maybe mention some value in ctsomething

Second, something to link them

new_table_geology_cache_stuff
ctsomething.value[a_single_value] {has some searchable relationship with} ctsomething_else.value[another_single_value]

So it would work like this?
ctsystem_period
value=a_single_value; definition=define this, maybe mention some value in ctsomething_else
Neogene=https://www.wikidata.org/wiki/Q103924

ctera_erathem
Cenozoic=https://www.wikidata.org/wiki/Q102416

Second, something to link them

new_table_geology_cache_stuff
Neogene IS PART OF Cenozoic (See relationship in Wikidata)

Note that Wikidata has these relationships built in
Part of
follows
followed by
has part

and also includes begin and end dates for the terms

We only want to apply the hierarchy to chronostratigraphy terms (Eon/Erathem, and such).

And for clarity, I thought you DID NOT want to assert those at all - they're implicit in other assertions (eg, lithostratigraphy), not something that's determined from specimens. Has that changed or did I misunderstand?

UGH I feel like we aren't communicating at all. Maybe we need to set up a call for all of this.

So it would work like this?

Yes.

Wikidata has these relationships built in

I think it's just RDF.

set up a call

Sure, say when.

@Nicole-Ridgwell-NMMNHS can we meet with Dusty at 10:30 MDT tomorrow (right after our mineral call)?

10:30 MDT tomorrow is fine with me.

Looking back over some of the older threads, I think we had agreed it would be nice if we could pull things from Macrostrat/have ranges, but never figured out the myriad complications.

And for clarity, I thought you DID NOT want to assert those at all - they're implicit in other assertions

This was mentioned when we were discussing trying to code chronostratigraphic ranges into the table and tying lithostratigraphy to chronostratigraphy. We're not ready to do something like that yet because there are too many complications. See https://github.com/ArctosDB/arctos/issues/2245

Some Regional chronostratigraphy may not be in Macrostrat.

ranges

That's easy under locality attributes - two attributes ("minimum whatever" and "maximum whatever" or etc.) using the same value code table, and it's an interface problem from there. (And of course some future user will still have to guess at what you might have thought was in the middle - I am not and probably won't ever be advocating this, but it's still readily supported by the model.)

Some Regional chronostratigraphy may not be in Macrostrat.

One option would be to manually add that (and whatever else you want that's neither directly measured from the specimen nor available via some webservice) to the cache model I mentioned above.

If this is something you're measuring then an attribute is appropriate, and you can still use the cache to manually add context.

I think my big-picture question is where those data come from; are they derived from something that might be in your collection (in which case they should be asserted as attributes), or inferred from something that's derived from something in the collection (in which case perhaps a less-constrained model is more appropriate)? That's not a critical distinction I suppose, but it seems like we've been mixing concepts here and this is a good time to straighten that out.

I am not and probably won't ever be advocating this

I agree, but from a geology perspective, because lithostratigraphy is not by definition constrained by dates.

Sounds like I'm not fully understanding the cache model, I'll reread your above posts but maybe talking tomorrow will clarify.

My concern with ranges is what's in the middle, whatever the data/values. You say "A-D" and I have to do a bunch of research to know if C existed when you said that, and then I'm still guessing if you "believed in" C at the time (or were looking at a reference that did), if we're using the same sort, etc., etc., etc. It's ambiguous, any way you slice it, if the underlying data are the least bit dynamic (and I think you'll have a hard time finding any that aren't). If you say "A" and "B" and "C" and "D" then I don't have to guess what you mean because you told me.
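A tiny illustration of that ambiguity, assuming a range is expanded against whatever ordered term list is current. The lists and the function name are hypothetical.

```python
def expand_range(first, last, ordered_terms):
    """Expand a "first-last" range against one version of an ordered term list."""
    start = ordered_terms.index(first)
    end = ordered_terms.index(last)
    if start > end:
        start, end = end, start
    return ordered_terms[start : end + 1]


# Two hypothetical versions of the same authority: "C" was added later.
terms_when_recorded = ["A", "B", "D", "E"]
terms_today = ["A", "B", "C", "D", "E"]

print(expand_range("A", "D", terms_when_recorded))  # ['A', 'B', 'D']
print(expand_range("A", "D", terms_today))          # ['A', 'B', 'C', 'D']
```

The stored data ("A-D") never changed, but what it means did. An explicit list of asserted values has no such dependency on the version of the authority.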

If you say "A" and "B" and "C" and "D" then I don't have to guess what you mean because you told me.

So for lithostratigraphy, that is what would happen. There would be an attribute for all asserted lithostrata and we wouldn't expect anyone to infer stuff between what is asserted or find the objects in any search except for those asserted terms.

For chronostratigraphy, we would like to record what is asserted (Neogene), but if someone searches Cenozoic, they would also find everything asserted as Neogene because Neogene is part of Cenozoic.
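A sketch of that search behavior, assuming the "is part of" links are available from somewhere (a cache, a webservice, or the code table). `PART_OF`, the function names and the locality numbers are illustrative.

```python
# "Is part of" links between chronostratigraphic terms; where these would be
# stored or fetched from is an open question in this thread.
PART_OF = {
    "Miocene": "Neogene",
    "Neogene": "Cenozoic",
    "Cenozoic": "Phanerozoic",
}


def search_terms(term, part_of):
    """Return `term` plus every term that is directly or indirectly part of it."""
    children = {}
    for child, parent in part_of.items():
        children.setdefault(parent, []).append(child)
    matched, queue = {term}, [term]
    while queue:
        for child in children.get(queue.pop(), []):
            if child not in matched:
                matched.add(child)
                queue.append(child)
    return matched


def find_localities(term, asserted, part_of):
    """Localities whose asserted term is `term` or anything contained in it."""
    wanted = search_terms(term, part_of)
    return sorted(loc for loc, value in asserted.items() if value in wanted)


asserted = {1: "Neogene", 2: "Cenozoic", 3: "Miocene", 4: "Jurassic"}
print(find_localities("Cenozoic", asserted, PART_OF))  # [1, 2, 3]
print(find_localities("Neogene", asserted, PART_OF))   # [1, 3]
```

Only the asserted term is stored on each locality; the expansion happens at search time.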

My comment above only concerns expressing multiple values as ranges vs. listing the individual values.

what is asserted

That's a separate thing, but it also needs solved. Knowing a bit more about how it's asserted would be useful to me. Is that from someone holding a hunk of bone with a catalog number on it and saying "looks Neogene" or is it something much less direct?

Is that from someone holding a hunk of bone with a catalog number on it and saying "looks Neogene" or is it something much less direct?

It is generally based upon the rock in which the fossil was found. This is why it is a locality attribute rather than an object attribute. The site itself would be Neogene in age.

Knowing a bit more about how it's asserted would be useful to me.

Usually it is 1) we have determined from the rocks that this locality is in such-and-such member of a particular formation, then 2) based on published information we know that this member in this geographic area is from a particular chronostratigraphic interval.

Depending on how thorough the published information is, that assessment is more or less precise. For example, if the published info says this formation has been dated to 150 million years ago (Late Jurassic) in Utah, but you are looking at the formation in New Mexico, then you know, well maybe we're looking at an error range of a couple million years. If the published info says this bed of this member in New Mexico has been dated to 150.2 million years and you have stratigraphic columns that place your layer very near that layer, then you know that date is pretty precise for your locality.

Rarely, you may directly date the rock layers or fossils found at your locality.

litho: individual tables

"international_chronostratigraphy", all one table

term not null - type not null - begin_mya null - end_mya null

begin and end are decimals

timeslot serves as (semi)hierarchy, can be used to arrange, support interval queries, etc.

@Jegelewicz @Nicole-Ridgwell-NMMNHS I'm a little ahead of schedule (yay!?) and needing code tables sooner rather than later.

Anything that we can't resolve in the fairly near future will be migrated as free-text attributes, which will not be lossy and will be easy to transform to something else...until someone adds something inconsistent, at which point it would become yet another cleanup nightmare. I strongly favor getting the datatypes sorted out now if possible.

Here are existing "types"

 TRS township
 TRS aliquot
 Land Vertebrate Faunachron
 bed
 Site Found By
 access
 Site Land Status
 block
 Site Found Date
 Land Mammal Age Subinterval
 Series/Epoch
 biozone
 Eon/Eonothem
 Erathem/Era
 formation
 member
 TRS section
 Site Field Number
 group
 Era/Erathem
 suite
 Stage/Age
 System/Period
 field
 TRS range
 Site Identifier
 Substage/Subage
 Land Mammal Age
 Site Collector Number

And here's my waffly plan for them.

Series/Epoch
Eon/Eonothem
Erathem/Era
Era/Erathem
Stage/Age
System/Period
Substage/Subage

will go to...

arctosprod@arctosutf>> \d ctinternational_chronostratigraphy
                                        Table "public.ctinternational_chronostratigraphy"
   Column    |          Type          | Collation | Nullable |                              Default                              
-------------+------------------------+-----------+----------+-------------------------------------------------------------------
 icsid       | integer                |           | not null | nextval('ctinternational_chronostratigraphy_icsid_seq'::regclass)
 term_type   | character varying(60)  |           | not null | 
 term        | character varying(255) |           | not null | 
 begin_mya   | numeric                |           |          | 
 end_mya     | numeric                |           |          | 
 description | character varying      |           | not null | 
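A sketch of how begin_mya/end_mya could stand in for the hierarchy, using an in-memory SQLite copy of the table above. SQLite is only a stand-in for PG here, `INTEGER PRIMARY KEY` replaces the sequence default, the descriptions are placeholders, the dates are approximate values that should be checked against the current international chart, and `terms_within` is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ctinternational_chronostratigraphy (
        icsid       INTEGER PRIMARY KEY,
        term_type   TEXT NOT NULL,
        term        TEXT NOT NULL,
        begin_mya   NUMERIC,
        end_mya     NUMERIC,
        description TEXT NOT NULL
    );
    INSERT INTO ctinternational_chronostratigraphy
        (term_type, term, begin_mya, end_mya, description)
    VALUES
        ('Erathem/Era',   'Cenozoic',  66.0,  0.0,   'placeholder'),
        ('System/Period', 'Paleogene', 66.0,  23.03, 'placeholder'),
        ('System/Period', 'Neogene',   23.03, 2.58,  'placeholder'),
        ('Series/Epoch',  'Miocene',   23.03, 5.333, 'placeholder');
    """
)


def terms_within(connection, term):
    """Terms whose interval lies entirely inside the interval of `term`."""
    rows = connection.execute(
        """
        SELECT child.term
        FROM ctinternational_chronostratigraphy AS parent
        JOIN ctinternational_chronostratigraphy AS child
          ON child.begin_mya <= parent.begin_mya
         AND child.end_mya   >= parent.end_mya
        WHERE parent.term = ?
        ORDER BY child.begin_mya DESC, child.end_mya, child.term
        """,
        (term,),
    )
    return [row[0] for row in rows]
```

Because begin_mya is the older (larger) bound, containment is `child.begin_mya <= parent.begin_mya AND child.end_mya >= parent.end_mya`. A search for Cenozoic would then also pick up anything asserted as Paleogene, Neogene or Miocene without a stored parent-child link.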

TRS township
TRS aliquot
TRS section
TRS range

are weird and I don't have any good suggestions. They should be in one data object if there's any expectation of multiple assertions or data will be lost (there'd be no way to tell which of the 12 TRS township values combine with which of the 12 TRS range values), and I can't see any good way to do that plus control them. Tentatively suggest a free-text concatenation; that's un-doable until someone adds something that doesn't fit the pattern (and we might be able to control the pattern - I'm not sure).
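A sketch of what "controlling the pattern" for a free-text concatenation might look like. The concatenation format below is an assumption made up for this example, not an Arctos convention, and `TRS_PATTERN` and `parse_trs` are hypothetical names.

```python
import re

# Assumed concatenation format (not an Arctos convention):
#   "T<number><N|S> R<number><E|W> S<1-36> <optional aliquot>"
TRS_PATTERN = re.compile(
    r"^T(?P<township>\d{1,3}[NS]) "
    r"R(?P<range>\d{1,3}[EW]) "
    r"S(?P<section>[1-9]|[12]\d|3[0-6])"
    r"(?: (?P<aliquot>.+))?$"
)


def parse_trs(text):
    """Split a concatenated TRS string into components, or return None if it does not fit."""
    match = TRS_PATTERN.match(text)
    return match.groupdict() if match else None


print(parse_trs("T12N R4E S23 NE"))
# {'township': '12N', 'range': '4E', 'section': '23', 'aliquot': 'NE'}
print(parse_trs("somewhere near the creek"))
# None
```

As long as every stored value parses, the concatenation can be split back into four parts later. The first value that returns `None` is the point where it stops being un-doable.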

Land Vertebrate Faunachron
Land Mammal Age Subinterval
Land Mammal Age

are I believe your "regional chronostratigraphy."

  • I'm tempted to suggest free-text as well, but they could also be individual code tables - but that could lead to needing 456776 more code tables for things like 'semi-aquatic arboreal salamander age' and much whining from me.
  • Perhaps we should drop "international" from ctinternational_chronostratigraphy and shove them in there?
  • new code table "ctregional_chrono" with values like "Land Vertebrate Faunachron: some text" and "Land Vertebrate Faunachron: Some Different Text"??

Does "biozone" belong here as well?

bed
formation
member
group
suite

  • individual code tables (ctbed, ..., ctsuite) OR another weird more-than-value code table "lithostratigraphy" with "type" and "value" columns??

Does block belong here as well?
And "field"?

access: new code table ctlocality_access with one value 'private'

Site Land Status: new code table

Site Found By & Site Found Date: merged into free-text attribute "Locality Documentation" (or something of the sort that'll also work for non-paleo usage).

Site Field Number: free-text locality attribute

Site Identifier: variation of "Site Field Number"?

Site Collector Number: variation of "Site Field Number"?

Land Vertebrate Faunachron
Land Mammal Age Subinterval
Land Mammal Age
are I believe your "regional chronostratigraphy."

These are not regional chronostratigraphy, they're biochronology and we do need code tables for them.

We had put the regional chronostratigraphy as a separate list under chronostratigraphy. Their "types" are Series/Epoch, Eon/Eonothem, etc. You'll need to look at the old hierarchy to separate them out.

biozone is biostratigraphy and also needs a code table.

individual code tables (ctbed, ..., ctsuite) OR another weird more-than-value code table "lithostratigraphy" with "type" and "value" columns??

You're asking whether we want these similar to chronostratigraphy where we're linking them to million years ago values? I say no to that mostly because I think it would get too complicated.

Does block belong here as well? And "field"?

I don't know, I don't know what was put under those.

biochronology

OK, I'm leaning more towards this:

  • rename ctinternational_chronostratigraphy to ctchronostratigraphy (presumably anyone working in the field will have no problem recognizing the international terms??)
  • add new code table ctbiochronology (columns term and term_type)
  • add new code table ctlithostratigraphy (columns term and term_type)

That's more complicated code tables to manage, BUT it should also cover the bases and not result in a bunch of DB work when someone shows up with a petroleum collection or something equally specialized/weird with its own rock/time/life terminology - those would then just be values in a table.

No, I think we DO NOT want time in the two additional tables, BUT if we're already making weird tables then making them slightly weirder isn't a problem - if there's some ancillary data that should accompany those data, now's the time to give it a home, if there's not then both of them will be term-type-definition (vs. the "normal" term-definition columns).
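For what it's worth, here is a minimal sketch of what a term/term_type code table might look like, using Python's built-in sqlite3 as a stand-in for the real database. The table and column names come from the proposal above; the column types, the composite key, the definition column, and the "Example Bed" row are assumptions for illustration only.

```python
import sqlite3

# In-memory stand-in for the real database (assumption: any SQL store works).
conn = sqlite3.connect(":memory:")

# Hypothetical layout: term + term_type + definition, keyed on (term, term_type)
# so the same name can exist under two different types.
conn.execute(
    """
    CREATE TABLE ctlithostratigraphy (
        term       TEXT NOT NULL,
        term_type  TEXT NOT NULL,
        definition TEXT,
        PRIMARY KEY (term, term_type)
    )
    """
)

# "Prince Creek" is the formation named in the issue; "Example Bed" is a placeholder.
conn.executemany(
    "INSERT INTO ctlithostratigraphy VALUES (?, ?, ?)",
    [
        ("Prince Creek", "formation", "pending"),
        ("Example Bed", "bed", "pending"),
    ],
)


def terms_of_type(term_type):
    """Return the sorted terms for one term_type, e.g. to fill a dropdown."""
    cur = conn.execute(
        "SELECT term FROM ctlithostratigraphy WHERE term_type = ? ORDER BY term",
        (term_type,),
    )
    return [row[0] for row in cur]
```

With this shape, a new kind of unit is just a new term_type value in the data rather than a new table.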

rename ctinternational_chronostratigraphy to ctchronostratigraphy (presumably anyone working in the field will have no problem recognizing the international terms??)

I'd rather have them separate so that we can specify that the international table is tightly controlled against the international standard vs. the 'regional' table, which requires that the unit be established in publications but is not standard. Having the two as separate attributes will also, I think, help encourage data users to enter the international unit even if they have a regional unit. Additionally, if we are using dates to control the chronostratigraphy hierarchy, the 'regional' terms will not have dating as firmly established as the international terms.

if there's some ancillary data that should accompany those data

What comes to mind for me is a publication citation.

help encourage data users to

I'd think it would be the other way - it's one more THING to find in the dropdown, and we could end up with quite a few locality attributes.

Depending on how serious about "encourage" you are that could be a data check as well.

The actual requirements for entering an authority are the same across all code tables - someone with the access (and presumably training and community approval and etc., but that doesn't always happen) types it in.

I see two problems with multiple code tables.

  1. Usability. A user (both someone entering data and someone looking for specific data) who's found THING should not have to keep looking for some subset of THING in some other place. That of course hinges on the idea that "chronostratigraphy" is a THING, rather than "regional_chronostratigraphy" and "international_chronostratigraphy" being two distinct THINGs, which is probably a bit subjective.
  2. Sustainability. If there's an appropriate code table, then adding new values is a matter of data and can be done by any user with access. If there's not an appropriate code table, then adding new values is a matter of rebuilding a bunch of structure, and that has to be done by developers.

I'm not yet arguing for or against anything in particular, just trying to get all the considerations on the table.

not have dating as firmly

Those are NULLable - it doesn't have to be entered if it's not known.

publication citation

That should be part of the definition, here and everywhere else. EVERY definition would ideally look like "Short summary; see DOI:.... for the real scoop...."

publication citation

That should be part of the definition, here and everywhere else. EVERY definition would ideally look like "Short summary; see DOI:.... for the real scoop...."

I'm going to plug the idea that having the link out to something would be better in its own spot and, just like we do with higher geography, could be a check for faulty data. If there isn't a wikidata item, you can't add the term, and if the wikidata item is already in use in the code table, you are probably trying to add a duplicate.
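A rough sketch of that check, assuming each code table term carries a Wikidata item id in its own field. The function name, the return strings, and the ids are hypothetical placeholders; the sketch only checks that the id is well-formed and unused, it does not look anything up at Wikidata.

```python
import re


def check_new_term(term, wikidata_id, existing):
    """Return 'ok' or a rejection reason for a proposed code table term.

    existing maps terms already in the code table to their Wikidata ids.
    """
    # Wikidata item ids are "Q" followed by digits.
    if not wikidata_id or not re.fullmatch(r"Q\d+", wikidata_id):
        return "rejected: no valid Wikidata item supplied"
    if term in existing:
        return f"rejected: '{term}' is already in the code table"
    # Same item under a different spelling is probably a duplicate.
    for used_term, used_id in existing.items():
        if used_id == wikidata_id:
            return f"rejected: {wikidata_id} is already used by '{used_term}'"
    return "ok"


# Placeholder ids, not real Wikidata items.
existing_terms = {"Example Formation": "Q0000001"}
```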

BTW for International Chron - this thing exists: https://vocabs.ardc.edu.au/repository/api/lda/csiro/international-chronostratigraphic-chart/geologic-time-scale-2020/resource?uri=http://resource.geosciml.org/classifier/ics/ischart/Burdigalian

TRS township
TRS aliquot
TRS section
TRS range

are weird and I don't have any good suggestions. They should be in one data object if there's any expectation of multiple assertions or data will be lost (there'd be no way to tell which of the 12 TRS township values combine with which of the 12 TRS range values), and I can't see any good way to do that plus control them. Tentatively suggest a free-text concatenation; that's un-doable until someone adds something that doesn't fit the pattern (and we might be able to control the pattern - I'm not sure).

Yes, they are weird and I really dislike them, but we are stuck with them. Ideally these would be a kind of linked attribute, where you had to select one thing for each (none could be NULL and only TRS aliquot would have the option of "unknown"). The fact is, the ones for NMMNH had to be reviewed and cleaned up, so they are fairly consistent - all the TRS info at UTEP is in a remark field and who knows where it is for other collections. No one has put up an issue trying to find everything in "T19N R01E sec03 N1/2". I suggest we concatenate those attributes for NMMNH localities, name the locality attribute "Public Land Survey System Plat" (https://en.wikipedia.org/wiki/Public_Land_Survey_System), and allow free text.

@Nicole-Ridgwell-NMMNHS objections?
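If the concatenated values really do follow the "T19N R01E sec03 N1/2" shape, the pattern could perhaps be controlled (and split back apart) with something like the sketch below. The regular expression is an assumption built from that one example string, not a full Public Land Survey System grammar, and the function name is hypothetical.

```python
import re

# Assumed pattern: township, range, section, optional aliquot, space-separated.
TRS_PATTERN = re.compile(
    r"^T(?P<township>\d+[NS]) "
    r"R(?P<range>\d+[EW]) "
    r"sec(?P<section>\d+)"
    r"(?: (?P<aliquot>\S.*))?$"
)


def parse_trs(text):
    """Split a concatenated TRS string into parts, or return None if it does not fit."""
    match = TRS_PATTERN.match(text.strip())
    if match is None:
        return None
    return {
        "township": "T" + match.group("township"),
        "range": "R" + match.group("range"),
        "section": int(match.group("section")),
        # Only the aliquot may be missing; record that as "unknown".
        "aliquot": match.group("aliquot") or "unknown",
    }
```

Anything that returns None would be a value that "doesn't fit the pattern" and needs a human to look at it.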

OK, I'm leaning more towards this:

rename ctinternational_chronostratigraphy to ctchronostratigraphy (presumably anyone working in the field will have no problem recognizing the international terms??)
add new code table ctbiochronology (columns term and term_type)
add new code table ctlithostratigraphy (columns term and term_type)

That's more complicated code tables to manage, BUT it should also cover the bases and not result in a bunch of DB work when someone shows up with a petroleum collection or something equally specialized/weird with its own rock/time/life terminology - those would then just be values in a table.

I would be on board with this. Although it would be nice for everyone to use the international chron terms, if the table were constructed this way then people would still find stuff that uses only regional terms as long as we assign those regional terms dates.

the 'regional' terms will not have dating as firmly established as the international terms.

But they DO have dates, correct? Even if they are less managed, they exist and I think it would facilitate discovery to have these combined with the international strata. Nothing says you can't have more than one of these attributes assigned to a locality, so you could still pick an international and a regional.
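To make the "discovery by dates" idea concrete, here is a small sketch of how terms with date ranges could find each other without any hierarchy. The term names and ages are placeholders, not real values, and the dictionary layout is an assumption.

```python
def overlaps(a, b):
    """True if two (older_ma, younger_ma) intervals share any time.

    Intervals that only touch at a boundary count as overlapping.
    """
    a_old, a_young = a
    b_old, b_young = b
    return a_young <= b_old and b_young <= a_old


# Placeholder terms with ages in millions of years ago (Ma); not real values.
TERM_DATES = {
    "International Term A": (100.0, 50.0),
    "Regional Term B": (60.0, 40.0),
    "Regional Term C": (30.0, 20.0),
}


def terms_overlapping(term):
    """Return the other terms whose date range overlaps the given term's range."""
    target = TERM_DATES[term]
    return sorted(
        other
        for other, span in TERM_DATES.items()
        if other != term and overlaps(span, target)
    )
```

A search on the international term would then also pick up localities that carry only the overlapping regional term.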

Can we order the code table so the international terms appear before the regional ones, making the regional ones harder to get to?

And not to throw a wrench in things, but should biochronology have associated dates?

link out to something would be better in its own spot

No real disagreement, but I think that's a major revision (eg, all code tables). In the meantime we could perhaps develop some sort of markup - [[http.....]] is a documentation link or something.
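As a sketch of what that markup could look like in practice: the [[http...]] convention is only the suggestion above, and the pattern and function name here are assumptions.

```python
import re

# Assumed markup: a URL wrapped in double square brackets is a documentation link.
DOC_LINK = re.compile(r"\[\[(https?://[^\]\s]+)\]\]")


def documentation_links(definition):
    """Return every [[url]] documentation link found in a definition string."""
    return DOC_LINK.findall(definition)
```

For example, documentation_links("Short summary; see [[https://en.wikipedia.org/wiki/Neogene]]") would pull out the one URL, and a definition of "pending" would yield an empty list.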

like we do with higher geography

The thing with HG is that wikipedia has data - I check stuff and throw up big red warnings if it seems like something weird is going on. (And we should do more - there's an Issue....). Whatever we do for all code tables would ideally support something similar. Your link is nice (https://vocabs.ardc.edu.au/repository/api/lda/csiro/international-chronostratigraphic-chart/geologic-time-scale-2020/resource.json?uri=http://resource.geosciml.org/classifier/ics/ischart/Neogene is nicer) but writing code to hundreds of different sources would be a full-time job or three.

Public Land Survey System Plat

I'll run with that unless someone has a better idea in the very near future.

still find stuff ... dates.

I'd call that another reason to unify - I can't really imagine what doing that across tables might look like.

Can we order the code table

Sorta-sometimes, in the local UI, if there's something non-magical to sort by. In some API-based app, not so much.

Your link is nice (https://vocabs.ardc.edu.au/repository/api/lda/csiro/international-chronostratigraphic-chart/geologic-time-scale-2020/resource.json?uri=http://resource.geosciml.org/classifier/ics/ischart/Neogene is nicer)

Actually, I was wondering if we couldn't use their API for our code table values somehow.

The thing with HG is that wikipedia has data - I check stuff and throw up big red warnings if it seems like something weird is going on. (And we should do more - there's an Issue....)

So does Wikidata - and it is more structured. But wikipedia could work for the chronostratigraphy too - there are pages for all those things.

Can we order the code table

Sorta-sometimes, in the local UI, if there's something non-magical to sort by. In some API-based app, not so much.

Yeah, I am mostly concerned with picks during data entry and search, so local would be good enough for me. @Nicole-Ridgwell-NMMNHS needs to weigh in. Also once we have something to look at in test, I'd like to bring in a few others before we pull triggers.

Also once we have something to look at in test, I'd like to bring in a few others before we pull triggers.

There's some functional stuff (http://test.arctos.database.museum/guid/UAMb:Herb:52204), but this is getting a little chicken-or-eggy - I can't move data in until I know how to, you can't see things work until I move stuff. Is there something in particular you want to see?

For the sort, I'm thinking we should add an explicit sort_order column, especially if we're merging international and "local" and you want THIS "1-10" separated from THAT "1-10." I've got quite a bit of time in that table - mostly avoiding the non-normal layout - and I don't want to re-do that too many more times. How can we get this sorted out - another call?
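A minimal sketch of the explicit sort_order idea, again with sqlite3 standing in for the real database. The column name matches the suggestion above; the table layout and the numbers are assumptions (lower numbers for the international terms, a higher block for the regional ones).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ctchronostratigraphy ("
    "term TEXT PRIMARY KEY, sort_order INTEGER NOT NULL)"
)

# Sort values are arbitrary placeholders; Chesterian is the North American term.
conn.executemany(
    "INSERT INTO ctchronostratigraphy VALUES (?, ?)",
    [("Chesterian", 1010), ("Neogene", 20), ("Quaternary", 10)],
)


def dropdown_terms():
    """Return terms in explicit sort_order, falling back to name for ties."""
    cur = conn.execute(
        "SELECT term FROM ctchronostratigraphy ORDER BY sort_order, term"
    )
    return [row[0] for row in cur]
```

An alphabetical sort would put Chesterian first; with sort_order it lands after the international terms.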

I'm going to plug the idea that having the link out to something would be better in its own spot and, just like we do with higher geography, could be a check for faulty data. If there isn't a wikidata item, you can't add the term, and if the wikidata item is already in use in the code table, you are probably trying to add a duplicate.

I wasn't thinking of wikidata. I was thinking an actual scientific paper where the lithostratigraphic unit is described like what can be done with taxonomy.

No one has put up an issue trying to find everything in "T19N R01E sec03 N1/2".

Actually, the very first thing I ever posted on Github about TRS was a request that it be entered in a searchable format. I frequently have to search for localities by TRS and having this data concatenated would make that very difficult.

as we assign those regional terms dates

As I said before, this is difficult if not impossible. It would require a deep search through the literature (for which access may be difficult) for data which may or may not exist, and even if it does exist, its accuracy may be debatable because it hasn't been established in the same way that the international chronostratigraphy has (i.e. the 'golden spike' system that ties the rocks to the geochronology).

I still don't like the idea of throwing the regional terms in with the standardized international ones. For the regional terms, a subset of geologists don't even think we should use them anymore. Throwing them in with the international terms gives the regional terms a sense of validity that may or may not exist.

Depending on how serious about "encourage" you are that could be a data check as well.

Not a bad idea if we can somehow just do this for localities with geology.

should biochronology have associated dates

We could do this for North American Land Mammal Ages, but probably nothing else.

I wasn't thinking of wikidata. I was thinking an actual scientific paper where the lithostratigraphic unit is described like what can be done with taxonomy.

We can add that scientific paper to the Wikidata item, making it available to the WORLD and not just Arctos. I expect to propose this for taxonomy eventually....

Actually, the very first thing I ever posted on Github about TRS was a request that it be entered in a searchable format. I frequently have to search for localities by TRS and having this data concatenated would make that very difficult.

Fair. So I suggest we just keep the four code tables. The "assertion", should there be more than one, could be put together by the determiner and date associated with each attribute IF anyone completes those fields...

I still don't like the idea of throwing the regional terms in with the standardized international ones. For the regional terms, a subset of geologists don't even think we should use them anymore. Throwing them in with the international terms gives the regional terms a sense of validity that may or may not exist.

With regard to "subset of geologists don't even think we should use them anymore", this is so much like taxonomy (ugh). Should we even support their use? Maybe Arctos policy should be that any regional term be placed in the remark of the appropriate international term. If we don't want to be that strict, then maybe we should call this list "other stratigraphy" and move on? But we should warn everyone that these are in no way associated with the ICS terms that can do cool things in search.....

Depending on how serious about "encourage" you are that could be a data check as well.

Not a bad idea if we can somehow just do this for localities with geology.

This could be part of "Low Quality Data" - any locality with an "other stratigraphy" attribute, but no "ctchronostratigraphy" attribute gets flagged as low quality. Not perfect, but might "encourage" people to add the appropriate international term...
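Roughly, that check might look like the sketch below, with sqlite3 standing in for the real database. The locality_attributes layout and the attribute type names ("other stratigraphy", "chronostratigraphy") are assumptions, and the rows are sample data only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE locality_attributes ("
    "locality_id INTEGER, attribute_type TEXT, attribute_value TEXT)"
)

# Sample rows: locality 1 has both kinds, 2 has only "other", 3 only international.
conn.executemany(
    "INSERT INTO locality_attributes VALUES (?, ?, ?)",
    [
        (1, "other stratigraphy", "Chesterian"),
        (1, "chronostratigraphy", "Serpukhovian"),
        (2, "other stratigraphy", "Morrowan"),
        (3, "chronostratigraphy", "Neogene"),
    ],
)


def low_quality_localities():
    """Localities with an 'other stratigraphy' attribute but no chronostratigraphy."""
    cur = conn.execute(
        """
        SELECT DISTINCT o.locality_id
        FROM locality_attributes o
        WHERE o.attribute_type = 'other stratigraphy'
          AND NOT EXISTS (
              SELECT 1
              FROM locality_attributes c
              WHERE c.locality_id = o.locality_id
                AND c.attribute_type = 'chronostratigraphy'
          )
        ORDER BY o.locality_id
        """
    )
    return [row[0] for row in cur]
```

Only locality 2 would be flagged here; localities with no geology at all are never touched.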

should biochronology have associated dates

We could do this for North American Land Mammal Ages, but probably nothing else.

OK, let's go with Dusty's plan for this table then, no dates.

be placed in the remark of the appropriate international term.

But that decreases searchability and introduces misspelling possibilities. The fact is that even though there is dispute about the use of regional terms, in certain regions (including areas where our curators collect) these terms are used pretty heavily, usually because the international units don't map onto the local stratigraphy very well.

This could be part of "Low Quality Data" - any locality with an "other stratigraphy" attribute, but no "ctchronostratigraphy" attribute gets flagged as low quality. Not perfect, but might "encourage" people to add the appropriate international term...

Sounds good to me.

Fair. So I suggest we just keep the four code tables. The "assertion", should there be more than one, could be put together by the determiner and date associated with each attribute IF anyone completes those fields...

Not the best solution, but more searchable than a concatenated field. Maybe eventually we could get these connected?

actual scientific paper where the lithostratigraphic unit is described like what can be done with taxonomy.

Can that be "phase two"? Arctos is primarily a specimen database; adding formal authorities to our authorities isn't something we've done before, and isn't something we should or can approach from here.

TRS...searchable

I'll need details; what do you want to search for? Is that actually supported by 4 (probably sometimes...) independent text objects?? I am less than enthusiastic about replacing our broken-because-huge code table with a slightly less huge code table (there's no limit on the number of aliquots, I believe), and the idea of a "temporary" code table is terrifying. Unless there are STRONG objections I'll proceed with 4 free-text attributes and a new issue that can be prioritized. (I suspect we'll end up finding or building a datatype - wonder if postgis has anything?)

  • any locality with an "other stratigraphy" attribute, but no "ctchronostratigraphy" attribute gets flagged

Issue please; I don't think that's a problem, and it's more robust+easier to write than what I had in mind.

And from there, how about three overly-complex code tables:

  • int-strat
  • bio
  • litho

and a periodic report on the free-text "non-authoritative chronostratigraphy" (or whatever you want to call it) attribute? Seems like that would avoid the problems and still do what you need, and if it's not working we can revisit later.

so much like taxonomy

I'm not sure - from here it looks like there is a comprehensive international nomenclature (so maybe like virus taxonomy), some just refuse to use it. If that's remotely correct, then this is much more like identifications than taxonomy and so I'm suggesting an identification-like structure for it.

Can that be "phase two"?

I'm fine with that.

what do you want to search for?

Usually I'm asked something like - find all the localities in this Township/Range + list of sections. If you are worried about code tables getting unwieldy, it would be ok to do aliquots as a text field.
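A sketch of that kind of search against separate TRS attributes, with sqlite3 standing in for the real database. The locality_attributes layout is an assumption, the rows are sample data, and the query only behaves if each locality has a single TRS assertion (with several, there is no way to tell which township goes with which range).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE locality_attributes ("
    "locality_id INTEGER, attribute_type TEXT, attribute_value TEXT)"
)
conn.executemany(
    "INSERT INTO locality_attributes VALUES (?, ?, ?)",
    [
        (1, "TRS township", "T19N"), (1, "TRS range", "R01E"), (1, "TRS section", "3"),
        (2, "TRS township", "T19N"), (2, "TRS range", "R01E"), (2, "TRS section", "12"),
        (3, "TRS township", "T20N"), (3, "TRS range", "R01E"), (3, "TRS section", "3"),
    ],
)


def find_localities(township, rng, sections):
    """Localities in one township/range whose section is in the given list."""
    if not sections:
        return []
    placeholders = ",".join("?" for _ in sections)
    sql = f"""
        SELECT DISTINCT t.locality_id
        FROM locality_attributes t
        JOIN locality_attributes r ON r.locality_id = t.locality_id
        JOIN locality_attributes s ON s.locality_id = t.locality_id
        WHERE t.attribute_type = 'TRS township' AND t.attribute_value = ?
          AND r.attribute_type = 'TRS range' AND r.attribute_value = ?
          AND s.attribute_type = 'TRS section'
          AND s.attribute_value IN ({placeholders})
        ORDER BY t.locality_id
    """
    cur = conn.execute(sql, [township, rng, *sections])
    return [row[0] for row in cur]
```

Note that this relies on exact string matches, so "3" and "03" would need to be normalized before the values go in.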

aliquots as a text field.

That seems workable. I think we still need a datatype, but maybe that can make it lower-priority.

Here's my concern with a single litho attribute - the code table will be HUGE and we will end up with confusion when we have things with the same "name", e.g. Chinle Group and Chinle Formation (don't know if those are real, just too lazy to look up an actual example).

Not that we can't deal with that, but maybe we should consider simplifying and just having separate attributes for:

Lithostratigraphic group
Lithostratigraphic formation
Lithostratigraphic member
Lithostratigraphic bed
Lithostratigraphic flow

These kinds of lithostrata are defined by the ICS. Any other terms (suite, etc) go in Lithostratigraphic other and we could treat these as low quality data if no other litho term is applied to the locality?

Also noticed this - could be of use with regard to the bio stuff which might also end up being a huge table.

Just want to make sure we have looked at all options...

simplifying and just having separate attributes

Yay: it's a huge simplification, those would be "normal" code tables (right??)

Maybe-not-so-yay: it's not clear to me that there's a limit on those things - eg do the petroleum-or-something people think things not included in those 5 are low quality?

what are

group
suite
block
field

?

Even 10 litho-tables seems workable - is that a semi-realistic cap?? If so I'm all in.

Below is what I was about to insert into ctinternational_chronostratigraphy. There are currently 168 rows, and who knows if it's complete. That does not make a very usable dropdown!

I'm (back to?) thinking that multiple tables might make more sense here, and I'll just have to figure out how to systematically deal with more-complex code tables.

????????????????????????

arctosprod@arctos>> select 
arctos-> attribute,
arctos-> attribute_value,
arctos-> description
arctos-> from
arctos-> geology_attribute_hierarchy
arctos-> where
arctos-> attribute in (
arctos(> 'Series/Epoch',
arctos(> 'Eon/Eonothem',
arctos(> 'Erathem/Era',
arctos(> 'Stage/Age',
arctos(> 'System/Period',
arctos(> 'Substage/Subage'
arctos(> );
    attribute    |   attribute_value    |                                                                                        description                                                                                        
-----------------+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 System/Period   | Quaternary           | <a target="_blank" class="external" href="https://en.wikipedia.org/wiki/Quaternary">Wikipedia</a>
 Erathem/Era     | Cenozoic             | https://en.wikipedia.org/wiki/Cenozoic
 System/Period   | Neogene              | <a target="_blank" class="external" href="https://en.wikipedia.org/wiki/Neogene">Wikipedia</a>
 Series/Epoch    | Miocene              | https://en.wikipedia.org/wiki/Miocene
 Series/Epoch    | Pliocene             | https://en.wikipedia.org/wiki/Pliocene
 Series/Epoch    | Holocene             | https://en.wikipedia.org/wiki/Holocene
 System/Period   | Cambrian             | https://en.wikipedia.org/wiki/Cambrian
 Stage/Age       | Calabrian            | https://en.wikipedia.org/wiki/Calabrian_(stage)
 Stage/Age       | Ionian               | pending
 Stage/Age       | Upper Pleistocene    | https://en.wikipedia.org/wiki/Late_Pleistocene
 Series/Epoch    | Pleistocene          | https://en.wikipedia.org/wiki/Pleistocene
 Series/Epoch    | Mississippian        | https://en.wikipedia.org/wiki/Mississippian_(geology)
 Series/Epoch    | Upper Pennsylvanian  | pending
 System/Period   | Cretaceous           | https://en.wikipedia.org/wiki/Cretaceous
 System/Period   | Jurassic             | https://en.wikipedia.org/wiki/Jurassic
 System/Period   | Triassic             | https://en.wikipedia.org/wiki/Triassic
 Erathem/Era     | Paleozoic            | https://en.wikipedia.org/wiki/Paleozoic
 System/Period   | Silurian             | https://en.wikipedia.org/wiki/Silurian
 System/Period   | Ordovician           | https://en.wikipedia.org/wiki/Ordovician
 Series/Epoch    | Upper Cretaceous     | https://en.wikipedia.org/wiki/Late_Cretaceous
 Series/Epoch    | Lower Cretaceous     | https://en.wikipedia.org/wiki/Early_Cretaceous
 Series/Epoch    | Pennsylvanian        | https://en.wikipedia.org/wiki/Pennsylvanian_(geology)
 Series/Epoch    | Middle Pennsylvanian | pending
 Series/Epoch    | Lower Pennsylvanian  | pending
 Series/Epoch    | Upper Mississippian  | pending
 Series/Epoch    | Middle Mississippian | pending
 Series/Epoch    | Lower Mississippian  | pending
 Series/Epoch    | Upper Devonian       | https://en.wikipedia.org/wiki/Devonian#subdivisions
 Series/Epoch    | Middle Devonian      | https://en.wikipedia.org/wiki/Devonian#subdivisions
 Series/Epoch    | Lower Devonian       | https://en.wikipedia.org/wiki/Devonian#subdivisions
 Erathem/Era     | Mesozoic             | https://en.wikipedia.org/wiki/Mesozoic
 System/Period   | Permian              | https://en.wikipedia.org/wiki/Permian
 System/Period   | Carboniferous        | https://en.wikipedia.org/wiki/Carboniferous
 System/Period   | Devonian             | https://en.wikipedia.org/wiki/Devonian
 Series/Epoch    | Upper Jurassic       | https://en.wikipedia.org/wiki/Late_Jurassic
 Series/Epoch    | Middle Jurassic      | https://en.wikipedia.org/wiki/Middle_Jurassic
 Series/Epoch    | Lower Jurasssic      | https://en.wikipedia.org/wiki/Early_Jurassic
 Series/Epoch    | Upper Triassic       | https://en.wikipedia.org/wiki/Late_Triassic
 Series/Epoch    | Middle Triassic      | https://en.wikipedia.org/wiki/Middle_Triassic
 Series/Epoch    | Lower Triassic       | https://en.wikipedia.org/wiki/Early_Triassic
 Stage/Age       | Maastrichtian        | https://en.wikipedia.org/wiki/Maastrichtian
 Stage/Age       | Barremian            | https://en.wikipedia.org/wiki/Barremian
 Stage/Age       | Valanginian          | https://en.wikipedia.org/wiki/Valanginian
 Stage/Age       | Berriasian           | https://en.wikipedia.org/wiki/Berriasian
 System/Period   | Paleogene            | https://en.wikipedia.org/wiki/Paleogene
 Stage/Age       | Tithonian            | https://en.wikipedia.org/wiki/Tithonian
 Series/Epoch    | Paleocene            | https://en.wikipedia.org/wiki/Paleocene
 Series/Epoch    | Eocene               | https://en.wikipedia.org/wiki/Eocene
 Stage/Age       | Oxfordian            | https://en.wikipedia.org/wiki/Oxfordian_(stage)
 Stage/Age       | Bathonian            | https://en.wikipedia.org/wiki/Bathonian
 Stage/Age       | Hettangian           | https://en.wikipedia.org/wiki/Hettangian
 Stage/Age       | Rhaetian             | https://en.wikipedia.org/wiki/Rhaetian
 Stage/Age       | Norian               | https://en.wikipedia.org/wiki/Norian
 Stage/Age       | Carnian              | https://en.wikipedia.org/wiki/Carnian
 Stage/Age       | Ladinian             | https://en.wikipedia.org/wiki/Ladinian
 Stage/Age       | Santonian            | https://en.wikipedia.org/wiki/Santonian
 Stage/Age       | Turonian             | https://en.wikipedia.org/wiki/Turonian
 Eon/Eonothem    | Phanerozoic          | <a target="_blank" class="external" href="https://en.wikipedia.org/wiki/Phanerozoic">Wikipedia</a>
 Eon/Eonothem    | Proterozoic          | https://en.wikipedia.org/wiki/Proterozoic
 Eon/Eonothem    | Archean              | https://en.wikipedia.org/wiki/Archean
 Stage/Age       | Campanian            | https://en.wikipedia.org/wiki/Campanian
 Stage/Age       | Albian               | https://en.wikipedia.org/wiki/Albian
 Stage/Age       | Hauterivian          | https://en.wikipedia.org/wiki/Hauterivian
 Series/Epoch    | Oligocene            | https://en.wikipedia.org/wiki/Oligocene
 Stage/Age       | Bajocian             | https://en.wikipedia.org/wiki/Bajocian
 Stage/Age       | Pliensbachian        | https://en.wikipedia.org/wiki/Pliensbachian
 Stage/Age       | Sinemurian           | https://en.wikipedia.org/wiki/Sinemurian
 Stage/Age       | Anisian              | https://en.wikipedia.org/wiki/Anisian
 Stage/Age       | Olenekian            | https://en.wikipedia.org/wiki/Olenekian
 Stage/Age       | Induan               | https://en.wikipedia.org/wiki/Induan
 Eon/Eonothem    | Precambrian          | pending
 Stage/Age       | Coniacian            | https://en.wikipedia.org/wiki/Coniacian
 Stage/Age       | Cenomanian           | https://en.wikipedia.org/wiki/Cenomanian
 Stage/Age       | Aptian               | https://en.wikipedia.org/wiki/Aptian
 Stage/Age       | Callovian            | https://en.wikipedia.org/wiki/Callovian
 Stage/Age       | Toarcian             | https://en.wikipedia.org/wiki/Toarcian
 Stage/Age       | Kimmeridgian         | https://en.wikipedia.org/wiki/Kimmeridgian
 Stage/Age       | Aalenian             | https://en.wikipedia.org/wiki/Aalenian
 Series/Epoch    | Upper Ordovician     | https://en.wikipedia.org/wiki/Ordovician#Subdivisions
 Series/Epoch    | Middle Ordovician    | https://en.wikipedia.org/wiki/Ordovician#Subdivisions
 Series/Epoch    | Lower Ordovician     | https://en.wikipedia.org/wiki/Ordovician#Subdivisions
 Series/Epoch    | Lopingian            | https://en.wikipedia.org/wiki/Lopingian
 Series/Epoch    | Guadalupian          | https://en.wikipedia.org/wiki/Guadalupian
 Series/Epoch    | Cisuralian           | https://en.wikipedia.org/wiki/Cisuralian
 Series/Epoch    | Ludlow               | https://en.wikipedia.org/wiki/Ludlow_epoch
 Stage/Age       | Chesterian           | A subdivision of the Mississippian in the North American system. Includes the top of the Visean plus the Serpukhovian. https://en.wikipedia.org/wiki/Mississippian_(geology)#Subdivisions
 Stage/Age       | Rupelian             | https://en.wikipedia.org/wiki/Rupelian
 Series/Epoch    | Famennian            | https://en.wikipedia.org/wiki/Famennian
 Stage/Age       | Chattian             | https://en.wikipedia.org/wiki/Chattian
 Stage/Age       | Serpukhovian         | https://en.wikipedia.org/wiki/Serpukhovian
 Series/Epoch    | Pridoli              | https://en.wikipedia.org/wiki/Pridoli_epoch
 Stage/Age       | Emsian               | https://en.wikipedia.org/wiki/Emsian
 Stage/Age       | Danian               | https://en.wikipedia.org/wiki/Danian
 Stage/Age       | Ypresian             | https://en.wikipedia.org/wiki/Ypresian
 Stage/Age       | Bashkirian           | https://en.wikipedia.org/wiki/Bashkirian
 Stage/Age       | Morrowan             | pending
 Stage/Age       | Frasnian             | https://en.wikipedia.org/wiki/Frasnian
 Series/Epoch    | Lower Cambrian       | pending
 Stage/Age       | Bartonian            | https://en.wikipedia.org/wiki/Bartonian
 Stage/Age       | Priabonian           | https://en.wikipedia.org/wiki/Priabonian
 Stage/Age       | Lutetian             | https://en.wikipedia.org/wiki/Lutetian
 Stage/Age       | Aquitanian           | https://macrostrat.org/sift/#/interval/19 and https://en.wikipedia.org/wiki/Aquitanian_(stage)
 Stage/Age       | Artinskian           | https://macrostrat.org/sift/#/interval/81 and https://en.wikipedia.org/wiki/Artinskian
 Stage/Age       | Kungurian            | https://macrostrat.org/sift/#/interval/79 and https://en.wikipedia.org/wiki/Kungurian
 Stage/Age       | Burdigalian          | https://en.wikipedia.org/wiki/Burdigalian
 Stage/Age       | Eifelian             | https://macrostrat.org/sift/#/interval/100 and https://en.wikipedia.org/wiki/Eifelian
 Stage/Age       | Givetian             | https://en.wikipedia.org/wiki/Givetian
 Stage/Age       | Piacenzian           | https://macrostrat.org/sift/#/interval/11 and https://en.wikipedia.org/wiki/Piacenzian
 Stage/Age       | Roadian              | https://macrostrat.org/sift/#/interval/226 and https://en.wikipedia.org/wiki/Roadian
 Stage/Age       | Serravallian         | https://macrostrat.org/sift/#/interval/16 and https://en.wikipedia.org/wiki/Serravallian
 Stage/Age       | Langhian             | https://en.wikipedia.org/wiki/Langhian
 Stage/Age       | Thanetian            | https://en.wikipedia.org/wiki/Thanetian
 Stage/Age       | Changhsingian        | https://en.wikipedia.org/wiki/Changhsingian
 Stage/Age       | Sakmarian            | https://en.wikipedia.org/wiki/Sakmarian
 Stage/Age       | Asselian             | https://en.wikipedia.org/wiki/Asselian
 Stage/Age       | Moscovian            | https://en.wikipedia.org/wiki/Moscovian_(Carboniferous)
 Stage/Age       | Visean               | https://en.wikipedia.org/wiki/Vis%C3%A9an
 Stage/Age       | Tournaisian          | https://en.wikipedia.org/wiki/Tournaisian
 Stage/Age       | Pragian              | https://en.wikipedia.org/wiki/Pragian
 Stage/Age       | Lochkovian           | https://en.wikipedia.org/wiki/Lochkovian
 Stage/Age       | Sheinwoodian         | https://en.wikipedia.org/wiki/Sheinwoodian
 Series/Epoch    | Llandovery           | https://en.wikipedia.org/wiki/Llandovery_epoch
 Stage/Age       | Telychian            | https://en.wikipedia.org/wiki/Telychian
 Stage/Age       | Hirnantian           | https://en.wikipedia.org/wiki/Hirnantian
 Stage/Age       | Katian               | https://en.wikipedia.org/wiki/Katian
 Stage/Age       | Sandbian             | https://en.wikipedia.org/wiki/Sandbian
 Stage/Age       | Darriwilian          | https://en.wikipedia.org/wiki/Darriwilian
 Stage/Age       | Dapingian            | https://en.wikipedia.org/wiki/Dapingian
 Stage/Age       | Tremadocian          | https://en.wikipedia.org/wiki/Tremadocian
 Stage/Age       | Cambrian Stage 10    | https://en.wikipedia.org/wiki/Cambrian_Stage_10
 Stage/Age       | Jiangshanian         | https://en.wikipedia.org/wiki/Jiangshanian
 Series/Epoch    | Miaolingian          | https://en.wikipedia.org/wiki/Miaolingian
 Stage/Age       | Cambrian Stage 3     | https://en.wikipedia.org/wiki/Cambrian_Stage_3
 Stage/Age       | Selandian            | https://macrostrat.org/sift/#/interval/273 and https://en.wikipedia.org/wiki/Selandian
 Stage/Age       | Tortonian            | https://macrostrat.org/sift/#/interval/15 and https://en.wikipedia.org/wiki/Tortonian
 Stage/Age       | Wordian              | https://macrostrat.org/sift/#/interval/225 and https://en.wikipedia.org/wiki/Wordian
 Stage/Age       | Meghalayan           | https://en.wikipedia.org/wiki/Meghalayan
 Stage/Age       | Greenlandian         | https://en.wikipedia.org/wiki/Greenlandian
 Stage/Age       | Wuchiapingian        | https://en.wikipedia.org/wiki/Wuchiapingian
 Stage/Age       | Gzhelian             | https://en.wikipedia.org/wiki/Gzhelian
 Stage/Age       | Kasimovian           | https://en.wikipedia.org/wiki/Kasimovian
 Stage/Age       | Ludfordian           | https://en.wikipedia.org/wiki/Ludfordian
 Stage/Age       | Gorstian             | https://en.wikipedia.org/wiki/Gorstian
 Series/Epoch    | Wenlock              | https://en.wikipedia.org/wiki/Wenlock_epoch
 Stage/Age       | Homerian             | https://en.wikipedia.org/wiki/Homerian
 Stage/Age       | Aeronian             | https://en.wikipedia.org/wiki/Aeronian
 Stage/Age       | Rhuddanian           | https://en.wikipedia.org/wiki/Rhuddanian
 Stage/Age       | Drumian              | https://en.wikipedia.org/wiki/Drumian
 Stage/Age       | Virgilian            | https://en.wikipedia.org/wiki/Virgilian_series
 Stage/Age       | Capitanian           | https://macrostrat.org/sift/#/interval/224 and https://en.wikipedia.org/wiki/Capitanian
 Stage/Age       | Zanclean             | https://macrostrat.org/sift/#/interval/12 and https://en.wikipedia.org/wiki/Zanclean
 Stage/Age       | Northgrippian        | https://en.wikipedia.org/wiki/Northgrippian
 Stage/Age       | Gelasian             | https://en.wikipedia.org/wiki/Gelasian
 Stage/Age       | Messinian            | https://en.wikipedia.org/wiki/Messinian
 Stage/Age       | Floian               | https://en.wikipedia.org/wiki/Floian
 Series/Epoch    | Furongian            | https://en.wikipedia.org/wiki/Furongian
 Stage/Age       | Paibian              | https://en.wikipedia.org/wiki/Paibian
 Stage/Age       | Guzhangian           | https://en.wikipedia.org/wiki/Guzhangian
 Stage/Age       | Wuliuan              | https://en.wikipedia.org/wiki/Wuliuan
 Series/Epoch    | Cambrian Series 2    | https://en.wikipedia.org/wiki/Cambrian_Series_2
 Stage/Age       | Cambrian Stage 4     | https://en.wikipedia.org/wiki/Cambrian_Stage_4
 Series/Epoch    | Terreneuvian         | https://en.wikipedia.org/wiki/Terreneuvian
 Stage/Age       | Cambrian Stage 2     | https://en.wikipedia.org/wiki/Cambrian_Stage_2
 Stage/Age       | Fortunian            | https://en.wikipedia.org/wiki/Fortunian
 System/Period   | Stenian              | https://en.wikipedia.org/wiki/Stenian
 Substage/Subage | Smithian             | The Olenekian is sometimes divided into the Smithian and the Spathian subages or substages; https://en.wikipedia.org/wiki/Olenekian
 Series/Epoch    | Leonardian           | https://ngmdb.usgs.gov/Geolex/UnitRefs/LeonardianRefs_11856.html
 Series/Epoch    | Missourian           | https://ilstratwiki.web.illinois.edu/index.php/Missourian_Series
(168 rows)

From http://quaternary.stratigraphy.org/stratigraphic-guide/lithodemic-stratigraphy/: suite is basically the equivalent of group for lithodemic rock units.

Bodies of rock which do not conform to the Law of Superposition are described as lithodemic. They are generally composed of intrusive, highly deformed or metamorphic rocks, determined and delimited on the basis of rock characteristics. Their boundaries may be sedimentary, intrusive, extrusive, tectonic or metamorphic. A formal classification of lithodemic units was presented in the 1983 North American Stratigraphic Code, comprising a lithodeme, which is comparable to a formation, a suite, which is roughly equivalent to a group, and a supersuite, comparable with a supergroup.

Lithodemic units are not included in the 1994 International Stratigraphic Guide, which instead regards intrusive igneous bodies and non-layered metamorphic rocks of undetermined origin as special cases within lithostratigraphy. The guide advises against using the term suite. The term complex (see below) is used loosely, a term that is less likely to be of use in Quaternary geology. Here it is recommended that the procedures advocated in the North American Stratigraphic Code are followed, with the modifications noted below.

What are the names of units in as block and field?

lithodemic rock units.

That gets us up to 10 "rock unit" code tables - are there more?

units in as block and field

Neither are used so can probably be ignored for now.

lithodemic doesn't need its own, I was just saying that series is part of this separate lithodemic system of rock unit names.

So we make code tables for these lithostratigraphic units:
group
formation
member
bed
flow
suite
other

There might be others that come up if we ever get a big geology collection with a lot of non-sedimentary samples, but for now I think just these are fine.

lithodemic doesn't need its own,

It does if we - you edited, that! So not one for lithodemic, but one for each element - one for Supercomplex and one for Lithodeme and ....

I know I'm flip-flopping all over the place, but right now I'm leaning towards:

  • what you said
  • ditto for chrono - ctSeries_Epoch, ctEon_Eonothem, etc.
  • ditto for bio
  • ditto for that thing we'll get in tomorrow

so we'll end up with a bunch of simple (value and definition columns) code tables, each with a hopefully-usable number of values in them.

Then Phase Two (new issue, can be prioritized) we'll add some sort of "code table metadata" which we can use to assert (and search) stuff we drag in from various webservices, ages or whatever else y'all want to add - anything that isn't precisely "specimen data" but can be derived from "specimen data" and is somehow useful to have around. We can also reevaluate the big picture at that point - this should have some usage by then, maybe we'll want to revive ctgiant_complicated_category-type-thing or something.

Reasonable????????

What happened to adding begin and end in https://github.com/ArctosDB/arctos/issues/2498#issuecomment-656293691?

I think it would be fine to split the table into

Series/Epoch
Eon/Eonothem
Erathem/Era
Era/Erathem
Stage/Age
System/Period
Substage/Subage

but I guess we still didn't communicate because the age values belong in the code table, not the assertion.

so

ctdsystem_period would be structured like this
Attribute value | Begin (Ma) | End (Ma) | Description
-- | -- | -- | --
Neogene | 23.03 | 2.58 | https://en.wikipedia.org/wiki/Neogene

ctdseries_epoch would be structured like this
Attribute value | Begin (Ma) | End (Ma) | Description
-- | -- | -- | --
Pliocene | 5.333 | 2.58 | https://en.wikipedia.org/wiki/Pliocene

Those dates are what you use to search across chronostrata (the begin and end dates of Pliocene are within or equal to the begin and end dates of Neogene, therefore someone who searches Neogene, would also get everything that is recorded as Pliocene). Realizing that the math is a bit backward, because these are really negative dates...
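The containment search described here could be sketched roughly like this in PostgreSQL. It is only an illustration: the table name demo_chronostrat and the columns term_rank, begin_ma and end_ma are made up, and the two rows just reuse the Neogene and Pliocene values from the tables above.

```sql
-- Hypothetical demo table; not the actual Arctos structure.
create temporary table demo_chronostrat (
    term varchar(60) not null,
    term_rank varchar(60) not null,
    begin_ma numeric not null,  -- older boundary, millions of years ago
    end_ma numeric not null     -- younger boundary, millions of years ago
);

insert into demo_chronostrat (term, term_rank, begin_ma, end_ma) values
    ('Neogene', 'System/Period', 23.03, 2.58),
    ('Pliocene', 'Series/Epoch', 5.333, 2.58);

-- Find every term whose range falls within (or equals) the searched term's range.
-- Ages count backward from the present, so "within" means
-- begin_ma <= the searched begin and end_ma >= the searched end.
select child.term, child.term_rank
from demo_chronostrat searched
join demo_chronostrat child
    on child.begin_ma <= searched.begin_ma
    and child.end_ma >= searched.end_ma
where searched.term = 'Neogene'
order by child.begin_ma desc;
```

With these two rows, a search for Neogene matches both Neogene itself and Pliocene, which is the "let the dates traverse the hierarchy" behavior without any parent/child links.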

because the age values belong in the code table, not the assertion.

On what basis do you say that? What I've proposed is functionally identical for a user - searching for "3" will find Pliocene in either case. I think NOT having them in the CT is probably more structurally representative of the data; it would make management easier, and it would make maintaining the code significantly easier.

On another topic, I don't like "Site Land Status." The Proper Case is not consistent with other CTs, and locality is explicitly about "land" ("place" anyway). How's "site status"? "land ownership" might be more clear but I'm not sure it's correct.

This would be a good time to clean up the terminology as well - I'll make a proposal in another comment.

arctosprod@arctos>> select geo_att_value,'documentation needed' from geology_attributes where geology_attribute='Site Land Status' group by geo_att_value;
         geo_att_value         |       ?column?       
-------------------------------+----------------------
 City                          | documentation needed
 USBOR                         | documentation needed
 NPS                           | documentation needed
 USACE                         | documentation needed
 USFWS                         | documentation needed
 Tribal                        | documentation needed
 NWR                           | documentation needed
 DOE                           | documentation needed
 Public                        | documentation needed
 County                        | documentation needed
 unknown                       | documentation needed
 Private                       | documentation needed
 DOD                           | documentation needed
 BLM                           | documentation needed
 Bankhead-Jones Land Use Lands | documentation needed
 USFS                          | documentation needed
 State                         | documentation needed
 Land Grant                    | documentation needed

I'd like to bring "Site Land Status" over as "site status", then run the code below, then build the code table from the cleaned values. Yay!?

update locality_attributes set attribute_value='city' where attribute_type='site status' and attribute_value='City';
update locality_attributes set attribute_value='US National Park Service' where attribute_type='site status' and attribute_value='NPS';
update locality_attributes set attribute_value='US Fish and Wildlife Service' where attribute_type='site status' and attribute_value='USFWS';
update locality_attributes set attribute_value='tribal' where attribute_type='site status' and attribute_value='Tribal';
update locality_attributes set attribute_value='public' where attribute_type='site status' and attribute_value='Public';
update locality_attributes set attribute_value='county' where attribute_type='site status' and attribute_value='County';
update locality_attributes set attribute_value='private' where attribute_type='site status' and attribute_value='Private';
update locality_attributes set attribute_value='US Bureau of Land Management' where attribute_type='site status' and attribute_value='BLM';
update locality_attributes set attribute_value='US Department of Defense' where attribute_type='site status' and attribute_value='DOD';
update locality_attributes set attribute_value='US Forest Service' where attribute_type='site status' and attribute_value='USFS';
update locality_attributes set attribute_value='state' where attribute_type='site status' and attribute_value='State';
update locality_attributes set attribute_value='US Bureau of Reclamation' where attribute_type='site status' and attribute_value='USBOR';
update locality_attributes set attribute_value='US Army Corps of Engineers' where attribute_type='site status' and attribute_value='USACE';
update locality_attributes set attribute_value='US National Wildlife Refuge System' where attribute_type='site status' and attribute_value='NWR';
update locality_attributes set attribute_value='US Department of Energy' where attribute_type='site status' and attribute_value='DOE';
update locality_attributes set attribute_value='land grant' where attribute_type='site status' and attribute_value='Land Grant';
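The same cleanup could also be driven from a single mapping list, which keeps old and new values side by side for review. A minimal sketch, assuming the attribute_type and attribute_value columns used in the updates above. The demo table (demo_site_status_attrs) is a stand-in for locality_attributes, and only a few of the mappings are repeated.

```sql
-- Hypothetical stand-in for locality_attributes, reduced to the columns used here.
create temporary table demo_site_status_attrs (
    locality_id bigint not null,
    attribute_type varchar(60) not null,
    attribute_value varchar not null
);

insert into demo_site_status_attrs (locality_id, attribute_type, attribute_value) values
    (1, 'site status', 'NPS'),
    (2, 'site status', 'City'),
    (3, 'site status', 'unknown');

-- One statement: each row is updated only if its value appears in the mapping.
update demo_site_status_attrs la
set attribute_value = m.new_value
from (values
    ('City', 'city'),
    ('NPS', 'US National Park Service'),
    ('BLM', 'US Bureau of Land Management')
) as m(old_value, new_value)
where la.attribute_type = 'site status'
and la.attribute_value = m.old_value;

-- Values without a mapping (here 'unknown') are left untouched.
select locality_id, attribute_value
from demo_site_status_attrs
order by locality_id;
```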

So we are going to have to enter the dates every time we create a stage/age attribute?

BTW a bunch of the stuff in https://github.com/ArctosDB/arctos/issues/2498#issuecomment-656293691 are regional terms... the way to find them is they have a relationship of child to "Regional Chronostratigraphy" or something like that - not sure I have that parent term completely correct.

So we are going to have to enter the dates every time we create a stage/age attribute?

No, what I've proposed is functionally identical from there as well. Enter "bla stage-age" and the numeric-bits follow automagically.

You'll need to enter those in a new "code table" (or something) when you create a new authority, so it'll be one extra click on what should be an exceedingly rare event, but in turn you get a framework where you can potentially enter much more data as well. It might even serve as a place for that "documentation-documentation" but I'm nowhere near sure of that yet.
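A rough sketch of how the ages could live outside the simple code table and still follow the term around. Every name here (demo_ctseries_epoch, demo_term_ages, demo_age_attrs, begin_ma, end_ma, and the attribute type 'series/epoch') is invented for illustration; this is a guess at the shape, not the actual implementation.

```sql
-- Simple code table: value and definition only.
create temporary table demo_ctseries_epoch (
    series_epoch varchar(60) primary key,
    description varchar not null
);

-- Hypothetical "code table metadata": entered once, when the authority term is created.
create temporary table demo_term_ages (
    term varchar(60) primary key,
    begin_ma numeric not null,
    end_ma numeric not null
);

-- Hypothetical stand-in for locality_attributes: the assertion carries only the term.
create temporary table demo_age_attrs (
    locality_id bigint not null,
    attribute_type varchar(60) not null,
    attribute_value varchar not null
);

insert into demo_ctseries_epoch values ('Pliocene', 'https://en.wikipedia.org/wiki/Pliocene');
insert into demo_term_ages values ('Pliocene', 5.333, 2.58);
insert into demo_age_attrs values (1, 'series/epoch', 'Pliocene');

-- Search by age: localities whose asserted term spans 3 Ma.
select la.locality_id, la.attribute_value
from demo_age_attrs la
join demo_term_ages m on m.term = la.attribute_value
where m.begin_ma >= 3
and m.end_ma <= 3;
```

Under these assumptions, whoever enters the attribute never types a date; the join supplies the numbers at search time.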

For this I agree that site status isn't a good description. These do indicate ownership or legal authority or something like that.

How about

landholder - entity that holds the estate in land with considerable rights of ownership. https://en.wikipedia.org/wiki/Land_tenure The value chosen for this attribute indicates the general landholder of the site at the time of collection.

Also, here are term definitions

Site Land Status | Site Land Status Term definition
-- | --
Bankhead-Jones Land Use Lands | https://en.wikipedia.org/wiki/Bankhead%E2%80%93Jones_Farm_Tenant_Act_of_1937
US Bureau of Land Management | Bureau of Land Management Property, specific property may be included in higher geography or listed in remarks
county | Property of County name listed in higher geography
US Department of Defense | Department of Defense Property, specific property may be included in higher geography or listed in remarks
US Department of Energy | Department of Energy Property, specific property may be included in higher geography or listed in remarks
tribal | Indigenous Property, specific property may be included in specific locality or listed in remarks
land grant | Land grants were made both to individuals and communities during the Spanish (1598–1821) and Mexican (1821–1846) periods of New Mexico's history. https://en.wikipedia.org/wiki/Land_grants_in_New_Mexico
NPS | National Park Service
US National Wildlife Refuge System | National Wildlife Refuge
private | Private ownership, owner may be listed in remarks
public | Public ownership, owner may be listed in remarks
state | Property of state listed in higher geography, specifics may be listed in remarks
US Army Corps of Engineers | US Army Corps of Engineers Property, specific property name may be listed in remarks
US Bureau of Reclamation | US Bureau of Reclamation Property, specific property may be listed in remarks
US Forest Service | US Forest Service Property, specific property may be listed in remarks or included in higher geography
US Fish and Wildlife Service | US Fish and Wildlife Property, specific property may be listed in remarks or included in higher geography
US National Park Service | US National Park Service Property, specific property may be listed in remarks or included in higher geography
city | Property of a city, city name and specific property may be listed in remarks or included in specific locality

No, what I've proposed is functionally identical from there as well. Enter "bla stage-age" and the numeric-bits follow automagically.

You'll need to enter those in a new "code table" (or something) when you create a new authority, so it'll be one extra click on what should be an exceedingly rare event, but in turn you get a framework where you can potentially enter much more data as well. It might even serve as a place for that "documentation-documentation" but I'm nowhere near sure of that yet.

OK, I just thought that having it right there with the terms would make it easier to change if needed, but this seems to be effectively the same thing.

I would like a different name for "group" - it's a SQL reserved word, and at best it's awkward to pass around and use as a column name, etc. There's just about a 100% chance it'll find a way to break things in and out of Arctos despite PG's capacity to (mostly) deal with that as a quoted identifier.

tentatively suggest lithostratigraphic_group as the column name and "lithostratigraphic group" as the key value.

I should probably back out and do whatever we do here to formation and bed (and whatever comes next) to be consistent. I've been building those with this pattern:

create table ctgeology_bed (
    bed varchar(60) not null,
    description varchar not null
);

Help!!

lithostratigraphic_group as the column name and "lithostratigraphic group"
lithostratigraphic_formation as the column name and "lithostratigraphic formation"

and so on, seem fine to me.
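For anyone wondering what the reserved-word problem looks like in practice, here is a small PostgreSQL sketch following the code table pattern shown above. The demo_ table names are made up; only lithostratigraphic_group comes from the proposal.

```sql
-- Fails with a syntax error, because GROUP is a reserved word:
-- create temporary table demo_ctgroup (group varchar(60) not null);

-- Works, but the column must then be double-quoted everywhere it is used:
create temporary table demo_ctgroup (
    "group" varchar(60) not null,
    description varchar not null
);
select "group" from demo_ctgroup;

-- The proposed column name needs no quoting anywhere:
create temporary table demo_ctlithostratigraphic_group (
    lithostratigraphic_group varchar(60) not null,
    description varchar not null
);
select lithostratigraphic_group from demo_ctlithostratigraphic_group;
```

The quoted form is legal, but any tool or script that forgets the quotes gets a syntax error, which is the sort of breakage the rename avoids.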

Done, makes them group nicely.

I'm moving on with lithodemic_suite unless someone has better ideas in the very near future.

lithostratigraphic group, formation, etc. is good with me! So is lithodemic_suite.

How about . . . landholder

I agree, use landholder. Land status is too vague.

OK, I just thought that having it right there with the terms would make it easier to change if needed, but this seems to be effectively the same thing.

I am fine with Dusty's solution.

"Regional Chronostratigraphy" or something like that - not sure I have that parent term completely correct.

I think we should call these "Informal Chronostratigraphy" with a code table description that goes something like this: "Regional and other informal chronostratigraphic units that have been described in published literature, but are not part of the formal International Chronostratigraphic Chart"

Informal is what is used in geology publications to describe terms like these.

I'm liking how the others group in the dropdowns, sorta wish I could say I planned it this way!

Clever suggestions for, e.g.:

Stage/Age
Series/Epoch
System/Period

chronostratigraphic_system_or_period makes me cringe a bit. (chronostratigraphic_system/period would definitely break stuff.)

Are a bunch of paleontologists going to show up with pitchforks and torches if we drop half of that? geological_period or chronostratigraphic_system or something of the sort? I think I can infer one from the other, right? (I'll go put up a "keep off the clover" sign just in case mentioning the idea is too much....)

"Informal Chronostratigraphy"

Maybe somehow flip that around to group with the rest - chronostratigraphic_informalities???

@dustymc we can cover up the code table names with text in the UI, right? (see #2836) I'm pretty sure no paleontologist will care what we call the code tables internally. How about this:

chronostrat_system_period
chronostrat_stage_age

chronostrat_informal

Yes, that's going to be a nice layer.

"Sorta" I think is the real answer - it'll probably be sorta-exposed somewhere, almost certainly in the API for example. What will be most visible is the name of the attribute itself, and I like to keep everything as predictable/similar as possible to the code table, and the code table column to the table, for my own sanity.

So I think that works out to something like

table name: ctchronostrat_system_period
table column: chronostrat_system_period
attribute name: chronostrat system period

the attribute name is least-constrained - I've got it under 60 characters, but even that's a bit flexible - want to unabbreviate in some way?

It would probably be preferable to have

table name: ctchronostrat_system_period
table column: chronostrat_system_period
attribute name: chronostratigraphic system/period

@Nicole-Ridgwell-NMMNHS

That's fine.

I don't particularly care about the attribute name, but I think you'll like having them grouped in the dropdown and a predictable "prefix" does that.

Screen Shot 2020-07-10 at 9 00 46 AM

OK, made a little edit above.

table name: ctchronostrat_system_period
table column: chronostrat_system_period
attribute name: chronostratigraphic system/period

This looks good to me!

Maybe somehow flip that around to group with the rest - chronostratigraphic_informalities???

table name: ctchronostrat_informal
table column: chronostrat_informal
attribute name: chronostratigraphy informal

These are all I can identify as "regional" - children of "Chronostratigraphy - Regional Terms," none of which has children of their own. Does that seem correct?

                        1684454 | Stage/Age       | Ionian
                        1685889 | Series/Epoch    | Upper Pennsylvanian
                        1685892 | Series/Epoch    | Middle Pennsylvanian
                        1685897 | Series/Epoch    | Lower Pennsylvanian
                        1685936 | Series/Epoch    | Upper Mississippian
                        1685937 | Series/Epoch    | Middle Mississippian
                        1685938 | Series/Epoch    | Lower Mississippian
                        5339882 | Eon/Eonothem    | Precambrian
                        5340142 | Stage/Age       | Chesterian
                        5340068 | Stage/Age       | Morrowan
                        5340173 | Series/Epoch    | Lower Cambrian
                        5340677 | Stage/Age       | Virgilian
                        5371011 | Substage/Subage | Smithian

If so, the values could go two ways:

Upper Pennsylvanian

with the assumption being that we don't need (presumably not-quite-correct, or formal) ranks on informal terms, or something like...

Upper Pennsylvanian (Series/Epoch)

under the assumption that those ranks are somehow useful, but not useful enough to spawn another series of "more-formal" tables.

Edit: all in ctchronostrat_informal

I'm guessing the second option, but @Nicole-Ridgwell-NMMNHS will say for sure.

Does that seem correct?

Upper/Middle/Lower Mississippian and Pennsylvanian are actually on the international chart - right before the geology table became inaccessible I was going to request these be moved.

The others all seem correct, but off the top of my head I notice at least one unit missing - Wolfcampian Series/Epoch

If so, the values could go two ways:

Yes, option 2.

I can't find our conversation, but all Wolfcampian was entered this way:

GEOLOGY_ATTRIBUTE_4 | GEO_ATT_VALUE_4 | GEO_ATT_DETERMINER_4 | GEO_ATT_DETERMINED_DATE_4 | GEO_ATT_DETERMINED_METHOD_4 | GEO_ATT_REMARK_4
-- | -- | -- | -- | -- | --
Series/Epoch | Cisuralian |   |   |   | Wolfcampian

So if we add Wolfcampian Series/Epoch, we should pull all of those out of remarks and create the attributes. (We can do this after all of Dusty's work is done.)
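Pulling those out of remarks might look something like the sketch below. It assumes several things not settled in this thread: that the migrated rows keep the remark in an attribute_remark column, that the migrated attribute type is named 'chronostratigraphic series/epoch', and that the new value follows the "Term (Rank)" form. The demo table is a stand-in for locality_attributes.

```sql
-- Hypothetical stand-in for locality_attributes after migration.
create temporary table demo_wolfcampian_attrs (
    locality_id bigint not null,
    attribute_type varchar(60) not null,
    attribute_value varchar not null,
    attribute_remark varchar
);

insert into demo_wolfcampian_attrs values
    (1, 'chronostratigraphic series/epoch', 'Cisuralian', 'Wolfcampian'),
    (2, 'chronostratigraphic series/epoch', 'Cisuralian', null);

-- Add an informal attribute wherever the remark carries the regional term.
insert into demo_wolfcampian_attrs (locality_id, attribute_type, attribute_value)
select locality_id, 'chronostratigraphy informal', 'Wolfcampian (Series/Epoch)'
from demo_wolfcampian_attrs
where attribute_value = 'Cisuralian'
and attribute_remark = 'Wolfcampian';

-- Review the result before touching the original remarks.
select locality_id, attribute_type, attribute_value, attribute_remark
from demo_wolfcampian_attrs
order by locality_id, attribute_type;
```

Only locality 1 gains the new attribute here; the original Cisuralian row and its remark are left in place so they can be checked and cleaned up separately.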

I grabbed my list of informal chrono, deleted per https://github.com/ArctosDB/arctos/issues/2498#issuecomment-656891392, inserted everything that wasn't in the pruned list, then put the remaining informal in ctchronostrat_informal.

There are no informal values in test so it's not testable, but I checked every step in prod and I'm as certain that my map is accurate as I can be.

Yes fixing the oddities after this is in a better structure would be very much appreciated, and should be a lot easier for all of us as well.

Sounds good!

So if we add Wolfcampian Series/Epoch, we should pull all of those out of remarks and create the attributes. (We can do this after all of Dusty's work is done.)

Added as a note for now in our migration issues.

I believe this is now all migrated in test, searchable in specimensearch, editable in editlocality, displayed in JSONLocality and on specimen detail.

I've still got about a million loose ends to tie up, both UI and structural. It's almost certainly possible to break "interesting" things (eg, delete used attributes from the control tables), and there are places where geology is still visible and locality attributes aren't, and various other states of "not done." Ignoring all of that for now, some confirmation that the existing UIs are functional and sane and that the migrated data isn't horribly mangled in some obvious way would be useful before I start spreading things around too much.
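One way to spot the "deleted a used value from the control table" kind of breakage is an orphan check like the one below. This is only a sketch: the demo tables stand in for ctchronostrat_system_period and locality_attributes, with the table, column, and attribute names agreed above, and the sample rows are invented.

```sql
-- Hypothetical stand-ins for the real code table and attribute table.
create temporary table demo_ctchronostrat_system_period (
    chronostrat_system_period varchar(60) primary key,
    description varchar not null
);
create temporary table demo_orphan_attrs (
    locality_id bigint not null,
    attribute_type varchar(60) not null,
    attribute_value varchar not null
);

insert into demo_ctchronostrat_system_period values
    ('Neogene', 'https://en.wikipedia.org/wiki/Neogene');
insert into demo_orphan_attrs values
    (1, 'chronostratigraphic system/period', 'Neogene'),
    (2, 'chronostratigraphic system/period', 'Quaternary');

-- Values in use that are missing from the code table.
select la.attribute_value, count(*) as uses
from demo_orphan_attrs la
left join demo_ctchronostrat_system_period ct
    on ct.chronostrat_system_period = la.attribute_value
where la.attribute_type = 'chronostratigraphic system/period'
and ct.chronostrat_system_period is null
group by la.attribute_value;
```

In this toy data the query reports Quaternary, because it is used by a locality but has no row in the code table.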

First up - missing term definitions. A lot of these have "pending"; does that mean the definitions exist and I just can't see them, or that we need definitions?

Before we implement, I'd like to have all terms defined so as soon as I can be sure of what needs a definition and what doesn't, I'll work on those definitions.

Searched by Stage/Age = Priabonian and got results (95 records), but yeah, locality looks crazy in the catalog record:

image

  • added landholder attribute to a locality, but unable to add biostratigraphic zone or informal chronostratigraphy (when selected, no box of choices appears in the value field).
  • TRS township, biostratigraphic zone, block are free text instead of using the code table

  • The TRS range, township and section code tables are not complete and I am unable to locate anything with these attributes

  • searched by Jagger Coal Bed and got results (27 records)

  • searched by Series/Epoch Pennsylvanian and got results (1890 records)

  • searched by Series/Epoch Pleistocene and got
    image

  • searched by System/Period Quaternary and got
    image

Definitions are best done after implementation when you can do it with the normal tools.

I've pulled some authorities from the data (eg after cleaning) and could have missed the definitions - if they exist somewhere I'll be able to pull them in. I'll try to get that in the scripts, I definitely won't drop anything that might have definitions in it.

Good, because a lot of these have definitions that our intern spent forever working on! I don't want to do all of that over....

But the authority tables in test are incomplete by a long shot...

locality looks crazy

Link?

unable to add biostratigraphic zone or informal chronostratigraphy (when selected, no box of choices appears in the value field).

There's no data in test so nothing in the code tables.

I need the text of your errors.

I'll look at the rest in a bit....

Message: ERROR: canceling statement due to user request Where: SQL statement "select array_to_json(array_agg(row_to_json(t))) from ( SELECT SPECIMEN_EVENT.SPECIMEN_EVENT_TYPE as ST, getPreferredAgentName(SPECIMEN_EVENT.ASSIGNED_BY_AGENT_ID) as AB, to_char(SPECIMEN_EVENT.ASSIGNED_DATE,'YYYY-MM-DD') as AD, getPreferredAgentName(SPECIMEN_EVENT.VERIFIED_BY_AGENT_ID) VB, SPECIMEN_EVENT.VERIFIED_DATE VD, --SPECIMEN_EVENT.SPECIMEN_EVENT_REMARK, SPECIMEN_EVENT.COLLECTING_METHOD CM, SPECIMEN_EVENT.COLLECTING_SOURCE CS, SPECIMEN_EVENT.VERIFICATIONSTATUS VS, SPECIMEN_EVENT.HABITAT HB, COLLECTING_EVENT.VERBATIM_DATE RD, COLLECTING_EVENT.VERBATIM_LOCALITY RL, --COLLECTING_EVENT.COLL_EVENT_REMARKS, COLLECTING_EVENT.BEGAN_DATE BD, COLLECTING_EVENT.ENDED_DATE ED, LOCALITY.SPEC_LOCALITY SL, CASE WHEN locality.DEC_LAT IS NULL THEN null ELSE locality.DEC_LAT || ',' || locality.DEC_LONG END CD, CASE WHEN locality.MAX_ERROR_UNITS IS NULL THEN null ELSE locality.MAX_ERROR_DISTANCE || ' ' || locality.MAX_ERROR_UNITS END CE, CASE WHEN locality.ORIG_ELEV_UNITS IS NULL THEN null ELSE locality.MINIMUM_ELEVATION || '-' || locality.MAXIMUM_ELEVATION || ' ' || locality.ORIG_ELEV_UNITS END EL, CASE WHEN locality.DEPTH_UNITS IS NULL THEN null ELSE locality.MIN_DEPTH || '-' || locality.MAX_DEPTH || ' ' || locality.DEPTH_UNITS END DP, LOCALITY.MAX_ERROR_DISTANCE, LOCALITY.MAX_ERROR_UNITS, LOCALITY.DATUM DM, --LOCALITY.LOCALITY_REMARKS, LOCALITY.GEOREFERENCE_SOURCE, LOCALITY.GEOREFERENCE_PROTOCOL, LOCALITY.LOCALITY_NAME, --decode(LOCALITY.WKT_POLYGON,NULL,'','data available') hasLocalityWKT, geog_auth_rec.HIGHER_GEOG HG, --decode(geog_auth_rec.WKT_POLYGON,NULL,'','data available') hasGeographyWKT, getLocalityAttributesAsJson(LOCALITY.LOCALITY_id) GY, getCollEvtAttrAsJson(COLLECTING_EVENT.COLLECTING_EVENT_id) EA from SPECIMEN_EVENT, COLLECTING_EVENT, LOCALITY, geog_auth_rec where SPECIMEN_EVENT.collecting_event_id=COLLECTING_EVENT.collecting_event_id and 
COLLECTING_EVENT.locality_id=locality.locality_id and locality.geog_auth_rec_id=geog_auth_rec.geog_auth_rec_id and not exists (select locality_id from locality_attributes where locality_attributes.locality_id=locality.locality_id and attribute_type='access' and attribute_value='private') and SPECIMEN_EVENT.COLLECTION_OBJECT_ID=colObjId ) t" PL/pgSQL function getjsoneventbyspecimen(bigint) line 5 at SQL statement
Detail:
Check the Arctos Handbook for more information on errors.

This message has been logged as 67C7C3EA-F093-4BD3-B88FA3D490CBD9D8 Please contact us with any information that might help us to resolve this problem. For best results, include the error and a detail description of how it came to occur in the Issue.

An error occurred while processing this page!
Message: ERROR: canceling statement due to user request Where: SQL statement "select array_to_json(array_agg(row_to_json(t))) from ( SELECT attribute_type as TY, concat_ws(' ',attribute_value,attribute_units) as VU, attribute_remark as RK, determination_method as MD, determined_date as DA, getPreferredAgentName(determined_by_agent_id) as DT from locality_attributes where LOCALITY_ID=lid ) t" PL/pgSQL function getlocalityattributesasjson(bigint) line 5 at SQL statement SQL statement "select array_to_json(array_agg(row_to_json(t))) from ( SELECT SPECIMEN_EVENT.SPECIMEN_EVENT_TYPE as ST, getPreferredAgentName(SPECIMEN_EVENT.ASSIGNED_BY_AGENT_ID) as AB, to_char(SPECIMEN_EVENT.ASSIGNED_DATE,'YYYY-MM-DD') as AD, getPreferredAgentName(SPECIMEN_EVENT.VERIFIED_BY_AGENT_ID) VB, SPECIMEN_EVENT.VERIFIED_DATE VD, --SPECIMEN_EVENT.SPECIMEN_EVENT_REMARK, SPECIMEN_EVENT.COLLECTING_METHOD CM, SPECIMEN_EVENT.COLLECTING_SOURCE CS, SPECIMEN_EVENT.VERIFICATIONSTATUS VS, SPECIMEN_EVENT.HABITAT HB, COLLECTING_EVENT.VERBATIM_DATE RD, COLLECTING_EVENT.VERBATIM_LOCALITY RL, --COLLECTING_EVENT.COLL_EVENT_REMARKS, COLLECTING_EVENT.BEGAN_DATE BD, COLLECTING_EVENT.ENDED_DATE ED, LOCALITY.SPEC_LOCALITY SL, CASE WHEN locality.DEC_LAT IS NULL THEN null ELSE locality.DEC_LAT || ',' || locality.DEC_LONG END CD, CASE WHEN locality.MAX_ERROR_UNITS IS NULL THEN null ELSE locality.MAX_ERROR_DISTANCE || ' ' || locality.MAX_ERROR_UNITS END CE, CASE WHEN locality.ORIG_ELEV_UNITS IS NULL THEN null ELSE locality.MINIMUM_ELEVATION || '-' || locality.MAXIMUM_ELEVATION || ' ' || locality.ORIG_ELEV_UNITS END EL, CASE WHEN locality.DEPTH_UNITS IS NULL THEN null ELSE locality.MIN_DEPTH || '-' || locality.MAX_DEPTH || ' ' || locality.DEPTH_UNITS END DP, LOCALITY.MAX_ERROR_DISTANCE, LOCALITY.MAX_ERROR_UNITS, LOCALITY.DATUM DM, --LOCALITY.LOCALITY_REMARKS, LOCALITY.GEOREFERENCE_SOURCE, LOCALITY.GEOREFERENCE_PROTOCOL, LOCALITY.LOCALITY_NAME, --decode(LOCALITY.WKT_POLYGON,NULL,'','data available') hasLocalityWKT, 
geog_auth_rec.HIGHER_GEOG HG, --decode(geog_auth_rec.WKT_POLYGON,NULL,'','data available') hasGeographyWKT, getLocalityAttributesAsJson(LOCALITY.LOCALITY_id) GY, getCollEvtAttrAsJson(COLLECTING_EVENT.COLLECTING_EVENT_id) EA from SPECIMEN_EVENT, COLLECTING_EVENT, LOCALITY, geog_auth_rec where SPECIMEN_EVENT.collecting_event_id=COLLECTING_EVENT.collecting_event_id and COLLECTING_EVENT.locality_id=locality.locality_id and locality.geog_auth_rec_id=geog_auth_rec.geog_auth_rec_id and not exists (select locality_id from locality_attributes where locality_attributes.locality_id=locality.locality_id and attribute_type='access' and attribute_value='private') and SPECIMEN_EVENT.COLLECTION_OBJECT_ID=colObjId ) t" PL/pgSQL function getjsoneventbyspecimen(bigint) line 5 at SQL statement
Detail:
Check the Arctos Handbook for more information on errors.

This message has been logged as CA02446F-37A0-41A4-A20077E94956A95F Please contact us with any information that might help us to resolve this problem. For best results, include the error and a detail description of how it came to occur in the Issue.

Also, just ran across one term that we did not discuss: Petrology, which is different from bio-, chrono- and lithostratigraphy.

Petrology - the branch of geology that studies rocks and the conditions under which they form. Petrology has three subdivisions: igneous, metamorphic, and sedimentary petrology. Igneous and metamorphic petrology are commonly taught together because they both contain heavy use of chemistry, chemical methods, and phase diagrams. Sedimentary petrology is, on the other hand, commonly taught together with stratigraphy because it deals with the processes that form sedimentary rock.

The term in test I found under Petrology was Kholmsk Suite (suite) which doesn't have a definition in test, so I am not really sure what to do with this. @Nicole-Ridgwell-NMMNHS ?

Errors are just symptoms of https://github.com/ArctosDB/internal/issues/58. Prod is faster and I don't have indexes yet so it may fix itself in some capacity, but it's still a problem that needs addressed. Turn some stuff off....

In search results, attributes appear like this
image

nice! but suggest it should be called "locality attributes" for consistency.

not as critical, but same goes for JSON locality
image

instead of gy perhaps ly?

unchecked all attributes and other stuff and re-tried Series/Epoch Pleistocene and got

An error occurred while processing this page!
Message: ERROR: canceling statement due to user request Where: PL/pgSQL function getgeographyterm(bigint,text) line 11 at assignment
Detail:
Check the Arctos Handbook for more information on errors.

This message has been logged as A27F3884-2B1A-432D-8CB710EA01DC848F Please contact us with any information that might help us to resolve this problem. For best results, include the error and a detail description of how it came to occur in the Issue.

In search results, attributes appear like this

there are places where geology is still visible

gy perhaps ly?

done

Message: ERROR: canceling statement due to user request

https://github.com/ArctosDB/internal/issues/58

drainage needs cached or removed

Did not log in, searched by Series/Epoch Pleistocene, and got results (27,048 records)

then searched by System/Period Quaternary and got results (40,608 records)

BUT

http://test.arctos.database.museum/guid/ALMNH:ES:115 which has the locality attribute Series/Epoch Pleistocene does not show up in the System/Period Quaternary results, although it should because Pleistocene is part of the Quaternary.
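One way this could work without a hierarchical code table is the date-range idea raised earlier in the thread: if each chronostratigraphic term carried its age range, a search on Quaternary could expand to every term whose range falls inside it. The sketch below is only an illustration of that idea, not how Arctos is implemented. The `ChronoTerm` class, the `TERMS` table, and the second record are hypothetical; the ages (in Ma) are the commonly cited ICS boundaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChronoTerm:
    """A chronostratigraphic term with its age range in millions of years (Ma)."""
    rank: str
    name: str
    older_ma: float    # start of the interval (larger number = older)
    younger_ma: float  # end of the interval


# Hypothetical stand-in for a code table that stores date ranges.
TERMS = {
    "Quaternary": ChronoTerm("System/Period", "Quaternary", 2.58, 0.0),
    "Pleistocene": ChronoTerm("Series/Epoch", "Pleistocene", 2.58, 0.0117),
    "Holocene": ChronoTerm("Series/Epoch", "Holocene", 0.0117, 0.0),
    "Pliocene": ChronoTerm("Series/Epoch", "Pliocene", 5.333, 2.58),
}


def terms_within(search_name):
    """Return the names of all terms whose range lies inside the searched term."""
    outer = TERMS[search_name]
    return sorted(
        t.name
        for t in TERMS.values()
        if t.older_ma <= outer.older_ma and t.younger_ma >= outer.younger_ma
    )


def search(records, search_name):
    """Return GUIDs whose attribute value is the searched term or is contained in it."""
    wanted = set(terms_within(search_name))
    return [guid for guid, value in records if value in wanted]


# (guid, chronostratigraphic attribute value); the second record is made up.
RECORDS = [
    ("ALMNH:ES:115", "Pleistocene"),
    ("EXAMPLE:Paleo:1", "Pliocene"),
]

print(terms_within("Quaternary"))
print(search(RECORDS, "Quaternary"))
```

With ranges stored on the terms, the Pleistocene record is returned for a Quaternary search and the Pliocene one is not, and nobody has to maintain parent/child links by hand.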

looks crazy

Gave it another column to sprawl into.

TRS township,

Screen Shot 2020-07-13 at 2 47 03 PM

biostratigraphic zone,

Screen Shot 2020-07-13 at 2 53 36 PM

block

deleted

I believe that everything you can access from the UI is now fully functional in test, and all geology_attribute references have been removed.

Note that the data in test are very incomplete - there are empty code tables, no definitions, etc., etc., etc.

The geology table still exists (and hasn't been touched); you can use it in WriteSQL to check the migration path. For example

http://test.arctos.database.museum/editLocality.cfm?locality_id=10996030

Screen Shot 2020-07-14 at 10 43 24 AM

Grab the locality_id and use it in Reports/WriteSQL...

SELECT * from geology_attributes where locality_id=10996030

Screen Shot 2020-07-14 at 10 44 07 AM
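To check more than one locality at a time, the same comparison could be expressed as a set difference: anything in the old table with no matching type/value pair in the new one did not migrate. The sketch below is a rough illustration only. It uses an in-memory SQLite database as a stand-in for the test database, the `geology_attribute` and `geo_att_value` column names are assumptions about the old table, and the rows are made up.

```python
import sqlite3


def unmigrated_rows(conn, locality_id):
    """Return (type, value) pairs present in geology_attributes but not in locality_attributes."""
    sql = """
        SELECT geology_attribute, geo_att_value
          FROM geology_attributes
         WHERE locality_id = ?
        EXCEPT
        SELECT attribute_type, attribute_value
          FROM locality_attributes
         WHERE locality_id = ?
    """
    return conn.execute(sql, (locality_id, locality_id)).fetchall()


# Stand-in tables with made-up rows; old-table column names are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE geology_attributes (
        locality_id INTEGER, geology_attribute TEXT, geo_att_value TEXT);
    CREATE TABLE locality_attributes (
        locality_id INTEGER, attribute_type TEXT, attribute_value TEXT);
    INSERT INTO geology_attributes VALUES
        (10996030, 'Series/Epoch', 'Pleistocene'),
        (10996030, 'formation', 'Prince Creek');
    INSERT INTO locality_attributes VALUES
        (10996030, 'Series/Epoch', 'Pleistocene');
""")

print(unmigrated_rows(conn, 10996030))
```

An empty result for a locality would mean every old geology row has a counterpart in locality_attributes. The `EXCEPT` query itself is standard SQL, so something similar should be usable from WriteSQL once the real column names are substituted.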

Break it.

This is ugly!
image
How hard to make it more human readable?

The TRS Section attribute is missing in this locality. Actually, appears to be missing everywhere. Take that back - found it here.

image

image

What's the plan for getting these merged?

image

Just want to make sure we don't set ourselves up for a lot of manual work.

Tried to add informal chronostratigraphy, but there is nothing to choose from, so I can't test it.

image

Well, it seems to work for me, and the localities I checked now have all of the same attributes as they do in geology attributes.

but the private locality details are still showing up at the top of the record instead of the public locality info when I am viewing as a public user.

image

http://test.arctos.database.museum/guid/NMMNH:Paleo:3006

That's ultimately caused by my test environment. It's not a problem in prod, I manually refreshed that one...

 select update_flat_row(collection_object_id) from flat where guid='NMMNH:Paleo:3006';

and it should be clean

seems to work for me

I can't break it. How to proceed?

@Nicole-Ridgwell-NMMNHS can you mess around with it a bit?

@mvzhuang do you want to look at this? Call me if you have questions!

Anyone at Alaska use this stuff?

@mbprondzinski @aklompma we are moving geology attributes to locality attributes - no change in the data, but the geology table is going to be replaced with a bunch of tables. If you guys want, we can meet up to walk through it.

@aklompma can we meet after our Taxonomy meeting and have Teresa walk us through the change?

Me too.

I am cool with hanging out after taxonomy to walk through this! @dustymc if you are available at 3PM MDT just in case we have technical questions, that would be great!

@Nicole-Ridgwell-NMMNHS @dperriguey @mvzhuang

I tried to log in to the test site, did password reset, but it didn't work?

@Nicole-Ridgwell-NMMNHS I think the mailserver is on strike - use the lost password form again, it should barf out your password, let me know when you have it and I'll turn that back off.

Still not working - I get the email but the temp password doesn't work.

Oh - you were locked, the last PWD you got should work.

In now, thanks!

Can I get edit locality access in the test database?

done

So far, I'm in agreement with what @Jegelewicz has pointed out and have not noticed any additional issues. I like how the locality attributes now self-sort when looking at the specimen record.
