Arctos: Locality and Locality Attribute Loader/Unloader

Created on 27 Jul 2020 · 44 Comments · Source: ArctosDB/arctos

Status: just need locality attribute unloader


Issue Documentation is http://handbook.arctosdb.org/how_to/How-to-Use-Issues-in-Arctos.html

Is your feature request related to a problem? Please describe.

Need a way to load and unload localities and locality attributes.

Describe what you're trying to accomplish

Avoid https://github.com/ArctosDB/data-migration/issues/391#issuecomment-663989374

Describe the solution you'd like

4 loaders

  • create locality
  • delete locality
  • create locality attribute
  • delete locality attribute

Describe alternatives you've considered

SQL-load as necessary

Additional context

This seems to be increasingly common for new collections

Priority

Help @ArctosDB/arctos-working-group-officers

This could be done incrementally, with the first (should be quick) step being

  1. a single-user no-pre-checks tool with these fields, which creates named localities
 spec_locality         | character varying(255)      |           |          | 
 dec_lat               | numeric(12,10)              |           |          | 
 dec_long              | numeric(13,10)              |           |          | 
 minimum_elevation     | double precision            |           |          | 
 maximum_elevation     | double precision            |           |          | 
 orig_elev_units       | character varying(30)       |           |          | 
 min_depth             | double precision            |           |          | 
 max_depth             | double precision            |           |          | 
 depth_units           | character varying(30)       |           |          | 
 max_error_distance    | double precision            |           |          | 
 max_error_units       | character varying(30)       |           |          | 
 datum                 | character varying(255)      |           |          | 
 locality_remarks      | character varying(4000)     |           |          | 
 georeference_source   | character varying(4000)     |           |          | 
 georeference_protocol | character varying(255)      |           |          | 
 locality_name
 wkt_media_id          | bigint                      |           |          | 

and one which creates attributes against named localities

                                                     Table "public.locality_attributes"
         Column         |         Type          | Collation | Nullable |                              Default                               
------------------------+-----------------------+-----------+----------+--------------------------------------------------------------------

 locality_NAME
 determined_by_agent_id | bigint                |           |          | 
 attribute_type         | character varying(60) |           | not null | 
 attribute_value        | character varying     |           | not null | 
 attribute_units        | character varying(60) |           |          | 
 attribute_remark       | character varying     |           |          | 
 determination_method   | character varying     |           |          | 
 determined_date        | character varying(22) |           |          | 



Location/find/manage named can be used to add and remove temp_ names.
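
Since the first-pass tool skips pre-checks, a file can be sanity-checked locally before upload. The sketch below is one way to do that in Python, assuming the CSV headers match the column names listed above. The length limits and numeric columns are copied from that listing; `check_row` and `check_csv` are hypothetical helper names, not part of Arctos. It does not check units against code tables, which would need access to Arctos itself.

```python
import csv
import io

# Maximum lengths copied from the column listing above (character varying(n)).
MAX_LENGTHS = {
    "spec_locality": 255,
    "orig_elev_units": 30,
    "depth_units": 30,
    "max_error_units": 30,
    "datum": 255,
    "locality_remarks": 4000,
    "georeference_source": 4000,
    "georeference_protocol": 255,
}

# Columns listed above as numeric or double precision.
NUMERIC_FIELDS = (
    "dec_lat", "dec_long", "minimum_elevation", "maximum_elevation",
    "min_depth", "max_depth", "max_error_distance",
)


def check_row(row):
    """Return a list of problems found in one CSV row (a dict of strings)."""
    problems = []
    for field, limit in MAX_LENGTHS.items():
        value = row.get(field) or ""
        if len(value) > limit:
            problems.append(f"{field}: {len(value)} characters, limit is {limit}")
    for field in NUMERIC_FIELDS:
        value = (row.get(field) or "").strip()
        if value:
            try:
                float(value)
            except ValueError:
                problems.append(f"{field}: {value!r} is not a number")
    for field, value in row.items():
        # Tabs, newlines and other control characters count as non-printing.
        if isinstance(value, str) and not value.isprintable():
            problems.append(f"{field}: contains non-printing characters")
    return problems


def check_csv(text):
    """Map 1-based data row numbers to their problems; clean rows are omitted."""
    report = {}
    for number, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        problems = check_row(row)
        if problems:
            report[number] = problems
    return report
```

An empty report only means the file passed these local checks; the database may still reject rows for reasons this sketch knows nothing about.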

Component Loader Function-DataEntry/Bulkloading Function-Locality/Event/Georeferencing Priority-High

All 44 comments

no-pre-checks

Except the expected stuff like, units must match code table, no non-printing characters, etc?

Except the expected stuff like,

I think there are two options:

  1. I can probably sneak something with no pre-checks through ~this week. If you do any of that stuff, the load will just (perhaps cryptically) fail, but with careful data preparation you can load stuff. That "minimal, then evolve as needed" approach has worked well in the past, but perhaps Arctos has outgrown it.

  2. We do this properly from the start, in which case it needs prioritized in light of the 80 other 'next task' Issues, https://github.com/ArctosDB/arctos/issues?q=is%3Aopen+is%3Aissue+milestone%3A%22Next+Task%22.

(1) is just me trying to cheat https://github.com/ArctosDB/arctos/issues/2346 and maybe we should just not do that, but if loading your data is a priority (how to prioritize things in other repositories is another issue-issue...) then I'd rather do it in a way that produces some reusable code, rather than some script that'll take about as much time to develop and then never be used again.

We do this properly from the start, in which case it needs prioritized in light of the 80 other 'next task' Issues, https://github.com/ArctosDB/arctos/issues?q=is%3Aopen+is%3Aissue+milestone%3A%22Next+Task%22.

Yes, please. I'm not in a HUGE rush - we could go a week or two before it becomes a roadblock.

That list is overwhelming, but I would say that you follow whatever we laid out at the last AWG meeting, then proceed to fix bugs, priority critical, priority high, priority normal, no priority....

Yes, I am excited for this to happen! It will be extremely useful.

There's a locality-loader in test.

Woo hoo! - But be patient - I may not get to test it until Monday or Tuesday.....

Got to this - now waiting to see what happens.

image

Loaded stuff yesterday afternoon. It is still sitting there.

image

I have some observations.

  1. The plan above is to have a separate locality attribute loader, but it sure would be easier if I could load them with the locality. Not saying I hate the plan, just saying additional steps is more work. In any case, a locality attribute bulkloader will be needed for the times when attributes get added based upon new information.
  2. When I add a locality manually, I am able to enter coordinates in formats other than decimal degrees and have them converted to decimal degrees. I think this should be part of this tool as well, otherwise I will have to spend a bunch of time converting stuff before I load anything.
  3. The main bulkloader uses higher_geog, but this one uses higher_geography. Let's be consistent, whatever happens.
  4. A drop-down menu of options here would be useful as well as definitions of the available options.

image

  5. Instead of the run-together "check all check none" text, perhaps a couple of buttons?

Also, I will renew my offer to help with the UI stuff, but I would need to be able to see the test code and the ability to make edits that we could review. This tool is going to be super-helpful, but the current set-up and visual appearance are not easy for me to understand, so I doubt it will be super-intuitive to anyone else.

I would like to use this opportunity to set up a standard tool page structure so that all of these tools appear similar to users:

TOOL NAME (this should always be at the top of the page)

What it does - or link to documentation

See list of fields (can we make this "click to expand"?)

get a template

Load csv

After data is loaded, you get the "review" page with options to:

TOOL NAME VALIDATION

Delete it all
Validate
Delete checked items
Load
Download

YOUR DATA HERE

If things are going to autoload upon validation, then we can eliminate the "load", but hopefully this gives you an idea of where I am going? I'd be happy to try to mock this up in test with this tool - I just don't know where to find the test code or if anyone really wants me messing around with it....

Absolutely agree with standardized tool UI and vocab. This approach looks good to me.

now waiting

https://github.com/ArctosDB/arctos/issues/2974#issuecomment-667240183

coordinates .... formats...part of this tool as well,

Hmmm. Lots of complexity, a big step away from a "just loads" model, but I do see the point, sorta. Separate coordinate-converter? The stuff Arctos converts is just math, should be pretty easy to do that in Excel too. Also of note: the other things that convert coordinates either plonk them on a map (editlocality) or have a verbatim-place for them (specimen data entry). Blindly converting is a little scary for me for some reason.
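
For anyone doing the conversion before upload, the math mentioned above is small. Here is a minimal Python sketch, assuming coordinates arrive as degrees, minutes and seconds plus a hemisphere letter. `dms_to_decimal` is a hypothetical helper, not an Arctos function, and it does not try to parse free-text coordinate strings.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, direction="N"):
    """Convert degrees/minutes/seconds and a hemisphere letter to decimal degrees."""
    hemisphere = direction.strip().upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"direction must be N, S, E or W, got {direction!r}")
    if degrees < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError("degrees must be >= 0; minutes and seconds must be in [0, 60)")

    value = degrees + minutes / 60 + seconds / 3600

    # Latitudes stop at 90 degrees, longitudes at 180.
    limit = 90 if hemisphere in ("N", "S") else 180
    if value > limit:
        raise ValueError(f"{value} exceeds {limit} degrees for direction {hemisphere}")

    # South and west are negative in decimal degrees.
    return -value if hemisphere in ("S", "W") else value
```

Degrees with decimal minutes work by leaving seconds at zero, for example `dms_to_decimal(64, 50.4, 0, "N")`. Keeping the verbatim coordinates in a separate column of the working file is one way to address the worry about blind conversion.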

higher_geog

Wilco

drop-down

There are an infinite number of things that can be communicated there.

definitions

Autoload autoloads. Other stuff don't.

buttons

wilco

see the test code

https://github.com/ArctosDB/PG

edits

You'll need a TACC account so you can pull it to the server.

set up a standard tool page

Oh please - new issue I think, more eyes is better eyes.

TOOL NAME (this should always be at the top of the page)

yup

What it does - or link to documentation

yup

See list of fields (can we make this "click to expand"?)

get a template

Load csv

Needs discussion. I like those separate, especially for something where 99% of the usage will be "approving" student records and such (and some/maybe-most of these will fit that mold). But yea all that stuff exists in some predictable place.

After data is loaded, you get the "review" page with options to:

TOOL NAME VALIDATION

Not really validation in this model - more like TOOL NAME TOOLS...

Delete it all

Yea no, deleting stuff you can't see inevitably gets me yelled at, and this just became a multi-user tool. Deleting the stuff that someone uploaded after your page had loaded and before you clicked the button would be 'interesting....'

Validate

That's not in the model. It would be a tremendous amount of extra work, would involve a lot more interaction (so complicate everything for users), and would still be imperfect. I'm pretty enthusiastic about at least trying the go/nogo approach.

Delete checked items
Load
Download

YOUR DATA HERE

If things are going to autoload upon validation, then we can eliminate the "load", but hopefully this gives you an idea of where I am going? I'd be happy to try to mock this up in test with this tool - I just don't know where to find the test code or if anyone really wants me messing around with it....

wants me messing around

Good question, short answer is "yes" for me, longer answer is that the way I develop might make that sort of a pain. At least to get started I could clone whatever you want to mess with or something.

Is the locality attribute bulkloader in process? For me, the locality bulkloader is useless without a bulkloader for the attributes as well.

in process?

Unless someone has some major objection to the locality and identification loader format, like NOW, I should have something for you to test this afternoon, and it should go to production tonight.

No way to combine these in a single form / interface?

combine these in a single form / interface?

Not without limitations, which would be the same as any other denormalization, plus whatever that would mean for loading additional attributes against existing localities. That'd all add up to greatly increased complexity, which seems to be a central feature in complaints about loaders and so something I think we should aggressively avoid.

Suggest this page include:





Bulkload Locality Tool

        <p>
            This tool allows Arctos operators with the <a href="https://arctos.database.museum/Admin/user_roles.cfm">Manage Collection</a> role to review localities created as data entry extras during object record data entry, or localities uploaded directly using this tool. You may manage data for users in your collection(s) with this form.
        </p>
        <p>
            If you wish to use this tool to load localities, visit <a href="bulkloadLocality.cfm?action=ld">Load from CSV</a>, where you will also find field documentation and a template.
        </p>
        <p>
            The table below includes sets of localities that have been created but require review and approval. Use the text links to take the following actions:
        </p>

  • Change status: review individual localities, flag data for further review, or approve it to load. Managing status is limited to #recordLimit# records; you may need to use status to organize the data into manageable chunks.
  • Get csv: useful for data that has errors. Download the csv, delete the data from the tool, and re-upload corrected data.
  • Delete: this will remove the localities from the tool. It is advisable to download the csv before deleting anything from this form.

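In-process data

 User       | Ctl                           | Bits-n-Pieces
------------+-------------------------------+------------------------------------------------------------------------
 #username# | change status, get CSV, delete | select status,c from usrs where username='#username#' order by status

 Status   | Count | Ctl
----------+-------+--------------------------------
 #status# | #c#   | change status, get CSV, delete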
    

    I think adding bullets to the options makes them more visible. I have stuff for the other two pages - but I need to go eat something....

    Thanks! Some of that's implemented. I didn't put the CSS in, that should be done site-wide. https://github.com/ArctosDB/arctos/issues/2893 is ideal or https://github.com/ArctosDB/PG/blob/master/includes/style.css is handling that for now.

    Suggest this page include:





    Bulkload Locality Tool

    Approve or flag records

    The localities below can be set to load by changing their status to autoload (or any text that begins with "autoload"). You may also flag localities for further review by changing their status to anything other than autoload. Loading may happen at any time, including while records are being reviewed, so change status to autoload% with care. Records are deleted from this tool as they load.

                        <p>
                Localities that do not pass data quality triggers will not load and will continue to appear on the main <a href="http://test.arctos.database.museum/tools/bulkloadLocality.cfm">Bulkload Locality Tool page</a>. Localities with errors can be downloaded from there for correction and re-upload.
            </p>               
                        <ul>
        <li>
            Return to the <a href="bulkloadLocality.cfm">Main Locality Bulkload Tool Page</a>
        </li>
    </ul>
         <p>
                You can use the quick status change buttons to do the following:
         </p>
          <ul><li> Check None: uncheck all available localities </li>
         <li> Check all: check all available localities </li>
         <li> Check all-->status autoload: check all available localities and set their status to autoload </li>
         <li> Status-->autoload: enter autoload in the change status to box (you will need to manually check some localities and select "Change status for checked records" to change the status of any localities). </li>
          </ul>
    

    Enter "autoload" to load the selected localities. Enter any other term you wish to use to organize data or flag localities for further review.

         <p>
        Change Status for checked records to
    </p>
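
The status rule described above is a simple prefix test. A minimal Python sketch, assuming the comparison is case-sensitive and that each row carries a `status` field (neither is confirmed in this thread); `will_autoload` and `split_by_status` are hypothetical names:

```python
def will_autoload(status):
    """True when a status would be picked up for loading (begins with "autoload")."""
    return status is not None and status.startswith("autoload")


def split_by_status(rows):
    """Separate rows that would load from rows held back for review."""
    loading, held = [], []
    for row in rows:
        target = loading if will_autoload(row.get("status")) else held
        target.append(row)
    return loading, held
```

Under these assumptions a status such as `autoload_batch2` still loads, while `review` or an empty status keeps the row in the tool.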
    

    https://github.com/ArctosDB/PG/blob/master/includes/style.css is handling that for now.

    Whew - that is a hot mess. I see this:

    h1 {
        font-size: 2em;
        font-weight: bold;
    }

    h2 {
        font-size: 1.6em;
        font-weight: bold;
    }

    Will it work for this tool page if you put

    Bulkload Locality Tool

    there?

    Can we add other options (h3, h4, etc?)

    OK and last but not least this page.
    h1 {
        font-size: 2em;
        font-weight: bold;
    }

    h2 {
        font-size: 1.6em;
        font-weight: bold;
    }

    Bulkload Locality Tool

    Load localities

    This tool allows Arctos operators to load localities separately from catalog records or events. Localities loaded using this tool will appear on the main Bulkload Locality Tool page. From there they can be approved for load or flagged for further review.

                <div class="importantNotification">
              <ul>
                         <li>This form will happily accept and create duplicates. Proceed with caution.</li>
                  </ul>
                  <ul>
                          <li>It is advisable to keep a copy of any data uploaded here until you have confirmed successful completion.</li>
                   </ul>
    
    • You can load with status set to autoload, in which case Arctos will load all localities that pass the data quality triggers with no possibility of review. You can also set other statuses to help group localities for later review. ANY status other than one that begins with "autoload" will result in localities available for review on the main Bulkload Locality Tool page.
                   </div>
          <p>If you are not ready to load a comma-delimited text file (csv) you can return to the <a href="bulkloadLocality.cfm">Main Locality Bulkload Tool Page</a>
           </p>
          <p>If you need a template to prepare a comma-delimited text file (csv) for this tool, you can <a href="bulkloadLocality.cfm?action=makeTemplate">get a template here</a>.
          </p>
        <p>
    If you have your comma-delimited text file (csv) prepared with column headings spelled exactly as below, you can load it below.
        </p>
    

    Any chance I could convince you to put the "Browse" and "upload this file" buttons above the grid of fields for the csv?

    This is in prod.

    • create locality - is done
    • delete locality - doesn't seem necessary, Arctos automation does that.
    • create locality attribute - done
    • delete locality attribute - todo

    Also need to integrate https://github.com/ArctosDB/arctos/issues/2967#issuecomment-668228080

    @Jegelewicz I made your suggested changes and added

    /* inlinedocs styles "this is how this form works" sorts of information. */
    
    inlinedocs{
        border:2px solid green;
        margin:1em;
        padding:1em;
    }
    

    to the main CSS file.

    (Re)styling headers should probably be a dedicated issue; it should propagate to this, whatever we do or don't do, but will also affect lots of other stuff.

    Let me know if anything else needs changed, I'll use http://test.arctos.database.museum/tools/bulkloadLocality.cfm?action=ld as a template for all component-loaders once we're happy with it.

    There are recommendations for this page here.

    @Jegelewicz they're in test, not prod

    Used the locality attribute loader to load up 11K plus biochron attributes. YAY

    from https://github.com/ArctosDB/arctos/issues/2987#issuecomment-687181571

    @dustymc all locality attributes in this file need to be DELETED. Let me know if you have questions.

    Can I ignore anything? Do I need to match all columns in your data with all columns in the data (which will handle any situation, but require a great deal of precision in preparing the file), or should I try to match only on type+value+units, or ????

    Or we could make this a two-step process, where a file like this can be used to find locality_attribute_id and the actual unloader takes only that.

    Basically, how should the tool work?

    Well, these were downloaded directly from Arctos, so they should match exactly in all fields. I think that this should be true for a tool that can be used to remove locality attributes, as there may be two of the same attribute and I only want to remove the one without the remark or determined by XYZ.

    As for using the ID, that would be the most precise. So if the SQL I used to get these was modified to include the attribute ID in the results that would work. Of course any time someone wants to delete a bunch of attributes, the SQL will probably be different....

    BUT that could be remedied with a tool for searching locality attributes to retrieve localities using them.

    Just realized that dates in that file aren't formatted correctly.

    Biostrata to delete.zip

    SQL

      locality.locality_name,
      locality_attributes.locality_attribute_id,
    ...
    

    probably be different.

    I think that would have to be part of the form, or something of the sort. Anyway that's seeming more complex than I want to deal with for something that probably won't get much use at all, so let's go with value-matching for now.
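
Value-matching could work roughly like the Python sketch below: index the existing attributes by every compared column, then look up each row of the delete file. The column list is an assumption pieced together from the table description and the `locality_attribute_id` column mentioned above, not the actual form's logic, and `find_attribute_ids` is a hypothetical name. Rows that match more than one attribute are reported rather than guessed at, since two otherwise identical attributes are possible.

```python
# Columns compared when matching; this exact set is an assumption.
MATCH_FIELDS = (
    "locality_name",
    "attribute_type",
    "attribute_value",
    "attribute_units",
    "attribute_remark",
    "determination_method",
    "determined_date",
)


def match_key(row):
    """Build a comparison key; missing and empty values are treated as equal."""
    return tuple((row.get(field) or "").strip() for field in MATCH_FIELDS)


def find_attribute_ids(existing, requested):
    """Return (matched ids, unmatched rows, ambiguous (row, ids) pairs)."""
    index = {}
    for row in existing:
        index.setdefault(match_key(row), []).append(row["locality_attribute_id"])

    matched, unmatched, ambiguous = [], [], []
    for row in requested:
        ids = index.get(match_key(row), [])
        if len(ids) == 1:
            matched.append(ids[0])
        elif ids:
            # More than one attribute fits; a person has to choose.
            ambiguous.append((row, ids))
        else:
            unmatched.append(row)
    return matched, unmatched, ambiguous
```

Only the `matched` ids would be safe to pass on to a delete step; the other two lists are for review.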

    There's a very crude form at https://arctos.database.museum/tools/bulkUnLoadLocalityAttribute.cfm. It only goes through the "find attrID" stage at the moment. You can start over or jump in at https://arctos.database.museum/tools/bulkUnLoadLocalityAttribute.cfm?action=checkConfirm - please spot-check a few, then let me know when action=checkConfirm contains the stuff you want to delete and I'll add the also-crude scary final step.

    Looks like these will all work

    image

    BTW - I ran the entire file.

    I did this, because I'm paranoid

    arctosprod@arctos>> select current_timestamp;
           current_timestamp       
    -------------------------------
     2020-09-04 12:17:13.255768-05
    (1 row)
    
    Time: 0.763 ms
    arctosprod@arctos>> create table temp_cache.locality_attrs_20200904 as select * from locality_attributes;
    SELECT 284990
    
    

    and there should be a shiny new and completely untested link at the bottom of checkConfirm - might be a good idea to try it out with a few first.

    Let me know if this works for now and I'll patch it back into test, then we can de-crude things as time allows.
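
The snapshot-before-delete habit shown above can be rehearsed outside production. This Python sketch uses an in-memory SQLite database as a stand-in for PostgreSQL (an assumption made purely so the example runs anywhere); `snapshot_table` is a hypothetical helper and the table contents are made up.

```python
import sqlite3


def snapshot_table(conn, source, snapshot):
    """Copy every row of `source` into a new table `snapshot`; return the row count."""
    for name in (source, snapshot):
        # Table names cannot be bound as parameters, so only plain identifiers are allowed.
        if not name.isidentifier():
            raise ValueError(f"unsafe table name: {name!r}")
    conn.execute(f"CREATE TABLE {snapshot} AS SELECT * FROM {source}")
    (count,) = conn.execute(f"SELECT count(*) FROM {snapshot}").fetchone()
    return count


# Made-up data standing in for the real locality_attributes table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE locality_attributes (locality_attribute_id INTEGER, attribute_type TEXT)"
)
conn.executemany(
    "INSERT INTO locality_attributes VALUES (?, ?)",
    [(1, "biochron"), (2, "biochron"), (3, "biochron")],
)
copied = snapshot_table(conn, "locality_attributes", "locality_attrs_backup")
```

Comparing the returned count with the source table's count before deleting anything is a cheap confirmation that the copy is complete.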

    Not sure what happened, but tried to load a small test file (first few from the big file). After selecting upload got this

    image

    Needs determiner name mapped to attribute_determiner, not ID

    Worked like a charm.

    Cool, it's patched back into the mainstream, I'll add it to the menus, this issue can be reprioritized.

    Well, spoke too soon. So somehow the attribute determiner is hosing up the validation

    image

    The first one has a determiner of Thomas E. Williamson, which is indeed agent ID 21319446, but of course, 21319446 doesn't equal "Thomas E. Williamson". Make sense?

    image

    Change

    locality_attributes.determined_by_agent_id,
    

    to something like

    getPreferredAgentName(locality_attributes.determined_by_agent_id) as attribute_determiner,
    

    I updated the temp table in case that's useful

    arctosprod@arctos>> update cf_temp_locality_attribute_unloader set attribute_determiner=getPreferredAgentName(attribute_determiner::bigint);
    UPDATE 13708
    Time: 116.257 ms
    arctosprod@arctos>> select distinct attribute_determiner from cf_temp_locality_attribute_unloader;
     attribute_determiner 
    ----------------------
     Roger Mongold
     Lloyd Chino
    
     Adrian P. Hunt
     Peter K. Reser
     Paul L. Sealey
     unknown
     Natal V. D'Andrea
     Tim Bartos
     Thomas E. Williamson
     Renee Anderson
     Gary S. Morgan
     Spencer G. Lucas
     William Cotton
     Ian Jones
     Michael Bircheff
     Phil Bircheff
    (17 rows)
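
The same ID-to-name mapping can be applied to a download before it goes to the unloader. A minimal Python sketch, assuming the file carries `determined_by_agent_id` and that a lookup of preferred names is available (here a plain dict; in Arctos the name comes from `getPreferredAgentName`). `with_determiner_names` is a hypothetical helper.

```python
def with_determiner_names(rows, preferred_names):
    """Return copies of rows with determined_by_agent_id replaced by attribute_determiner.

    preferred_names maps integer agent IDs to preferred agent names.
    An unknown agent ID raises KeyError instead of being silently dropped.
    """
    converted = []
    for row in rows:
        new_row = dict(row)  # leave the caller's row untouched
        raw_id = new_row.pop("determined_by_agent_id", None)
        if raw_id in (None, ""):
            # Attributes without a determiner stay blank.
            new_row["attribute_determiner"] = ""
        else:
            new_row["attribute_determiner"] = preferred_names[int(raw_id)]
        converted.append(new_row)
    return converted
```

With the one pairing given in this thread, `{21319446: "Thomas E. Williamson"}`, a row carrying agent ID 21319446 comes out with that name in `attribute_determiner`.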
    

    This works well - I am going to work on the visual appeal - probably later in October.

    Excellent timing - see https://github.com/ArctosDB/arctos/issues/2974#issuecomment-693522569 for visual appeal. I'll post the start of a template there in a bit.

    We still need a locality unloader? Otherwise I think this can be closed.

    I see no reason for a locality unloader - there's a locality unnamer, and unnamed+unused localities unload themselves.
