The "data entry extras" functionality isn't as good as it could be: loading large batches of various components (identifiers, parts, identifications, etc.) causes timeouts and then problems/confusion, the "claim" process for managing data entered via "data entry extras" causes problems, etc. Let's fix it.
Very tentative suggestions, which may or may not hold up to reality:
Replace the "claim" functionality with an ability to change status in "not-yours" records (which would necessarily come with access to records created by users with whom you share collections)
A normal load would then be
"Approving" records loaded by you or your students/techs/associates, via any process including data entry extras, would be
I think that would be a significant simplification in both the code and the user experience. "Manage your..." might come with a "pick users" option (a slight increase in complexity), but most of the rest of the complexity (claim, find guid, etc.) that's been introduced for various reasons could be removed.
This has some urgency: I'd like to use https://github.com/ArctosDB/arctos/issues/727 as a proof of concept, so I'm adding scary labels and will interpret a lack of immediate objections as enthusiastic approval.
I'm up for trying this method. Anything we can do to simplify and make the process consistent across tools would be nice.
The basics of this are running in test with bulkload identifications. I think it's worked out even better than anticipated, but timely feedback would be appreciated.
Replace the "claim" functionality with an ability to change status in "not-yours" records
The form is limited to manage_collection in order to safely (I hope!) accommodate this, and there's a new "shares collection" function which DOES NOT exclude users with locked accounts (so you can load things created by former techs & etc.).
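A minimal sketch of the "shares collection" idea described above, in Python rather than the actual database function. The `Operator` class, its fields, and the function name are hypothetical stand-ins and not the Arctos schema; the only point illustrated is that locked accounts are not filtered out.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    # Hypothetical stand-in for an Arctos user; not the real schema.
    username: str
    collections: set = field(default_factory=set)
    locked: bool = False


def shares_collection(me, others):
    """Return usernames of operators who share at least one collection with `me`.

    Locked accounts are deliberately NOT excluded, so records left behind
    by former techs stay manageable.
    """
    return sorted(
        o.username
        for o in others
        if o.username != me.username and me.collections & o.collections
    )
```

With this shape, a locked former tech in one of your collections still shows up in the result, while a user in an unrelated collection does not.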
Rebuild the loaders to resolve UUIDs without first fetching guid_prefix
This is implemented and tested, needs propagated to all other loaders
"validate" is part of the load process; there's no pre-validation. (Having this as a separate step has been a source of confusion for some time, this process facilitates a much simpler go/nogo approach.)
Todo, pending nobody finding a reason to go in a different direction:
Now I gotta dig up some stuff to load....
/remind me to work on this tomorrow
@Jegelewicz set a reminder for Jul 31st 2020
Cool. I will try too.
:wave: @Jegelewicz, work on this
Another major point for this model: it makes replication easy, there's now a testable locality-loader. I'll stop until I get some feedback, I don't want to replicate any problems.
The loader-scripts aren't scheduled, you can just open http://test.arctos.database.museum/ScheduledTasks/component_loader.cfm to process from the two new loaders.
When I follow that link - I get a white screen.

Let's go to Vegas!

white screen.
Yea it's not very interactive - check back with the data, should be different. https://github.com/ArctosDB/internal/issues/65
Vegas
Sorry, I broke it!
OK, one more observation, when stuff won't load, it would help to get the error along with the csv when you download to fix stuff.
So I was able to load 10 localities - none had coordinates - I'll see if I can find a couple that do to try.
Also, http://test.arctos.database.museum/ScheduledTasks/component_loader.cfm (to process from the two new loaders) needs to have some kind of interactivity... once you go there, you don't get out, and we need people to understand that they have accomplished something. Assuming this will be true in production.
get the error
wilco
interactivity
That's just test - it'll be on the scheduler in production, loading (or errors) will just happen (including for any number of records).
Clarification - So when I load a file directly to the tool, if stuff passes all the triggers, does it just load or will it always show up in the "manage" page first? Don't know why I can't decide what happens....
You can load with status, and if you load with it as "autoload" then Arctos will take care of the rest (or make errors). If you follow the instructions and load from a fresh template then you'd need to set status (which gives you an opportunity to notice that you've just loaded 4582 duplicates...). How that's implemented and documented is a little waffly at the moment, but the potential for "stuff just happens" exists.
This is in prod, need to integrate eg https://github.com/ArctosDB/arctos/issues/2967#issuecomment-668228080 and rebuild all component-loaders under this umbrella.
Dropping priority.
Need to check throttle; currently set for 10 records per run, can be upped significantly but needs monitored as things are added.
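A rough sketch of what a per-run throttle like this could look like, assuming a simple in-memory list of pending records. The function name, `load_one` callback, and list-based queue are illustrative assumptions, not the actual scheduler code; only the default of 10 comes from the comment above.

```python
def run_once(pending, load_one, batch_size=10):
    """Process at most `batch_size` records from `pending`; return what is left.

    `load_one` is a hypothetical callback that loads a single record.
    """
    batch, remaining = pending[:batch_size], pending[batch_size:]
    for record in batch:
        load_one(record)
    return remaining
```

Raising `batch_size` shortens the backlog faster but makes each scheduled run heavier, which is why it would need monitoring as loaders are added.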
It's more difficult than necessary to keep things synced up under this format.
@Jegelewicz see
http://test.arctos.database.museum/tools/_BulkloadComponentTemplate.cfm
https://github.com/ArctosDB/PG/blob/master/tools/_BulkloadComponentTemplate.cfm
I threw a demo table together and plugged it into the template, so the template itself should be fully functional for the time being. Everything should be clear from the comments in the source code. (!) Let's get that as close to perfect as possible before moving on to messing with any actual loaders. Let me know how I can help!
Working on this.
Managing status is limited to 2,500 records, you may need to use status to organize the data into manageable chunks.
I think this is going to need documentation or an example of how to make it work.
I just made a bunch of changes and committed. One may not function appropriately, but the rest are just cosmetics.
insert a lot of stuff
-- Demo only: fill cf_temp_demotable with 50,000 throwaway rows.
CREATE OR REPLACE FUNCTION temp () RETURNS varchar AS $body$
declare
    i int;
begin
    for i in 1..50000 loop
        insert into cf_temp_demotable (
            random_varchar_field,
            random_bigint_field,
            username
        ) values (
            'someval' || i::text,
            i,
            'dlm'
        );
    end loop;
    return 'k';
end;
$body$
LANGUAGE PLPGSQL
SECURITY DEFINER
volatile;

select temp();
change some of them


back to manage
select the NULL group, get 2500, change them, rinse and repeat


the 2500 ceiling is an arbitrary "probably won't eat any decent browser with normalish data" value as well - it could be adjusted, even by users.
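The "use status to organize the data into manageable chunks" approach can be sketched as follows. This is only an illustration in Python: the list-of-dicts record shape and the `chunk_N` labels are assumptions, not the real cf_temp_* table layout or any status convention Arctos uses.

```python
MAX_MANAGED = 2500  # the ceiling discussed above; arbitrary and adjustable


def label_in_chunks(records, chunk_size=MAX_MANAGED):
    """Give each NULL-status record a status like 'chunk_1', 'chunk_2', ...

    `records` is a list of dicts with a 'status' key (hypothetical shape).
    Returns the number of chunks used.
    """
    unlabeled = [r for r in records if r["status"] is None]
    for i, record in enumerate(unlabeled):
        record["status"] = f"chunk_{i // chunk_size + 1}"
    return -(-len(unlabeled) // chunk_size)  # ceiling division
```

Each resulting status group then fits under the management limit and can be selected, reviewed, and changed on its own.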
I'm not seeing your commits - are they in pg master? FYI I'm about to drop out at least today, maybe tomorrow (unless things get really crazy...). Can you pull from the webserver - I think I heard from TACC that you now have superpowers, not sure how far they go.
Can you pull from the webserver - I think I heard from TACC that you now have superpowers, not sure how far they go.
Need instructions - will try!
Don't know what happened. When I hit commit, it said you had committed while I was working - guess nothing got changed?
I can re-do and commit.
Once you're inside tacc (they'll probably send you through data.tacc.utexas.edu)
HAHAHAHA you act like I know what any of that means! Seriously, will probably need a tutorial, but will see if I can figure it out. Just made a commit and it stuck this time.
tried and got this
The authenticity of host 'arctos.tacc.utexas.edu (129.114.60.95)' can't be established.
RSA key fingerprint is SHA256:zkrOpVyFihOtqrmbGPClLoNdVBYfwb4aLRHCMOZJiJ0.
Are you sure you want to continue connecting (yes/no)?
"yes"
When I select delete for a group

I was presented with everything being deleted.

but this only happened when I selected the bunch with NULL status. The other stuff worked.
selecting this

got me this

Tried instructions and got
[tmayfiel@arctos ~]$ cd /usr/local/webroot
[tmayfiel@arctos webroot]$ git pull
error: cannot open .git/FETCH_HEAD: Permission denied
Permission denied
Maybe now....
NULL status.
Confirmed, I'll update
I think null status is behaving predictably now
OK - Once I can review stuff I did I think it is pretty much ready for prime time. However, I definitely want a few other people to test it out and provide feedback.
Just made some cosmetic tweaks and committed.
Permission denied
Maybe now....
Still can't pull. Same message.
What does
uname -a
have to say?
Linux arctos.tacc.utexas.edu 2.6.32-504.8.1.el6.x86_64 #1 SMP Wed Jan 28 21:11:36 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
You're in the right place so it must be permissions. I jiggled the wires, try again....
Better but still some kind of denial?
remote: Enumerating objects: 36, done.
remote: Counting objects: 100% (36/36), done.
remote: Compressing objects: 100% (36/36), done.
remote: Total 36 (delta 24), reused 12 (delta 0), pack-reused 0
Unpacking objects: 100% (36/36), done.
error: Unable to append to .git/logs/refs/remotes/origin/v1.2.1.5: Permission denied
From https://github.com/ArctosDB/PG
! 4bcb6c4..91b59cb v1.2.1.5 -> origin/v1.2.1.5 (unable to update local ref)
Woo Hoo! It worked tho!
Really need other eyeballs on http://test.arctos.database.museum/tools/_BulkloadComponentTemplate.cfm
@campmlc @mkoo @lin-fred @Nicole-Ridgwell-NMMNHS (Lindsey and Nicole - I know this is probably out of the blue, but I'd like someone to try out this "template" loader. Call me if it seems crazy!)
How do I login to the test database?
Second! I'm planning to copy this template to dozens of utilities. Super-easy to fix or adjust about anything now, and then increasingly less so as it gets used.
@ArctosDB/arctos-working-group-officers
How do I login to the test database?
Try using your login in regular Arctos - you may have to recover your password.
Nevermind I figured it out
The test csv I uploaded didn't appear in the review panel. Is that an issue with my all-caps username again like what happened with the locality loader?
reload - I pushed the patched function to test
can someone unlock my account

@lin-fred you should be unlocked and have email from Arctos
Tried to delete a batch, it says "To delete, scroll through the table below." but there is no table below.

That tiny thing IS the table because you only have one group selected.
BUT - I do see that I copied over the actual DELETE function - let me go fix!
I get this when I try and upload a csv file. Does this mean there is something wrong with the data? This is a batch that was successfully loaded previously

Can you attach the file?
Yes it is this one
That is a bulkload file. This demo tool takes a file with just three columns - see Get a template to get the headers.
Oh I see! I completely misunderstood. Thanks.
Once you get the template, just make some stuff up to put in the columns. First column is any characters, second is just numbers.
@dustymc I made a fix for https://github.com/ArctosDB/arctos/issues/2974#issuecomment-698557529 committed and pulled but the change doesn't show up yet - what kind of lags are there?
The stuff I set to autoload isn't loading - is that because this is just a test and the column headers don't actually load to anywhere?
The stuff I set to autoload isn't loading - is that because this is just a test and the column headers don't actually load to anywhere?
I believe that is true....
When I click Review, my first record shows up at the end of the all of the records, instead of at the beginning where I'd expect it to be?
There are no lags, sure it pulled?
-bash-4.1$ git pull
From https://github.com/ArctosDB/PG
534594a..56737b8 v1.2.1.5 -> origin/v1.2.1.5
Updating 8cd7ab9..6222f4d
Fast-forward
tools/_BulkloadComponentTemplate.cfm | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
-bash-4.1$
The loader is a separate component (and will be for all "component loaders", eventually/probably) - it'll load or return an error (and has email notifications and such). The loader and "manipulator" is all that needs tested at this time.
This is just a demo for the purposes of getting the template functional - there's no load, and no place to load the load to if there was.
There's no such thing as "first" or "last" unless we make it explicit, and I don't think there will ever be a way or reason to add that to a component loader.
There are no lags, sure it pulled?
Here is what happened when I pulled
[tmayfiel@arctos webroot]$ git pull
remote: Enumerating objects: 28, done.
remote: Counting objects: 100% (28/28), done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 28 (delta 16), reused 8 (delta 0), pack-reused 0
Unpacking objects: 100% (28/28), done.
From https://github.com/ArctosDB/PG
a7da191..6222f4d master -> origin/master
error: Unable to append to .git/logs/refs/remotes/origin/v1.2.1.5: Permission denied
! 534594a..56737b8 v1.2.1.5 -> origin/v1.2.1.5 (unable to update local ref)
[tmayfiel@arctos webroot]$
So maybe the error means it's still not working for me.
@Nicole-Ridgwell-NMMNHS try delete now.
I was able to delete some of Nicole's, is that supposed to happen?
This tool allows Arctos operators with the Manage Collection role to create and review demo-things. Review and load data entered by users in your collection(s) here.

You
so yes - that allows you to manage things she's entered via data entry extras and such. It of course comes with some risk as well - you grabbed CSV before you deleted, right?! - but I think it's acceptable, and it's a huge simplification over the "claim stuff from your users" mess that I don't think anyone ever really understood. I think it's a good approach, but now's a really good time to iron this out if it's not!
@Jegelewicz minimally that bit of documentation quoted above probably needs expanded/clarified/something.
minimally that bit of documentation quoted above probably needs expanded/clarified/something.
How about this:
This tool allows Arctos operators with the Manage Collection role to create and review demo-things. Be aware that if you share the Manage Collection role with another Arctos user for any collection, you will be able to review, load and delete their data. Review and load data entered by users in your collection(s) here.
Delete works, yay! Playing around with what can go into status, punctuation yes, special characters no. I don't particularly see any reason why this field would need to accept special characters.
On the review screen, where you can check things to change status, is there a way to make it so you can bulk select part of the list (like holding shift and it selects everything in between)? Although it is easy enough to delete the whole set, edit the CSV with the status, and reload.
where you can check things to change status, is there a way to make it so you can bulk select part of the list (like holding shift and it selects everything in between)? Although it is easy enough to delete the whole set, edit the CSV with the status, and reload.
One thing we don't get to see here is that there will probably be a validation step (@dustymc should we add something to the template for that?) and after validation some things will come back with error statuses (which will group together naturally), then you could load everything that validates, and download the stuff with errors for correction.
@dustymc correct me if that's crazy!
after validation some things will come back with error statuses
that is what happened with the locality loader
validation step
That's integrated with loading, which is a big part of the magic in this approach.
That's WAY simpler for everybody, there's always one predictable path to loading, and the errors are always comprehensive - they're the things I might think to pre-check, plus the things I'd miss in a pre-check but the DB would catch. Yay everybody!
Special characters are handled a bit better, but it's still almost certainly possible to feed the form something it can't deal with.
If ???????��????? were entered as special characters, it's almost certainly a browser problem (that's probably doing similar things elsewhere). See https://handbook.arctosdb.org/documentation/encoding.html
I'll play with "select many" - I don't have terribly high hopes, but maybe....
I should have had much higher hopes - you can shift-select now.
I expanded on the shared collection thing a bit, and split it into a new paragraph from thisFormPurpose
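For anyone curious about the shift-select behavior, the underlying range logic can be sketched like this. It is written in Python purely as an illustration of the idea; the form itself runs in the browser, and this function and its argument names are hypothetical, not the code that was committed.

```python
def shift_select(checked, last_clicked, clicked):
    """Return a new set with every row index between the two clicks checked.

    `checked` holds the row indexes already ticked; the two click positions
    may come in either order.
    """
    lo, hi = sorted((last_clicked, clicked))
    return checked | set(range(lo, hi + 1))
```

Clicking row 7 and then shift-clicking row 3 would tick rows 3 through 7 and leave anything already ticked alone.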
you can shift-select now
Should we keep this a secret or put instructions for doing it in the form?
Thanks to whoever answered my question! I feel like there is a hidden message in there...

I should have had much higher hopes - you can shift-select now.
This is great! I was wondering how Nicole had separated out ~300 of them; I'm guessing she clicked them all. Glad there is a way to mass click!
Priorities:
From AWG meeting: need to link to "extras" from specimen loader; component loaders should respond to ?uuid={bulkloader.uuid}
UUID handler added
Classification loader updated.
locality loader updated
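As a rough illustration of a loader responding to `?uuid=...`, here is a Python sketch that pulls the parameter out of a URL and validates it. The function name and the example URL are made up, and what a component loader then does with the value (presumably filtering to the matching records) is an assumption not spelled out in this thread.

```python
import uuid
from urllib.parse import parse_qs, urlparse


def uuid_from_url(url):
    """Return the `uuid` query parameter as a uuid.UUID, or None if absent/invalid."""
    values = parse_qs(urlparse(url).query).get("uuid", [])
    if not values:
        return None
    try:
        return uuid.UUID(values[0])
    except ValueError:
        # Not a well-formed UUID string; treat it the same as "not supplied".
        return None
```

Validating up front means a malformed link simply falls back to the unfiltered view instead of producing an error deeper in the loader.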
Everyone try testing by 3rd Dec please!
I've been scattering this thing around for a while..... Should I stop?
Maybe for now and let's see if anyone has any comments/suggestions....
I think I need the taxonomy name bulkloader updated. I can't test the classification loader from my file because there are some names not in the database.
Names are already loaded, but ran into validation timeout
@mvzhuang if you want to send me the new names I'll figure it out, this is on pause so it might be a while before I can rebuild any more loaders
BulkloadTaxonomyAnts.csv.zip
OK, here it is. It needs to be checked against the existing names though, so there might be names already in the database.
Hey Arctos - Reminder to try and test this by next Thursday!
See https://github.com/ArctosDB/arctos/issues/3300 - make sure status (which can be errors) is urlencoded when necessary
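A small sketch of why the encoding matters, using Python's standard library rather than the CFML the forms are written in. The `status_link` helper, the base URL, and the sample status text are invented for illustration.

```python
from urllib.parse import quote


def status_link(base_url, status):
    """Build a link that filters on `status`, percent-encoding it for the query string."""
    # safe="" so that characters such as ':' and '&' are encoded too.
    return f"{base_url}?status={quote(status, safe='')}"
```

Without encoding, an error message containing `&` or `#` would cut the status value short or be read as extra parameters.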
This has served its purpose, there's a template, it's awesome, closing.
@gradyjt
The next two tasks on my list (https://github.com/ArctosDB/arctos/issues/2556, https://github.com/ArctosDB/arctos/issues/2442) rely on this template. I can't seem to reconcile https://github.com/ArctosDB/arctos/issues/3413 and the related AWG discussion. Do we love this or hate it? Can I keep building these things or do we need more discussion? Do I need to change something going forward? Do I need to change something with the ~dozen loaders I've already built under this model?
I think the tool is fine. It is just the related "documentation" that needs update, but others should weigh in.
I think we could just add to the documentation, and I also suggested that rather than change the text on all the different component loaders, we change the main bulkloader to use the same terminology, e.g. mark to autoload vs mark to load etc - with a few sentences explanation on that form only.
main bulkloader
If you mean the catalog record bulkloader, these are fundamentally different tools. The catalog record bulkloader is an independent tool - things in it load or error, that's it. "Component loaders" can have dependencies - things can hang around with 'autoload: ....' for weeks, then be processed after related data becomes available. That is, component loaders have three exits:
I would be in favor of changing the actionable value of loaded to "autoload" rather than NULL for the catalog record bulkloader, but that should be addressed in a new issue.
Maybe we need to think about some way to let people jump to a specific set of data in tools like https://arctos.database.museum/tools/BulkloadOtherId.cfm
Currently there is an extra-long list of errors in there and if my username was after this person alphabetically, I'd have to scroll forever to get to my stuff.
This is just one page of it

Maybe just a table at the top that lists the usernames and lets you jump to a specific user's stuff?
See https://github.com/ArctosDB/data-migration/issues/450#issuecomment-784555912 - verbose errors cause Lucee to return a 400; need to POST or truncate errors or something.
Untested workaround: filter only on username, change status to something shorter.
@Jegelewicz
Thanks, that worked for my stuff...
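The "truncate errors" option mentioned above could look something like this sketch. It is Python for illustration only; the function name and the 200-character default are assumptions, not a known Lucee or Arctos limit.

```python
def truncate_status(status, max_len=200):
    """Shorten `status` to at most `max_len` characters, marking the cut with '...'.

    Assumes `max_len` is at least 3 so the marker itself fits.
    """
    if len(status) <= max_len:
        return status
    return status[: max_len - 3] + "..."
```

Truncation loses detail from long error messages, so sending the full status via POST is the alternative if the complete text needs to survive.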
v1.1: csv download should include this to strip unnecessary columns
<cfset flds=mine.columnlist>
<cfif listfindnocase(flds,'key')>
    <cfset flds=listdeleteat(flds,listfindnocase(flds,'key'))>
</cfif>
<cfif listfindnocase(flds,'last_ts')>
    <cfset flds=listdeleteat(flds,listfindnocase(flds,'last_ts'))>
</cfif>
....
<cfset csv = util.QueryToCSV2(Query=mine,Fields=flds)>
Moved unfulfilled requests to https://github.com/ArctosDB/arctos/issues/3463, closing (again).