In a meeting with the GBIF North American node manager today, she asked whether any Arctos collections are interested in publishing to the Ocean Biodiversity Information System (OBIS). I'm not sure how this would be organized, but I hope it could be done via the VertNet IPT. If anyone is interested, let me know and I'll try to set up a discussion about it.
@sharpphyl @acdoll @lin-fred
https://obis.org/manual/contribute/ says they speak DwC, so I don't think anything would need to be "published"; we would just tell them where the DwC files are (http://ipt.vertnet.org:8080/ipt/).
So these folks harvest from WoRMS?
They also have marine mammals and birds.
Yes OBIS uses WoRMS for taxonomy!
As far as what they state about data quality, what about the freshwater specimens that are mixed in with our ocean specimens? Would it know not to take them?
what about the freshwater specimens
We have the same issue with terrestrial species sprinkled throughout our data.
There appear to be some terrestrial species in their data. I searched on
Cestoda and a significant number were flagged as terrestrial.
It looks like they may have the ability to filter out non-marine records. I see no issue with going forward if our records meet their criteria.
Hi! I'm the GBIF-US Node manager. I'm also the OBIS-USA node manager. A couple points regarding publishing to OBIS.
Happy to help as needed! Just let me know.
@acdoll Would you want to contribute marine birds and mammals as well as marine invertebrates?
Our Marine Invertebrate taxonomy should be in good condition as it uses valid WoRMS taxa.
@dustymc Does most of this happen automatically or would we (or you) have to do a lot of preparation to meet the OBIS standards?
@albenson-usgs It appears that only records with marine lat/longs are published. Does OBIS eliminate what isn't useable or do we need to do that?
@sharpphyl publishing to OBIS would not happen automatically and it's preferable for data providers to work with me to get missing elements added and subset out records that are not appropriate. The subset would need to be shared via the OBIS-USA IPT as the VertNet IPT is not registered with OBIS. Does that help?
So, this definitely looks like extra work to me and I am not sure that we have the resources right now to make it happen unless @dustymc has magic. We would need:
It is the first step that is problematic and seems like a lot of work to me. @albenson-usgs this is a prime example of why collections will fail to participate as we are being required to provide our data in some new form/subset. We already do this for GBIF/iDigBio, GGBN, GloBI. In addition, our data is all already publicly available in far more detail than any of these aggregators can publish. I'm not saying we won't eventually do this, but we really don't have resources to create new facets of our data for every possible use case and I am certain that all of the other CMS feel the same. Sorry if this sounds like whining - I just really think it is something that the larger community of users should be considering.
I agree with @Jegelewicz that it sounds like work that right now we don't have the resources to do. If multiple Arctos collections wanted to publish in OBIS, then it would make sense to expend the resources necessary. Perhaps a topic for the AWG to consider at some point. For just our relatively small collection (compared to the UF invertebrate collection), the time and effort would be greater than the benefit. We do already have significant visibility with GBIF and InvertEBase, etc. But it's good to know what would be required if we wanted to participate.
@Jegelewicz and @sharpphyl I do understand. It would not fall completely on you all to do the work necessary. I do have code (in R) already for the UF Invertebrate and KUBI collections that can be reused. However, I need someone responsive to work with, whom I can ask questions (e.g. "these are the species to be subsetted out; does that look right to you?"). The one benefit I could see from doing this via Arctos as opposed to going collection by collection is that I'm assuming the data are all formatted pretty much the same, so we may be able to develop a workflow wherein any datasets that have a WoRMS identifier are identified. Any records from such a dataset without a WoRMS identifier are subsetted out of the dataset; then (maybe) I can have an "Arctos script" that takes those datasets with just those records, adds any missing columns, and checks for points on land. If the data coming into that "Arctos script" are consistent in where the information is located and what's missing, it should be easy to re-run the script across multiple datasets. Does that make sense? But I need someone to ask questions when the script finds things that seem anomalous to me.
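As a rough illustration of the workflow described above, here is a minimal Python sketch. It is not the existing R code, which is not shown in this thread. It assumes the WoRMS identifier is carried as an LSID in the Darwin Core `scientificNameID` field, which may not match how Arctos exports it. The `EXPECTED_COLUMNS` list and the function names are hypothetical and should be checked against what OBIS actually requires. The sketch only flags missing or out-of-range coordinates; a real points-on-land check needs a coastline or land polygon layer and is not attempted here.

```python
# Sketch only: field names and the column list are assumptions, not Arctos or OBIS facts.

# Prefix of a WoRMS LSID, e.g. urn:lsid:marinespecies.org:taxname:<AphiaID>
WORMS_PREFIX = "urn:lsid:marinespecies.org:taxname:"

# Hypothetical set of columns the script guarantees are present in the output.
EXPECTED_COLUMNS = [
    "occurrenceID", "scientificName", "scientificNameID",
    "decimalLatitude", "decimalLongitude", "eventDate",
    "basisOfRecord", "occurrenceStatus",
]


def has_worms_id(record):
    """True if the record carries a WoRMS LSID (assumed to be in scientificNameID)."""
    value = (record.get("scientificNameID") or "").strip()
    return value.startswith(WORMS_PREFIX)


def coordinate_problem(record):
    """Return a short description of a coordinate problem, or None if none is found."""
    try:
        lat = float(record.get("decimalLatitude") or "")
        lon = float(record.get("decimalLongitude") or "")
    except ValueError:
        return "missing or non-numeric coordinates"
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return "coordinates out of range"
    return None


def subset_for_obis(records):
    """Split records into (kept, dropped, questions).

    kept:      records with a WoRMS identifier, with missing columns added as blanks
    dropped:   records without a WoRMS identifier
    questions: (occurrenceID, problem) pairs to send back to the collection for review
    """
    kept, dropped, questions = [], [], []
    for record in records:
        if not has_worms_id(record):
            dropped.append(record)
            continue
        row = dict(record)
        for column in EXPECTED_COLUMNS:
            row.setdefault(column, "")  # add the column, leave the value blank
        problem = coordinate_problem(row)
        if problem is not None:
            questions.append((row["occurrenceID"], problem))
        kept.append(row)
    return kept, dropped, questions
```

Records could come from `csv.DictReader` over a tab-delimited occurrence file. The `questions` list is the part that needs a responsive contact at the collection: the script records anomalies for review rather than deciding on its own what to do with them.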
@albenson-usgs we talked about this yesterday in our taxonomy committee meeting. We are up for you testing, but it is difficult to make promises about how responsive we can be or who is best to answer questions. Anyway, if you are willing to deal with the possibility of waiting around for a response, we think we can go ahead. So, first question:
I do have code (in R) already for the UF Invertebrate and KUBI collections that can be reused.
But how will you access our data? We would prefer that you do that from the VertNet IPT. Possible? Then you could ask questions as they arise?
Yes, that's possible. I have some other data processing that I'm working through so I'm not exactly sure when I'll get to this but I will add it to my to do list :-) Can you point me at one particular dataset that would be good to start with? I only see a link to the entire VertNet IPT in this thread. Also I want to let @dbloom know that this is happening just for awareness.
The DMNS:Inv dataset would be a good test case. It is Arctos collection number 74. Let me know if that doesn't get you what you need!
It's this one: http://ipt.vertnet.org:8080/ipt/resource?r=dmns_inv, right? It's showing up as red in the IPT which I've never seen before. Does anyone know why?
@dustymc do you know what the above means?
Nope.
I'll check with Dave.
@albenson-usgs per my Slack message to you: this happens when resources on an automated publishing schedule fail to publish properly. I noticed that two datasets from Arctos failed in the last publication event a few days ago. Both are direct database connections, so after checking on the IPT side I notified Dusty, who remapped and reset Arctos for me (something that takes several hours to complete; thank you, Dusty). Both resources have now been republished and are working normally.