Is your feature request related to a problem or area of OpenRefine? Please describe.
The cross function can sometimes fail for reasons unknown. Troubleshooting the failure is difficult because nothing is reported, either in the console or through errors that can be stored in OpenRefine.
Describe the solution you'd like
When cross fails because of a specific issue (e.g. the referenced project or column does not exist), the error should be available to be stored in a cell.
When the cross function fails to find a value, we should consider the option of returning an error ("value not found in project X column Y") to be stored (instead of 'null') if the user requests it.
See also #1950
I'd also like to see a snippet "cross lookup failed" or whatever in the console logs when cross("my_project", "my_column") fails. (this will help test/debug later on with new UI)
I don't think we should log this: it is likely to generate lots of redundant logging messages, as the cross function is typically applied on all rows. Using the existing GREL error mechanism seems more natural.
I would aggregate >0 then throw error, but that's fine Antonin. No log then.
Also, it seems project renaming affects this as well... I renamed my project and now cross() lookups fail, even after restarting Refine. David, why did you hardcode the tests so much and not use ProjectMetadata? He probably was rushing to get this to me at the time, lolol. I really needed cross() to add extra data on a large Freebase upload.
I agree with @wetneb that console logging is going to be too verbose here. I can see an argument for having a 'verbose' mode for the console log which would enable us to push information for debugging when necessary, but that's much bigger scope than this particular issue.
Just putting down some notes about this:
Having had a look at cross and ProjectJoin.getJoin, my first thought is that the process of translating a Project Name string to a project ID should be separated out from ProjectJoin.getJoin. If the project ID lookup is done directly from cross, this gives us a better opportunity to report on any errors in the process. If ProjectJoin.getJoin accepts project IDs rather than Project Name strings, I think this will also resolve one of the issues noted by @wetneb in #1950:
Moreover, the current design requires that both project names are unique in the workspace, whereas one would expect that only the target project (whose name appears in the invocation of the function) would need to be uniquely named.
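To make the idea concrete, here is a rough sketch of resolving the target project name to an ID as its own step, so that a missing or ambiguous name can be reported as a storable error rather than a silent null. Everything in it is hypothetical and only for illustration: `ProjectLookup`, `LookupError`, `register` and `resolve` are made-up names, not OpenRefine classes or methods, and the in-memory map stands in for whatever the workspace actually uses to track project names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an error value that could be stored in a cell.
final class LookupError {
    final String message;

    LookupError(String message) {
        this.message = message;
    }

    @Override
    public String toString() {
        return "Error: " + message;
    }
}

// Hypothetical name-to-ID resolver, kept separate from the join itself.
final class ProjectLookup {
    // Assumed workspace index: project ID -> project name.
    private final Map<Long, String> namesById = new LinkedHashMap<>();

    void register(long id, String name) {
        namesById.put(id, name);
    }

    // Returns the project ID (a Long) when exactly one project has this name,
    // otherwise a LookupError describing why the lookup failed.
    Object resolve(String projectName) {
        List<Long> matches = new ArrayList<>();
        for (Map.Entry<Long, String> entry : namesById.entrySet()) {
            if (entry.getValue().equals(projectName)) {
                matches.add(entry.getKey());
            }
        }
        if (matches.isEmpty()) {
            return new LookupError("no project named '" + projectName + "'");
        }
        if (matches.size() > 1) {
            return new LookupError(matches.size() + " projects are named '" + projectName + "'");
        }
        return matches.get(0);
    }

    public static void main(String[] args) {
        ProjectLookup lookup = new ProjectLookup();
        lookup.register(1L, "my_project");
        System.out.println(lookup.resolve("my_project"));
        System.out.println(lookup.resolve("renamed_project"));
    }
}
```

With this split, only the target name passed to cross has to be unique, and the join step would receive an ID it can trust.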
Fixed by #1985