_Original author: [email protected] (October 12, 2011 03:59:50)_
What steps will reproduce the problem?
Currently, it's not possible to re-apply a set of manipulations without exporting, messing with the export file, and re-importing. It would be extremely useful to provide repeatable transformation capabilities. Without these, the use of Refine for scientific (repeatable) applications is extremely limited.
_Original issue: http://code.google.com/p/google-refine/issues/detail?id=460_
_From thadguidry on October 12, 2011 13:27:59:_
There is a feature in Undo/Redo that lets you export operations to a JSON text file, then paste them into another dataset and apply them. This is shown in the tutorial videos. Does that suit your needs, or is your request something more?
_From tfmorris on October 12, 2011 20:48:48:_
It would be useful to understand the "messing" that you're trying to avoid and/or the UI flow for what you're proposing. Pretend we have no clue what context you're operating in or what your assumptions are and make it nice and simple.
_From [email protected] on October 13, 2011 19:49:43:_
Exporting and re-importing operations sounds promising, but a "re-import" command might be clearer to users.
_From [email protected] on September 26, 2012 06:53:53:_
I have the same issue. Let me summarize my workflow:
Now I want to use the 'cross()' function to do calculations on "foobar" relative to "foo" and "bar", as explained in
http://code.google.com/p/google-refine/wiki/GRELOtherFunctions
This is a very powerful tool for data manipulation, and it works great.
I now want to redo the process for a fresh dataset with the exact same data layout. If I export the JSON and apply it to new projects, I am left with new project names "foo1", "bar1" and "foobar1". The 'cross()' function, and presumably other functions too, depends on the names of the referenced projects, so it does not work well with the new names. It does not even play well with looking up cell contents from within the same project, as there is no "project.name" parameter available either.
The solution available to me at present is this:
While I can cope with this, being a wizard with regexp and understanding programming syntaxes quite well, it is not very handy, and is quite time consuming.
A simple "Reload data and replay all operations" function would solve this in a snap.
;)Frode
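For readers facing the same renaming problem, the manual edit of the exported JSON could in principle be scripted. The following is a minimal sketch, not the workaround referred to above (whose steps are not shown). It assumes the exported operations are a JSON list in which GREL expressions sit under an `expression` key, and that the project name is the first string argument of `cross(...)`. The function name `rename_cross_projects` and the rename mapping are hypothetical.

```python
import re

# Matches cross("name" ...) and cross(cell, "name" ...), capturing the quoted
# project name. Assumption: the project name is the first string argument.
_CROSS = re.compile(r'(cross\(\s*(?:cell\s*,\s*)?)(["\'])(.*?)\2')


def _fix_expression(expr, renames):
    """Replace project names inside cross() calls in one expression string."""
    def swap(match):
        prefix, quote, name = match.group(1), match.group(2), match.group(3)
        # Names that are not in the mapping are left untouched.
        return prefix + quote + renames.get(name, name) + quote
    return _CROSS.sub(swap, expr)


def rename_cross_projects(node, renames):
    """Return a copy of an operations structure with cross() targets renamed.

    Walks nested dicts and lists; only string values stored under an
    "expression" key are rewritten (an assumption about the export layout).
    """
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key == "expression" and isinstance(value, str):
                result[key] = _fix_expression(value, renames)
            else:
                result[key] = rename_cross_projects(value, renames)
        return result
    if isinstance(node, list):
        return [rename_cross_projects(item, renames) for item in node]
    return node
```

A caller would load the exported file with `json.load`, pass the result together with a mapping such as `{"foo": "foo1", "bar": "bar1"}`, and paste the `json.dumps` output back into the Apply dialog. Expressions that build the project name dynamically would not be caught by this pattern.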
I completely agree with this issue.
My problem, for example, is adding new sheets to the project from the initial Excel file that I didn't include previously.
A simple "Edit/Change Project Dataset Configuration..." function is needed, don't you think?
Could a trigger such as "re-load data into project", applying the same project layout, be a first-step solution?
Is anyone working in this direction?
I am pressing this request because I believe OpenRefine is a really powerful and potentially essential tool for Enterprise Information Management, especially for everything concerning the Open Data and Interoperability fields.
The problem is that, to use it in an enterprise setting, it is essential to be able to repeat the transformations (layout) of a project programmatically (i.e. via API or cron scripts).
Let me know what you think about this topic.
Busa ;-)
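As an illustration of what "via API or cron scripts" might look like, here is a hedged sketch that only builds (and does not send) an HTTP request for replaying exported operations against a running instance. The endpoint path `/command/core/apply-operations`, the `project` and `operations` parameters, and the optional `csrf_token` are assumptions about OpenRefine's HTTP commands that should be checked against the installed version. The helper name `build_apply_operations_request` is hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_apply_operations_request(base_url, project_id, operations, csrf_token=None):
    """Build a POST request that would replay operations on one project.

    Assumptions (verify against your OpenRefine version): the command lives at
    /command/core/apply-operations, takes the project id as a query parameter
    and the operations as a form field containing JSON text.
    """
    query = {"project": str(project_id)}
    if csrf_token is not None:
        # Newer releases are assumed to require a CSRF token on POST commands.
        query["csrf_token"] = csrf_token
    url = f"{base_url.rstrip('/')}/command/core/apply-operations?{urlencode(query)}"
    body = urlencode({"operations": json.dumps(operations)}).encode("utf-8")
    return Request(url, data=body, method="POST")
```

A cron job could pass the returned object to `urllib.request.urlopen` after loading the operations from an exported JSON file. Note that this only replays the history on an existing project; it does not reload the source data, which is the gap described in this comment.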
Because it makes sense to keep the old data, I wrote a plugin that (amongst other things) allows you to execute history steps from other projects, or re-execute history steps for the current project. You can find the plugin and the manual at http://www.bits.vib.be/index.php/software-overview/openrefine
Cheers,
Herwig
Hi Herwig.
I already know and use your great extension to speed up many of my templating tasks by reusing history from different projects.
It's really useful, but for my objective it is a workaround. With your extension I can create a new project with new data and then import the history of transformations from the old project. Unfortunately, this doesn't let me create a system that programmatically refreshes/reloads data into an existing project and makes it possible to export the new transformation results as a source for another system.
It's a step toward the solution... but still not the solution.
Maybe we could start from this extension and extend or evolve its features if nothing is done in the main OpenRefine project.
Thanks again.
Any update on this issue? It's been 4 years...
@stevenqzhang The VIB-Bits plugin is what most folks use to solve this issue. In fact, we could probably close this issue, since the general use case is handled nicely by the plugin.
We might eventually want to have OpenRefine directly support this VIB-Bits functionality for
The VIB-Bits plugin doesn't quite cover it for me, as I'd like to make repeatable all the steps including the decisions made in initial data import about formats, columns to skip etc.
It would make most sense for the initial data load to appear like a normal operation in the edit history to be replayed, exported as JSON etc. Is there any interest in work towards this approach?
> It would make most sense for the initial data load to appear like a normal operation in the edit history to be replayed, exported as JSON etc. Is there any interest in work towards this approach?
Yes, adding import metadata to the history has been proposed before. I think it would be an obvious move.
Here is an in-the-wild example of an OpenRefine workflow being shared, where the import settings are described externally:
https://github.com/OCLC-Developer-Network/WikidataHoldingsMatching