OpenRefine: Derive default enter-separator from previous split-cells

Created on 15 Oct 2019 · 3 Comments · Source: OpenRefine/OpenRefine

Is your feature request related to a problem or area of OpenRefine? Please describe.
I'm always frustrated when the "Join multi-valued cells" prompt suggests a comma (,) to me, even though:

  1. I used a different symbol in the previous "Split multi-valued cells" operation, and
  2. the comma (,) is present in at least some of the cells that are still split.

Describe the solution you'd like
Would both operations be more convenient & less error-prone if OpenRefine:

a) derived the default character from 1., or at least
b) did not suggest a symbol for which 2. is true?

Describe alternatives you've considered
I'm not sure.

Additional context
Related to #1113 & #2139.

UI enhancement

All 3 comments

For 1., we could consider storing the last value used for the Join/Split operations in the preferences, and proposing that as the default value in both dialogs.
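
To make the idea concrete, here is a minimal Python sketch of "remember the last separator and propose it in both dialogs". It is not OpenRefine code: the `Preferences` class, the preference key, and both helper functions are hypothetical stand-ins for whatever the real preference store provides.

```python
from dataclasses import dataclass, field

# Hypothetical preference key; not a real OpenRefine preference name.
LAST_SEPARATOR_KEY = "ui.cell.lastSeparator"
# The comma is what the Join dialog currently suggests, per the issue text.
FALLBACK_SEPARATOR = ","


@dataclass
class Preferences:
    """Stand-in for a persistent key/value preference store."""
    values: dict = field(default_factory=dict)

    def get(self, key, default=None):
        return self.values.get(key, default)

    def put(self, key, value):
        self.values[key] = value


def default_separator(prefs):
    """Separator to pre-fill in either the Split or the Join dialog."""
    return prefs.get(LAST_SEPARATOR_KEY, FALLBACK_SEPARATOR)


def remember_separator(prefs, separator):
    """Call when the user confirms a Split or Join operation."""
    if separator:  # ignore empty input so the fallback keeps working
        prefs.put(LAST_SEPARATOR_KEY, separator)


prefs = Preferences()
first_default = default_separator(prefs)   # nothing stored yet
remember_separator(prefs, "|")             # user splits on "|"
next_default = default_separator(prefs)    # Join dialog now proposes "|"
```

Sharing one key between Split and Join is what makes the value chosen in one dialog show up as the default in the other.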

Either using the last value as the default or being able to set a default value in preferences would work for me (I always have to change this, and it's always a pain!)

Resolving issue (2) is more problematic - I don't think we can predict what the user wants to do here (maybe they specifically do want to use that separator even though it is in the cells already) and checking all values in the column for a separator character is likely to lead to performance problems for large projects.

So my vote would be to put (2) aside for the moment and focus on solving (1)
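
For reference, the check discussed under (2) could look roughly like the Python sketch below. It assumes a column is just an iterable of cell values; the function name and the `limit` parameter are hypothetical and do not correspond to anything in OpenRefine.

```python
from itertools import islice


def separator_in_column(cells, separator, limit=None):
    """Return True if any inspected cell contains the separator.

    cells: iterable of cell values (None for blank cells).
    limit: optional cap on how many cells to inspect (hypothetical knob).
    """
    inspected = cells if limit is None else islice(cells, limit)
    return any(separator in value for value in inspected
               if isinstance(value, str))


column = ["a|b", "c, d", None, "e"]
found_comma = separator_in_column(column, ",")
found_semicolon = separator_in_column(column, ";")
# Capping the scan at one cell misses the comma in the second cell.
found_comma_capped = separator_in_column(column, ",", limit=1)
```

When the separator is absent, the uncapped scan has to read every cell in the column, which is where the cost for large projects comes from. Capping the scan makes it cheaper but lets it miss occurrences, so the result is only a heuristic.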

Given @ostephens' comment above let's say that point 2. is out of scope because heuristics are likely to be flaky. So this is closed by #2520.
