馃洜 For migration steps see this comment. 馃洜
Currently, marker conversion works in a way that a view element (HTML tag) is created at positions where a marker starts and where a marker ends. This solution creates an incorrect HTML (for example, a custom element is inserted in <tr> which accepts only <td>s and <th>s). This leads to two problems:
Data sanitization becomes problematic because various libraries that perform it may remove the custom tags that are in incorrect places. Similarly, data parsing may lead to errors.
Thus, I propose changing how we would convert markers. If a marker boundary is before or after a view container element, instead of creating "marker tag", we should add an attribute to that element. The attribute will mark that a marker starts/ends after/before that element.
Some other remarks:
data-marker-start, data-marker-end.<comment id="..."> instead of unifying it, for example: <ck-marker data-name="...">. I propose that when used with default marker conversion, markers will simply use their names and will be converted to <ck-marker data-name="markerName">.BTW. we also discussed using HTML comments instead of regular tags to show marker boundaries but we decided that data sanitizers might remove them as well.
Additionally, we could also get rid of using <$marker> tag, which is now used as an extra step in marker conversion. As a reminder, what we do right now is more-or-less that:
<p>Foo<comment></comment>Bar<comment></comment></p>
<paragraph>Foo<$marker></$marker>Bar<$marker></$marker></paragraph>
<paragraph>Foo[Bar]</paragraph>
We could directly use ck-marker element.
For backward compatibility, <comment> will still be converted to <ck-marker>.
- Those attributes will simply contain a marker name. If there will be multiple, we will separate them with a comma.
Can marker have more information than just the name? I'd expect you can preserve more information there, so I'm not sure this is a good format.
- I wonder whether we should mark already existing marker conversion helpers as deprecated. I think that there should be no need to use anything else other than a marker name to save the marker with the data.
:+1:
- Proposed attribute names for view container elements:
data-marker-start,data-marker-end.
I was a bit hesitant at fist whether this isn't leaking implementation specific aspects to the data, in which case that should be avoided. Markers are a mechanism in CKE5 and in that sense this should not be reflected in the data. However, "marker" is also a common word and a legit word in this case, so perhaps it's not a big deal.
However, if all features that use markers in the model will use the same attributes in the data, that will be actually leaking the implementation. If we'll decide to rewrite a feature X to not use markers, will we change the data format for it? If not, then how will a non-marker feature write to the same attribute that is used by marker-related converters?
It will also be quite impractical if all features will use the same marker. Simple data traversal will not let you find out where comments or suggestions are. You will need to parse attribute values for that.
Therefore, IMO, we should use:
data-comment-start, data-suggestion-start, etc.
Can marker have more information than just the name? I'd expect you can preserve more information there, so I'm not sure this is a good format.
No. Marker keeps all the data. We do keep additional data for comments and suggestions but this data is not sent together with a marker.
Therefore, IMO, we should use:
data-comment-start,data-suggestion-start, etc.
I think that my biggest point against is that if we use different attribute names we cannot make it a default converter. My idea was that we could deprecate the current helper and then not to introduce any and everything would just work.
You will need to parse attribute values for that.
I agree. OTOH, this will be an easy parse. And there can be multiple suggestions on one element, so you might still need some processing of the attribute.
Okay, so the final agreement is that we go with data-comment-start etc., not default converter. One of the additional arguments that popped up is that you might have some markers that you don't want to convert.
We also decided not to go with <ck-marker> instead, we will go with <markerType-start name="..."> and <markerType-end name="..."> to match data-markerType-start and data-markerType-end attributes. This means that there will be no default converter after all.
For example, for comments it will be:
<figure data-comment-start="commentId:uid" ...>...</td>
<comment-start name="commentId:uid"></comment-start>
Multiple comments:
<figure data-comment-start="commentId:uid,commentId2:uid" ...>...</td>
<comment-start name="commentId:uid"></comment-start><comment-start name="commentId2:uid"></comment-start>
While the marker name will still remain comment:commentId:uid
Another change: collapsed markers will have both tags: <markerType-start> and <markerType-end> in the data.
Some final changes:
We decided to go with four attributes for marker boundaries:
data-group-start-beforedata-group-start-afterdata-group-end-beforedata-group-end-afterThis was due to some scenarios that we forgot about earlier. For example:
<tr><td><p></p>[</td><td>]<p></p></td></tr>
In such case, only having data-group-start and data-group-end we weren't able to store such marker in the data.
Also, in the final solution, we don't check whether the marker boundary is after/before the view container element. Instead, we check in the model schema whether a text is allowed at the marker boundary position. If so, a tag is added there. If not, an attribute is added to the closest view element.
This prompted adding schema instance to conversionApi.
In this iteration, we introduced new, safer way to convert markers to data. DowncastHelpers#markerToData() and UpcastHelpers#dataToMarker() were introduced. DowncastHelpers#markerToElement() should not be used in the data pipeline (it is still okay to use in the editing pipeline). Also, we marked UpcastHelpers#elementToMarker() as deprecated (and it fires a warning on the console).
We should inform our users about how to cope with this change.
If our users used markerToElement() in the data pipeline of their features, the output generated by this converter will not be compatible with dataToMarker() conversion. It means that for some transition period, documents saved in users' databases will have "old" values. Which means that users would have to use elementToMarker() upcast, which is deprecated.
Fortunately, elementToMarker() upcast converter can be easily substituted with elementToElement() converter:
view part of the configuration object stays the same.model part of the configuration should return a model element with name $marker and attribute data-name equal to the marker name, that was earlier set in model property, or returned by the callback assigned to the model property.For transition period, users should introduce new markerToData() and dataToMarker() converters but also keep elementToElement() upcast converter for backward compatibility with not-yet-regenerated documents.
See the example below to better understand how to change the code.
Before:
editor.conversion.for( 'dataDowncast' ).markerToElement( { ... } );
editor.conversion.for( 'upcast' ).elementToMarker( {
view: oldViewConfiguration,
model: oldModelCallback
} );
After:
// Use the new helper for data downcast conversion.
// See the documentation to learn how to configure the new converter.
editor.conversion.for( 'dataDowncast' ).markerToData( { ... } );
// Backward compatibility with old documents:
editor.conversion.for( 'upcast' ).elementToElement( {
view: oldViewConfiguration,
model: ( viewElement, modelWriter ) => {
const markerName = oldModelCallback( viewElement );
return modelWriter.createElement( '$marker', { 'data-name': markerName } );
}
} );
// Use the new helper for upcast conversion.
// See the documentation to learn how to configure the new converter.
editor.conversion.for( 'upcast' ).dataToMarker( { ... } );
Note, that view property is not changed, while in model callback, instead of returning a marker name, you need to return $marker element with the marker name set as data-name property.
Other than that, the comma (`,`) character is now prohibited in marker names.
Instruction for anyone who is processing the editor data.
First, please get familiar with the description of marker-to-data downcast, where we described what kind of HTML may be generated.
If you processed suggestions or comments you have two possibilities:
Below are notes for comments. Suggestions are similar.
Earlier, you have been looking only for <comment> tags, in which id attribute kept the comment id (and sometimes "count" part, which was recently added with multi-cell comments). Also, there was type attribute to know whether the tag points to the beginning or to the end of the marker (however, start tag was always before end tag, so there might be no need to check type attribute).
Also, collapsed markers did not have closing tag, however, in case of comments or suggestions, collapsed markers did not exist (collapsed comment or suggestion was removed).
Now, type attribute got "included" in the tag name, so we have <comment-start> tag and <comment-end> tag. Also, collapsed markers have both tags now. Other than that, id attribute is no longer available, instead, whole marker name (except of comment: prefix) is set in name attribute. For comments, id and marker name were similar, for suggestions the suggestion marker name is more complex.
Earlier: <comment id="commentId:count" type="start">
Now: <comment-start name="commentId:count">
For suggestions:
Earlier: <suggestion id="e8ghd7:e390dk" suggestion-type="formatInline:i4u3xm" type="end" count="3erh">
Now: <suggestion-end name="formatInline:i4u3xm:e8ghd7:e390dk:3erh">
Since marker name for suggestions is complex, below is a function that breaks down marker name for suggestions into meaningful parts:
// `name` is the value from the name attribute.
parseSuggestionName( name ) {
// Name template is `<type>[:<subType>]:<id>:<authorId>[:<count>]`.
const parts = name.split( ':' );
return {
type: parts[ 0 ], // "insertion", "formatInline", etc.
subType: parts.length >= 4 ? parts[ 1 ] : null, // hash or null
id: parts.length < 4 ? parts[ 1 ] : parts[ 2 ], // unique id
authorId: parts.length < 4 ? parts[ 2 ] : parts[ 3 ], // user id
count: parts.length == 5 ? parts[ 4 ] : null // additional unique id
};
}
Again, marker count is an additional unique id for multi-range suggestions. If count is in the suggestion name, a subType is present as well. The opposite is not true.
Keep in mind that the marker name in the editor also contains "group prefix" (for example, marker name for comment would be comment:ek45nd:ep0v not just ek45nd:ep0v.
If this feels too complex, you can always get ids of existing suggestions from the editor and match them with parsed marker names. See track changes API and comments API. This might be even a more future-proof solution.
Other than new tag names, we also introduced marker attributes. These are used if a marker boundary was in a place where text nodes are disallowed. Quoting from the previously linked API docs, we have:
data-[group]-start-before="[name]", for example: data-comment-start-before="commentId",data-[group]-start-after="[name]", for example: data-comment-start-after="commentId",data-[group]-end-before="[name]", for example: data-comment-end-before="commentId",data-[group]-end-after="[name]", for example: data-comment-end-after="commentd".For example, the following HTML in the new format:
<td data-suggestion-end-after="formatInline:i4u3xm:e8ghd7:e390dk:3erh" data-suggestion-start-before="formatInline:i4u3xm:e8ghd7:e390dk:3erh">Foo</td>
Is equivalent to the following HTML in the old format:
<suggestion id="e8ghd7:e390dk" suggestion-type="formatInline:i4u3xm" type="start" count="3erh"></suggestion>
<td>Foo</td>
<suggestion id="e8ghd7:e390dk" suggestion-type="formatInline:i4u3xm" type="end" count="3erh"></suggestion>
start-before and end-after attributes are preferred, however, start-after and end-before are used when the former cannot be used (for example, a collapsed marker at the end of an element).
If there are multiple markers start/ending before/after given element, their names put one after another in marker attribute: data-[group]-start-before="[name],[name2],[name3]", for example: data-comment-start-before="commentId,commentId2:countId,commentId3" (in this example, the second comment is a multi-range comment).
Most helpful comment
Instruction for anyone who is processing the editor data.
First, please get familiar with the description of marker-to-data downcast, where we described what kind of HTML may be generated.
If you processed suggestions or comments you have two possibilities:
Below are notes for comments. Suggestions are similar.
Earlier, you have been looking only for
<comment>tags, in whichidattribute kept the comment id (and sometimes "count" part, which was recently added with multi-cell comments). Also, there wastypeattribute to know whether the tag points to the beginning or to the end of the marker (however, start tag was always before end tag, so there might be no need to checktypeattribute).Also, collapsed markers did not have closing tag, however, in case of comments or suggestions, collapsed markers did not exist (collapsed comment or suggestion was removed).
Now,
typeattribute got "included" in the tag name, so we have<comment-start>tag and<comment-end>tag. Also, collapsed markers have both tags now. Other than that,idattribute is no longer available, instead, whole marker name (except ofcomment:prefix) is set innameattribute. For comments, id and marker name were similar, for suggestions the suggestion marker name is more complex.Earlier:
<comment id="commentId:count" type="start">Now:
<comment-start name="commentId:count">For suggestions:
Earlier:
<suggestion id="e8ghd7:e390dk" suggestion-type="formatInline:i4u3xm" type="end" count="3erh">Now:
<suggestion-end name="formatInline:i4u3xm:e8ghd7:e390dk:3erh">Since marker name for suggestions is complex, below is a function that breaks down marker name for suggestions into meaningful parts:
Again, marker count is an additional unique id for multi-range suggestions. If
countis in the suggestion name, asubTypeis present as well. The opposite is not true.Keep in mind that the marker name in the editor also contains "group prefix" (for example, marker name for comment would be
comment:ek45nd:ep0vnot justek45nd:ep0v.If this feels too complex, you can always get ids of existing suggestions from the editor and match them with parsed marker names. See track changes API and comments API. This might be even a more future-proof solution.
Other than new tag names, we also introduced marker attributes. These are used if a marker boundary was in a place where text nodes are disallowed. Quoting from the previously linked API docs, we have:
data-[group]-start-before="[name]", for example:data-comment-start-before="commentId",data-[group]-start-after="[name]", for example:data-comment-start-after="commentId",data-[group]-end-before="[name]", for example:data-comment-end-before="commentId",data-[group]-end-after="[name]", for example:data-comment-end-after="commentd".For example, the following HTML in the new format:
Is equivalent to the following HTML in the old format:
start-beforeandend-afterattributes are preferred, however,start-afterandend-beforeare used when the former cannot be used (for example, a collapsed marker at the end of an element).If there are multiple markers start/ending before/after given element, their names put one after another in marker attribute:
data-[group]-start-before="[name],[name2],[name3]", for example:data-comment-start-before="commentId,commentId2:countId,commentId3"(in this example, the second comment is a multi-range comment).