Ckeditor5: Handling variations of font-family notations and partial matches

Created on 16 Jan 2018  Â·  12Comments  Â·  Source: ckeditor/ckeditor5

When working on the initial implementation of the font family feature @jodator implemented a regexp-based solution which made the editor accept content like:

<span style="font-family: &quot;Ba ba&quot;, 'Bu bu'">
<span style='font-family: "Bo bo"'>

Such content cannot be produced by the editor but it may pasted from outside of the editor.

During the review we decided to ignore such cases for now. The reasoning is that the editor should only be able to handle data which it produced for now and the whole spectrum of mismatched font name values is much wider:

(source: https://github.com/ckeditor/ckeditor5-font/pull/5#issuecomment-357954222)

We only need to care about correctly loading data created with the editor. There's much more than quotes than can go wrong here and it's not yet time to worry about these situations. Some examples:

  • font was used instead of font-family,
  • some characters were encoded in font-family's value,
  • there's only a partial match for a font family name – e.g. only the first name was used.

All these cases may happen with a pasted content and a complete solution for them would be extremely complex. We've seen that in CKEditor 4 already where a bit too aggressive normalisation was actually breaking stuff rather than helping.

We need to understand which cases we should handle. Also, the "how" and "where" questions need to be answered.

font discussion 2 improvement

Most helpful comment

I just wasted like 15 minutes trying to figure out what I'm doing wrong that the font styles aren't preserved. I thought that maybe the nested spans are a problem or maybe I made a typo or sth. Then finally I found this issue.

The problem with unsupporting simple scenarios such as inline styles with font families without a space character after each comma is painful. It makes it impossible to reuse content created with CKEditor 4 using CKEditor 5:

<p><span style="font-family:Arial,Helvetica,sans-serif">dfgdfg</span></p>

It was pretty hard to understand WTF considering that CKEditor 5 allows you to configure styles without a comma

fontFamily: {
    options: [
        'default',
        'Arial,Helvetica,sans-serif',
        'Courier New,Courier,monospace'
    ]
},

Supporting the syntax that CKEditor 4 produced is a must have IMO. In other words, we don't have to solve all possible variations at once (e.g. try to resolve partial matching), let's just try to cover CKEditor 4 backwards compatibility first.

All 12 comments

there's only a partial match for a font family name – e.g. only the first name was used.

As for this I think that this is a case in which we have to decide if we want such behavior since it might be (or might be not) different values. Example:

<span style="font-family:Arial, sans-serif">
<span style="font-family:Arial">
<!-- OR -->
<span style="font-family:Helvetica, Arial, sans-serif">
<span style="font-family:Arial,sans-serif">
<!-- OR -->
<span style="font-family:Helvetica, Arial, sans-serif">
<span style="font-family:Arial, Helvetica, sans-serif">

All above pairs looks for me as different font-family settings.

<span style="font-family:Abril Fatface"> 
<span style="font-family:'Abril Fatface' ">
 <span style="font-family:'abril fatface' ">
 <span style="font-family:abril fatface">

The above 4 are also same. They should work as well.

On Gitter you wrote:

@Reinmar : Why Font plugin adding single quote in name of font family if I have space in it? For example Abril Fatface => 'Abril Fatface'

That's because the following is not correct AFAIR:

<span style="font-family:Abril Fatface">
<span style="font-family:abril fatface">

Font names must be wrapped with quotes if they contain spaces. Am I right @jodator? I think you've checked this in the spec in the past?

@Reinmar : Thanks for reply.

Then should the font plugin convert <span style="font-family:abril fatface"> into <span style="font-family:'abril fatface'"> or it should just remove it as this is not in the list of fonts.

@Reinmar yes, the font name that is multi-word that contains invalid characters must be wrapped in quoutes.

taken from MDN:

font-family: Gill Sans Extrabold, sans-serif;
font-family: "Goudy Bookletter 1911", sans-serif;

and from specs:

Font family names other than generic families must either be given quoted as strings, or unquoted as a sequence of one or more identifiers. This means most punctuation characters and digits at the start of each token must be escaped in unquoted font family names.

so probably we do overlooked something here as I remembered that quotes must be defined.

@gauravjain028: as for now if it does not match style="font-family:'abril fatface'" it will not be converted.

~@Reinmar we probably could fix that easily. The other question that I cannot find an answer is whether to make those font names case sensitive or not? The chrome looks like is case insensitive.~

Actually I've found that:

User agents must match these names case insensitively, using the "Default Caseless Matching" algorithm outlined in the Unicode specification [UNICODE].

OK, it's clear that we need some additional work in order to properly match font names which weren't generated by CKEditor 5 (e.g. were pasted from other websites). This will help in some cases where content created outside the editor is injected into the editor. However, it's not out main goal at the moment to make the editor work with any HTML created anywhere. Our main goal is to make the editor compatible with itself – if you created some content in the editor you should be able to load it back to the editor.

Also, this shouldn't be a problem for anyone that CKEditor 5 wraps font names in quotes. That's entirely correct and in terms of "data" it represents exactly the thing which the user wanted to do – it applies a specific font name.

I just wasted like 15 minutes trying to figure out what I'm doing wrong that the font styles aren't preserved. I thought that maybe the nested spans are a problem or maybe I made a typo or sth. Then finally I found this issue.

The problem with unsupporting simple scenarios such as inline styles with font families without a space character after each comma is painful. It makes it impossible to reuse content created with CKEditor 4 using CKEditor 5:

<p><span style="font-family:Arial,Helvetica,sans-serif">dfgdfg</span></p>

It was pretty hard to understand WTF considering that CKEditor 5 allows you to configure styles without a comma

fontFamily: {
    options: [
        'default',
        'Arial,Helvetica,sans-serif',
        'Courier New,Courier,monospace'
    ]
},

Supporting the syntax that CKEditor 4 produced is a must have IMO. In other words, we don't have to solve all possible variations at once (e.g. try to resolve partial matching), let's just try to cover CKEditor 4 backwards compatibility first.

Is this going anywhere? It would be great to interpret the basics of font size and type from other sources...

Could you please show where this parsing is done in the code? I would like to see if I can put together a hack that deals with a couple of my specific use cases.

@statler You can check the configuration of conversion for font here (by logging it): https://github.com/ckeditor/ckeditor5-font/blob/master/src/fontsize/fontsizeediting.js#L54. The converter definition is build here: https://github.com/ckeditor/ckeditor5-font/blob/master/src/fontsize/utils.js#L105 (for pixel sizes).

Might be a similar case: https://github.com/ckeditor/ckeditor5/issues/1389#issuecomment-443204557.

So depending on the output of other sources you might need to update how FontSize and FontFamily plugins parse and output content.

For font family it might be a bit harder due to issue mentioned in first comment.

there's only a partial match for a font family name – e.g. only the first name was used.

As for this I think that this is a case in which we have to decide if we want such behavior since it might be (or might be not) different values. Example:

<span style="font-family:Arial, sans-serif">
<span style="font-family:Arial">
<!-- OR -->
<span style="font-family:Helvetica, Arial, sans-serif">
<span style="font-family:Arial,sans-serif">
<!-- OR -->
<span style="font-family:Helvetica, Arial, sans-serif">
<span style="font-family:Arial, Helvetica, sans-serif">

All above pairs looks for me as different font-family settings.

Just to put my opinion, the last two are different, but the first should match, because the main desired font is the same, the others fonts are fallback fonts. At least this is how I see it.

To sum up what we can do here now:

  • No normalization on conversion/data processing. Whatever font-family value is e.g. pasted, it should be converted to exactly the same value in the model. This way, we'll ensure that we'll not break the content (which might happened and actually happened in CKE4)
  • The only point at which we can normalize the font value is when deciding which option in the UI should be marked as active. Here, we can have some naive heuristic.
Was this page helpful?
0 / 5 - 0 ratings