Ckeditor5: Japanese input (IME) fails at the first character in a paragraph

Created on 29 Dec 2017  ยท  29Comments  ยท  Source: ckeditor/ckeditor5

๐Ÿž Is this a bug report or feature request? (choose one)

  • Bug report

๐Ÿ’ป Version of CKEditor

ver1.0.0-alpha.2, classic, balloon and inline build

๐Ÿ“‹ Steps to reproduce

In macOS environment:

  1. Add Japanese Input(open โ€œKeyboard Preferencesโ€ from โ€œLanguage & Regionโ€ panel)
  2. Open https://ckeditor.com/ckeditor-5-builds/#classic
  3. Focus the cursor at the first character of a edit area(left and top).
  4. Type "moji" in Japanse mode.

โœ… Expected result

Editor displays "ใ‚‚ใ˜"

โŽ Actual result

Editor displays "mใŠใ˜"

๐Ÿ“ƒ Other details that might be useful

In Japanese, we use special input method for inputting our language. Other asian region like Chinese or Korea also uses their special methods.

Typically, we Japanese type some keystrokes which express Japanese pronunciation, then press space key some times to convert these keystrokes into proper Japanese characters.

bug

Most helpful comment

I tested v11.0.1 with some Japanese IME on macOS 10.13.6.
The problem I stated above didn't occur.

  • macOS Standard Japanese Input: ๐Ÿ™†
  • Google IME: ๐Ÿ™†
  • Aqua SKK(my favorite): ๐Ÿ™†

Excellent!

All 29 comments

I've found more accurate steps. When a paragraph includes only a <br> like...

<p>
  <br data-cke-filler="true">
</p>

In that paragraph, Japanese input doesn't work properly. The condition I mentioned before is not enough to reproduce.

Thanks for a detailed bug report. We're aware of problems with text composition and spent some time in the past researching this topic. Unfortunately, we haven't finished https://github.com/ckeditor/ckeditor5-typing/issues/4 and now it will need to wait.

My guess here is that we remove the bogus br after the first character is inserted which may break the composition. The solution is the same as in other cases โ€“ to not render when composition takes place. But we need to figure out whether this is really a DOM structure change what breaks composition or just selection re-rendering. Perhaps it's the latter in which case we could remove the bogus br normally and only prevent selection rendering (as in https://github.com/ckeditor/ckeditor5-engine/pull/861).

Thanks you for quick response. I'll watch https://github.com/ckeditor/ckeditor5-typing/issues/4 and if something I can help you about the composition problem, I'll willingly take effort.

OS: macOS 10.13.2
Browser: Chrome Version 63.0.3239.132 (Official Build) (64-bit)
CKEditor 5: 1.0.0-alpha.2

I made a quick investigation.

But we need to figure out whether this is really a DOM structure change what breaks composition or just selection re-rendering.

I found this is just selection re-rendering because _domSelectionNeedsUpdate returns false.

https://github.com/ckeditor/ckeditor5-engine/blob/v1.0.0-alpha.2/src/view/renderer.js#L611

And why is it? The answer is two selections have different props.

  • this.selection.anchor has Position {parent: Text, offset: 1}
  • oldViewSelection.anchor has Position {parent: Text, offset: 0}

I don't know why the two differs.Do you have any ideas about this?

By the way, DOM Selection seems to show different behavior whether you access the HTML file via file:// or http://. I consumed few hours because of that.

By the way, DOM Selector seems to show different behavior whether you access the HTML file via file:// or http://. I consumed few hours because of that.

Yep... Working with local files is not a good idea. Random things may work slightly differently, sometimes for no good reasons.

I don't know why the two differs.Do you have any ideas about this?

It's hard to tell because:

  • I'd need to understand when exactly this happens.
  • We'd need to make sure that you're checking this in a safe, unintrusive way.

Regarding "when exactly", does it happen right after typing the "m" letter? If so, then indeed one would expect that the DOM selection be after the typed "m" letter so on offset 1. But, if we touched the DOM first (e.g. we remove the <br>), we might've broken the selection by this and it might've been reset to the beginning of the block. Also, we have some problems with re-rendering text nodes. We should use the existing text node ("m") in this case but perhaps we remove it and create our own. This will reset the DOM selection to position 0 too.

So you'd need to check these things. What exactly happens here: https://github.com/ckeditor/ckeditor5-engine/blob/v1.0.0-alpha.2/src/view/renderer.js#L198-L212

And regarding checking/debugging stuff in a safe way โ€“ a couple of tips:

  • don't use the debugger โ€“ the debugger ,
  • use console logs (they are least obtrusive, "least" because of Edge, where even just opening the console changes the browser behaviour... SIC!),
  • log primitive or safe values, not whole objects โ€“ object change by reference so after logging them out they may be changed and you'll see the updated value in the console (popular mistake); so, e.g. log el.outerHTML instead of the el itself. Don't log whole positions. Log position.parent and position.offset. Etc.

Wow. That's surprising! I'll use JSON.stringify from today.

Okay, I made a more precise exploration.

patch

https://github.com/tai2/ckeditor5-engine/commit/348c5d261696a840c32745a7625b7194e39dd797

steps

  1. Set IME to Hiragana
  2. Put cursor to an empty edit area
  3. Put keystrokes: "mo"

Expected result: "ใ‚‚"
Actual result: "mใŠ"

logs

keyobserver.js:28 onDomEvent keyCode=229 altKey=false ctrlKey=false shiftKey=false
renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren [{"name":"p","_attrs":{},"_children":[{"_data":"m"}],"_classes":{},"_styles":{},"_customProperties":{}}]
renderer.js:650 this.selection.anchor {"parent":{"_data":"m"},"offset":1}
renderer.js:651 oldViewSelection.anchor {"parent":{"_data":"m"},"offset":0}
renderer.js:621 DOM selection needs updating!
renderer.js:246 ---- exit render ----
renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren []
renderer.js:650 this.selection.anchor {"parent":{"_data":"m"},"offset":1}
renderer.js:651 oldViewSelection.anchor {"parent":{"_data":"m"},"offset":1}
renderer.js:246 ---- exit render ----
keyobserver.js:28 onDomEvent keyCode=229 altKey=false ctrlKey=false shiftKey=false
renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts [{"_data":"mใŠ"}]
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren [{"name":"p","_attrs":{},"_children":[{"_data":"mใŠ"}],"_classes":{},"_styles":{},"_customProperties":{}}]
renderer.js:650 this.selection.anchor {"parent":{"_data":"mใŠ"},"offset":1}
renderer.js:651 oldViewSelection.anchor {"parent":{"_data":"mใŠ"},"offset":1}
renderer.js:246 ---- exit render ----
renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren []
renderer.js:650 this.selection.anchor {"parent":{"_data":"mใŠ"},"offset":1}
renderer.js:651 oldViewSelection.anchor {"parent":{"_data":"mใŠ"},"offset":1}
renderer.js:246 ---- exit render ----
renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren []
renderer.js:650 this.selection.anchor {"parent":{"_data":"mใŠ"},"offset":2}
renderer.js:651 oldViewSelection.anchor {"parent":{"_data":"mใŠ"},"offset":2}
renderer.js:246 ---- exit render ----
keyobserver.js:28 onDomEvent keyCode=77 altKey=false ctrlKey=false shiftKey=false
keyobserver.js:28 onDomEvent keyCode=79 altKey=false ctrlKey=false shiftKey=false

Note:

keyCode 77 = KeyM
keyCode 79 = KeyO
keyCode 229 = ?? I don't know

What I did is really just to push 'm' then 'o'.

For more concrete understanding, I made a screen capture.
http://tai2.net/misc/moji.mov

You might need to handle key code 229 properly.

see: https://github.com/select2/select2/issues/2482#issuecomment-46783362

It's not about handling 229 button. We don't handle buttons because typing is about more than keydowns. In simple cases (American) there's indeed 1:1 mapping between keys and inserted letters, but in most other locations this gets tricky. E.g. in Polish Alt+A will insert "ฤ…", in Spanish pressing ' and then A will insert "รก" and different quotes characters will be inserted. And so on. The extreme here is the IME where the actual text, as you know, is being composed for a longer time, you have the entire widget with suggestions, ability to move left/right, change words, etc.

Therefore, since information about keys is not sufficient, we let the browser insert text and, based on mutations, discover what was inserted into the editor.

Using this information, we update the model. We need to apply there minimal changes through operations. So e.g. When the user has such a selection: <p>x[y]z</p> and presses f we apply to operations:

  • delete 1 character at offset 1,
  • insert character "f" at offset 1.

Then, we convert these changes to the view. The view is still in the <p>x[y]z</p> state. So we update it to <p>xf[]z</p>.

Finally, we check whether we need to render anything. Since the user typed letter "f" and the browser inserted it into the DOM (which resulted in mutations from which we started), the DOM also contains <p>xf[]z</p> already, so we don't have to touch it (neither the structure nor the selection). We touch the DOM only if needed.

Thanks to the fact that we don't touch the DOM if not needed we don't break the IME. At least โ€“ we shouldn't if everything worked fine.

E.g., if typing "f" should start a composition, then if we would either change the selection or the DOM, even if the final structure and selection would be identical, the IME would be reset.

I'll try to analyse your logs later on today.

Thanks. Please tell me if you need more detailed logs.

OK, this bit is enough:

keyobserver.js:28 onDomEvent keyCode=229 altKey=false ctrlKey=false shiftKey=false
renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren [{"name":"p","_attrs":{},"_children":[{"_data":"m"}],"_classes":{},"_styles":{},"_customProperties":{}}]
renderer.js:650 this.selection.anchor {"parent":{"_data":"m"},"offset":1}
renderer.js:651 oldViewSelection.anchor {"parent":{"_data":"m"},"offset":0}
renderer.js:621 DOM selection needs updating!
renderer.js:246 ---- exit render ----

The first rendering touches <p> children. We've got there:

  • in the view: <p>m</p>
  • in the DOM (I guess): <p>m<br></p>

Since <p> is not empty anymore, DOM converter will convert the <p>m</p> view structure to <p>m</p> DOM structure so we get this (in _updateChildren()):

  • expected: <p>m</p>,
  • actual: <p>m<br></p>

So, the _updateChildren() method removes that <br>. It may also (unfortunately) remove the original "m" text node and insert another one.

One of these (removing the <br> or replacing the "m" text node resets the selection to:

<p>[]m</p>

This is corrected right after because the expected selection offset is 1, and actual is 0. This breaks the composition.

===

What you could check next is what exactly happens in _updateChildren(). Is the "m" text node touched or not?

Something differs. It doesn't seem to be touched.

To make the situation clearer, I bring the logs after browser reload.

patch

https://github.com/tai2/ckeditor5-engine/commit/ebb906014665114302f16096536480b6a4c17a55

Reload browser

renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren [{"name":"div","_attrs":{},"_children":[],"_classes":{},"_styles":{},"_customProperties":{},"isReadOnly":false,"isFocused":false}]
renderer.js:495 _updateChildren [] ["<br data-cke-filler="true">"] ["insert"]
renderer.js:246 ---- exit render ----
renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren [{"name":"div","_attrs":{},"_children":[{"name":"p","_attrs":{},"_children":[],"_classes":{},"_styles":{},"_customProperties":{}}],"_classes":{},"_styles":{},"_customProperties":{},"isReadOnly":false,"isFocused":false}]
renderer.js:495 _updateChildren ["<br data-cke-filler="true">"] ["<p><br data-cke-filler="true"></p>"] ["insert","delete"]
renderer.js:246 ---- exit render ----

Focus to editor

renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren []
renderer.js:656 this.selection.anchor {"parent":{"name":"p","_attrs":{},"_children":[],"_classes":{},"_styles":{},"_customProperties":{}},"offset":0}
renderer.js:657 oldViewSelection.anchor {"parent":{"name":"p","_attrs":{},"_children":[],"_classes":{},"_styles":{},"_customProperties":{}},"offset":0}
renderer.js:246 ---- exit render ----

Press m

keyobserver.js:28 onDomEvent keyCode=229 altKey=false ctrlKey=false shiftKey=false
renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren [{"name":"p","_attrs":{},"_children":[{"_data":"m"}],"_classes":{},"_styles":{},"_customProperties":{}}]
renderer.js:495 _updateChildren ["m"] ["m"] ["equal"]
renderer.js:656 this.selection.anchor {"parent":{"_data":"m"},"offset":1}
renderer.js:657 oldViewSelection.anchor {"parent":{"_data":"m"},"offset":0}
renderer.js:627 DOM selection needs updating!
renderer.js:246 ---- exit render ----
renderer.js:178 ---- enter render ----
renderer.js:200 this.markedTexts []
renderer.js:201 this.markedAttributes []
renderer.js:202 this.markedChildren []
renderer.js:656 this.selection.anchor {"parent":{"_data":"m"},"offset":1}
renderer.js:657 oldViewSelection.anchor {"parent":{"_data":"m"},"offset":1}
renderer.js:246 ---- exit render ----
keyobserver.js:28 onDomEvent keyCode=77 altKey=false ctrlKey=false shiftKey=false

There is a difference on DOM selection between English and Japanese mode.

You can see the difference here: https://codepen.io/tai2/pen/qxdwMX

After press m key:

English Input Mode

screen shot 2018-02-01 at 17 41 38

<div>m[]</div>

anchor offset: 1
focus offset: 1

Japanse Input Mode

screen shot 2018-02-01 at 17 41 50

Please note the underline indicating selection range

<div>[m]</div>

anchor offset: 0
focus offset: 1

This difference results in DOM selection retouching in Japanese environment.

I dug the CKEditor5 code deeply and realized why this retouching is occurred.
I will summarize the result in the next post.

Summary

When users press m in Japanese IME, internal model becomes <p>m[]</p> but actual DOM state becomes <p>[m]</p>.
So DOM selection state is retouched. It interrupts IME conversion.

What happens

initial model: <p>[]</p>
initial DOM: <p>[]</p>

  1. A user presses m. DOM state is changed to <p>[m]</p> which means anchorOffset is 0 and focusOffset is 1.
  2. MutationObserver._onMutations is invoked.
  3. 'mutations' event is fired.
  4. 'mutation' invokes typing plugin then Input.handle method is invoked. The latter branch is executed because it has a single text node change.
  5. _handleTextNodeInsertion is executed because mutation.type is 'children'.
  6. It fires 'input' event.
  7. It invokes InputCommand.execute with text and range properties specified.
    On this context, range is collapsed because it's constructed with single argument.
  8. It invokes model/Selection.setCollapsedAt which sets Renderer's this.selection.anchor.offset and this.selection.anchor.offset to 1 both.
    Model state is now <p>m[]</p>.
  9. Mutations triggers a rendering. In rendering process, _domSelectionNeedsUpdate() returns true. because of the model and DOM comparison.
  10. Rendering results in DOM retaching. DOM state is now <p>m[]</p>. (If you commented out this line, IME composition won't fail.)
  11. IME composition is finished forcefully.

Extra

I think, to fix this problem, MutationHandler. _handleTextNodeInsertion needs to take viewSelection at the second argument but currently it only uses mutation.

If you decided to pass viewSelection to it, another problem was here.

  1. in _onMutations, it calls DomConverter.domPositionToView where anchorNode is m, anchorOffset is 0, focusNode is m and focusOffset is 1.
    Model state is <p></p> so far because 'mutations' event is not fired yet.
  2. DomConverter.findCorrespondingViewText is invoked because domParent is a TextNode('m').
  3. It uses the parent node because sibling doesn't exist. viewElement.getChild(0) returns undefined because of the model state. So findCorrespondingViewText returns null then domPositionToView also returns null.
  4. So viewSelection stays initial state null. This is because model state is not updated yet at this point. But to execute 'mutations' needs viewSelection as well. It looks cyclic dependency to me.

I hope this report helps you fix the problem.

Nice analysis of this issue @tai2!

One strange thing I noticed is that I'm not able to reproduce this issue in the exact same manner. While typing moji using Hiragana keyboard inside empty heading it seems to work fine (see gif below). However, if I additionally add empty strong element there (by using bold command) it behaves the same as described in this issue:
mar-02-2018 15-49-22

The thing I noticed while watching attached http://tai2.net/misc/moji.mov is that mine IME dialog looks quite different and I wonder if that may be the case (some slightly different behaviour maybe):
screen shot 2018-03-02 at 15 49 41

screen shot 2018-03-02 at 15 56 13

Also the fact which @tai2 mentioned that blocking selection resolves the issue:

  1. Rendering results in DOM retaching. DOM state is now <p>m[]</p>. (If you commented out this line, IME composition won't fail.)

does not solve the issue I my case, which may mean that these both are separate issues. In my case, blocking children rerendering during composition solves it.

Anyway, if selection is the case in this one, it should be solved with https://github.com/ckeditor/ckeditor5-engine/issues/460 (if not there are also tickets for other cases like elements rerendering - see https://github.com/ckeditor/ckeditor5/issues/802#issuecomment-369197598).

@f1ames Apple introduced completely different Japanese IME on 10.10 Yosemite. If you use 10.9 or former, it might behave different. I tried Google Japanese IME Hiragana mode and checked the problem is reproduced as well.

I noticed the video, which was made by one of colleagues, was captured in Google Japanese environment.

The situation I described above occurs only on Chrome.
On Safari, the situation looks same to the one @f1ames described.
And on Firefox, it looks alike but a bit different.

Thanks @tai2, I will take a look at Google Japanese IME. Btw. could you share any insight why some people prefer Google IME over native one? Is it more convenient, have more functionality or is it just a matter of pure preferences?

As for software version I tested on:
OS: 10.11.6
Browser: Chrome Version 64.0.3282.167 (Official Build) (64-bit)
CKEditor 5: Latest master (the closest release is 1.0.0-alpha.2)

Btw. could you share any insight why some people prefer Google IME over native one?

Because it provides cleverer word suggestions than others based on their machine learning.

Hello @tai2,
sorry for the delay, but we were working on rendering improvements (https://github.com/ckeditor/ckeditor5-engine/pull/1424 and https://github.com/ckeditor/ckeditor5-engine/pull/1407) which addresses some IME issues.
The issue I was able to reproduce looking at your description is fixed now, but as it is not exactly the same as the one you reported I would like to ask you to try reproduce the original one on the latest version (which, by the way, was released yesterday ๐ŸŽ‰ ).

I tested v11.0.1 with some Japanese IME on macOS 10.13.6.
The problem I stated above didn't occur.

  • macOS Standard Japanese Input: ๐Ÿ™†
  • Google IME: ๐Ÿ™†
  • Aqua SKK(my favorite): ๐Ÿ™†

Excellent!

@tai2 thank you for rechecking this issue. Great news! ๐Ÿ‘

2019-11-15 16 47 54

the bug is still there.

Safari Version 13.0.3 (14608.3.10.10.1)
macOS Mojave 10.14.6 (18G95)

It may be a regression. Could you report a new ticket?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Reinmar picture Reinmar  ยท  3Comments

msamsel picture msamsel  ยท  3Comments

devaptas picture devaptas  ยท  3Comments

jodator picture jodator  ยท  3Comments

oleq picture oleq  ยท  3Comments