Ckeditor5: Raw HTML embedding

Created on 2 Oct 2020  ·  21Comments  ·  Source: ckeditor/ckeditor5

📝 Provide a description of the new feature

An idea of the feature is the ability to paste any HTML markup to the editor that will be displayed as a widget in the editing area and the raw HTML in the output data.

View:

<p>Paragraph</p>
<div class="ck-widget">
<!-- HTML Markup -->
</div>

editor.getData()

<p>Paragraph</p>
<!-- HTML Markup -->

Most probably, in the view, we will generate a preview that the user will not be able to interact with.


If you'd like to see this feature implemented, add a 👍 reaction to this post.

ux feature

Most helpful comment

UI/UX proposal

I'm for the inline editing strategy, as suggested by @pkwasnik. It sounds simple and if we pull it off this will lay the groundwork and pave the way for the code snippet feature (editing in a black-box inside editor content but beyond engine's reach).

But first, let's see if there are some serious blockers in this approach.

Views

Adding and editing the embed

An afterthought: this widget needs a selection handler (like tables).

Previewing the raw content

UX enhancements related to sanitization

Risks

  • The toolbar icon may be a challenge. We already used both designs that would make sense so we need to come up with some new idea

All 21 comments

I think gathering usage ideas would be awesome. The first thing that came to my mind are:

  • Some HTML snippets taken from other sites. Sometiemes, you can embed a piece of code - similar to MediaEmbedd but you could paste any content. I can think only of YouTube's "embed" in the share menu as an example (much better would be media embed for this) but a similar "embed on your webpage" is also available from other sites.
  • Google AdWords snippets (ad boxes). It _might_ be useful to allow users to insert an ad box inside an article.

This rise a question - should we allow JS in there? If yes - document it as a potential security risk.

This rise a question - should we allow JS in there? If yes - document it as a potential security risk.

The same could happen with iframe. However, the script tag is more dangerous I guess. So the question is, should we filter (somehow) the pasted HTML?

Much sites should be able to be embedded using OEmbed or services like iframely → https://iframely.com/embed/https%3A%2F%2Fblog.comandeer.pl%2F

Additionally nearly every kind of embed uses JS in some form (see the example above), so stripping JS out of the pasted content could make the whole thing pretty useless. At the same time keeping JS can be dangerous…

Notes from my sync with @pomek:

  • Now: Handling <div class="raw-html-embed">.

    • Figure out a good class name.

    • Do we need a <figure>?



      • If someone wants to embed an iframe with a video, then perhaps we could.


      • If a script tag, then definitely not...


      • So, in case of a so low-level feature as this one, it's probably best to leave this to the user (which, in many cases, will be some developer). Let's keep in mind that this is a feature targetted at slightly more advanced users.



  • Future: Handling any element without a wrapper in the data.

    • This could work based on HTML comments or some function-based filter.

  • XSS:

    • Consider using an HTML sanitizer.


    • If we won't be able to use an HTML sanitizer for any reason, then:



      • Strip on* attributes.


      • Strip <script> tags.


      • Secure javascript: links, iframe src, image src, etc.



    • Recommend CSP settings in the feature guide. Explain that this feature allows embedding any HTML and it's not possible to fully ensure that the user didn't paste something unsafe. Therefore, this feature should be used with caution. OTOH, if we'll use a sanitizer and recommend CSP, then maybe we don't need to warn anyone further.

    • Add a config option noPreviews that'd solve this issue fully.

    • Consider adding a note in the embed dialog saying that you should not paste HTML from untrusted sources.

  • Should we use a widget mask to block user interaction?
  • TODO:

    • UI 💎

How do we want to call the new package?

  • ckeditor5-html-embed (similar to ckeditor5-media-embed)
  • ckeditor5-raw-html
  • ckeditor5-html-snippet

I don't know.

ckeditor5-html-embed (similar to ckeditor5-media-embed)

:+1:

Docs: We must mention clearly and visibly that this is a special feature.

Proposition of the UI: after clicking the button, put textarea in the editor right away. There will be two tabs: "Raw HTML" and "Preview" (ofc icons will looks better), so user could edit HTML inside an editor and quickly see preview.

First results of the conversion. It works.

image

However, there are my 3 cents:

  • When upcasting an element, there is no DOM reference to the upcasted element (I wanted to use element.innerHTML). Instead, I used view.stringify() on element's children.

  • While the conversion process, the results of view.stringify() and view.parse() cannot be used in the DowncastWriter.insert() function because the document fragment (returned by the parse() function) contains view.Elements that cannot be inserted to other elements.

  • RawElement cannot be passed as an argument to toWidget() function so I needed to wrap it in additional container.

  • Since the above, I used the container element for marking the editing data as a widget, and the raw element for embedding the HTML. https://github.com/ckeditor/ckeditor5/blob/eebe876cb59d00738cd6293cc7293afaca0414e2/packages/ckeditor5-html-embed/src/htmlembedediting.js#L72-L87

I guess the concept of the UI is required in order to continue work on the feature.

A question. sanitize-html allows customizing its behavior by specifying the options object as the second argument.

We can provide a default configuration that the end-users could modify in order to match their needs.

The question is – should we?

The question is – should we?

That depends on whether you see a use case for that. And it's easier to start with no option, that's for sure. Plus, we will be able to change the sanitizer with time if it's our internal thing.

  • Since the above, I used the container element for marking the editing data as a widget, and the raw element for embedding the HTML.

:+1:

  • When upcasting an element, there is no DOM reference to the upcasted element (I wanted to use element.innerHTML). Instead, I used view.stringify() on element's children.

You can use an instance of HtmlDataProcessor to turn a view structure into a string.

A question. sanitize-html allows customizing its behavior by specifying the options object as the second argument.

We discussed this f2f and the problem is that the default options of sanitize-html strip, among others, iframes and inline styles.

It is highly configurable, though. Same for DOMPurify – it also strips iframes by default, although keeps inline styles.

In our case, stripping inline styles, iframes, video elements, etc. would make the preview feature quite useless. Therefore, we'd need to loosen the default settings. However, if we change anything we risk allowing for a bit too much. So, we need to be very careful with that and test the final solution ourselves.

Things that should be previewable to make the preview useful:

  • iframe, video, audio elements
  • most HTML elements (via a whitelist)

    • including: spans, divs, section, article, aside, tables, images, etc. – everything that's a reasonable content

    • excluding: script, style, link, etc.

  • SVG elements?
  • inline styles
  • what else?

Things that should not be allowed (other than obvious things):

  • the style tag, all things that would normally be used in <head>,
  • what else?

Let's try to configure sanitize-html and DOMPurify to these requirements and test both with payloads from some XSS database.

Should we accept the controls attribute for video/audio tags?

image

What should we do with form and input elements?

<form action="/my-handling-form-page" method="post">
 <ul>
  <li>
    <label for="name">Name:</label>
    <input type="text" id="name" name="user_name">
  </li>
  <li>
    <label for="mail">E-mail:</label>
    <input type="email" id="mail" name="user_email">
  </li>
  <li>
    <label for="msg">Message:</label>
    <textarea id="msg" name="user_message"></textarea>
  </li>
 </ul>
</form>

ATM, I would prefer to do the same that the sanitizer does (remove):

image

UI/UX proposal

I'm for the inline editing strategy, as suggested by @pkwasnik. It sounds simple and if we pull it off this will lay the groundwork and pave the way for the code snippet feature (editing in a black-box inside editor content but beyond engine's reach).

But first, let's see if there are some serious blockers in this approach.

Views

Adding and editing the embed

An afterthought: this widget needs a selection handler (like tables).

Previewing the raw content

UX enhancements related to sanitization

Risks

  • The toolbar icon may be a challenge. We already used both designs that would make sense so we need to come up with some new idea

I guess we need to have this issue about excluding DOM events in widgets resolved to avoid surprises in the <textarea>.

  • Does it address issues like using Ctrl+A in <textarea>?

@psmyrek, did you test how a textarea works within the editor when working on #4600?

  • The toolbar icon may be a challenge. We already used both designs that would make sense so we need to come up with some new idea

This feature is targetted at more advanced users, so using the word "HTML" would be fine. Would it somehow fit in the icon?

so using the word "HTML" would be fine

I'm not so sure about this. They could use PHP there. Or virtually anything for that matter (e.g. markdown?). So using "HTML" is not the best decision IMO (not very future-proof) although people may get a general idea.

Would it somehow fit in the icon?

Yes, but this is going to be ugly.

In order to have a textarea and the toggle button inside the widget, we must use changes from this PR – https://github.com/ckeditor/ckeditor5/pull/8243.

I prepared final designs for the UI of the feature. Please let me know if anything is unclear.

Textarea blurred

Textarea focused

Content preview

Spacings

Was this page helpful?
0 / 5 - 0 ratings

Related issues

jodator picture jodator  ·  3Comments

benjismith picture benjismith  ·  3Comments

MCMicS picture MCMicS  ·  3Comments

pjasiun picture pjasiun  ·  3Comments

Reinmar picture Reinmar  ·  3Comments