Ckeditor5: Handle image pasted/dropped from various sources (external websites, base64-encoded, etc.)

Created on 20 Jun 2018  路  12Comments  路  Source: ckeditor/ckeditor5

When copy-pasting a local image into the editor, it will be uploaded as expected.

However, if you copy an image from another website and paste it into the editor, it will use the previous url as the source, rather than uploading it.

Is this the intended behaviour? This can sometimes cause issues with regards to hotlinking, dead links, security, etc.

image feature

Most helpful comment

Interestingly, for issue https://github.com/ckeditor/ckeditor5-upload/issues/5 , the upload works as expected with current version of ckeditor5 (at least on Windows Chrome/Edge from MSPaint).

For my use case at least, I wish to have all images uploaded because:

  • They will need to be synced to other devices and accessible offline
  • Images from free hosting providers often go dead, hit usage limits, prevent hotlinking, etc
  • Including images from non-secure websites shows warnings
  • Data-uri images slow down API calls, and sometimes fail to save if HTML output is too big to fit in database.

Most cases are handled by the current functionality (+ my PR). I will probably filter out any other cases with userland code.

For configuration, maybe it would be useful to let users provide a function, as it could get fairly complex with flags for every use case. For example:

import upload // promise, takes fileData, resolves to upload dest
import uploadFromUrl // promise, takes url resolves to upload dest
import convertDataURI // takes datauri, converts to blob data

// config - upload everything
function handleImage(type, data, url) {
  switch (type) {
    case 'file':
      return upload(data);
    case 'data-img':
      return data ? upload(data) : upload(convertDataURI(url));
    case 'url':
      if (url.match(/^https:\/\/mydomain.com/) return url;
      else return uploadFromUrl(url);
    case 'webkit-fake-image':
      throw new Error('Image upload not supported in Safari.')
  }
}

// config - no uploads, only include small data-img
function handleImage(type, data, url) {
  switch (type) {
    case 'file':
      return null;
    case 'data-img':
      if (url.length < 10000) return url;
      else return null;
    case 'url':
      return url;
  }
}

All 12 comments

There are multiple possible scenarios what we may get when someone pastes/drops an image from another website or application. It will depend on how someone copied that content and on the browser. We may get:

  • HTML with base64-encoded images,
  • HTML with images which src attributes point to an external website,
  • HTML with images which src are references to files in the clipboard's dataTransfer.files,
  • just images (as dataTransfer.files),
  • and I'm sure I missed something.

I haven't checked what browsers really do, so these are all theoretical scenarios. I expect to see most of them in real life :D

Actually, you even described this in the PR:

However, they will still not be triggered when drag + dropping an image from a website. This is because the drag event doesn't include file data, but text/uri-list data, which includes the image url(s). Similarly, images won't be uploaded if selecting and copying+dragging/pasting a bunch of html.

In these instances, due to potential CORS issues, you would need to have a server endpoint that downloads and saves images directly from these external urls.

Additionally, it may also be worthwhile having a config option to validate img src, to only insert them when the url matches certain patterns. This could be used to prevent re-uploading previously uploaded content, and restrict hotlinking from other websites.

So, this gets complex. We need configuration (this needs to be opt-in behaviour and we need src validation) and we'll need an endpoint to retrieve external images. Plus, we need to check what we really get from the browsers.

The answer, therefore, is that we skipped those cases for now. We planned to work on this in some, undefined future.

For now, I'm also unsure whether all these scenarios should be handled by one plugin or whether we should split this. There's a risk that this plugin will grow significantly. We should review which potential input options can be handled in similar way to what we do so far. We also need to define configuration in a way which forsee the other input options which may be implemented later.

Interestingly, for issue https://github.com/ckeditor/ckeditor5-upload/issues/5 , the upload works as expected with current version of ckeditor5 (at least on Windows Chrome/Edge from MSPaint).

For my use case at least, I wish to have all images uploaded because:

  • They will need to be synced to other devices and accessible offline
  • Images from free hosting providers often go dead, hit usage limits, prevent hotlinking, etc
  • Including images from non-secure websites shows warnings
  • Data-uri images slow down API calls, and sometimes fail to save if HTML output is too big to fit in database.

Most cases are handled by the current functionality (+ my PR). I will probably filter out any other cases with userland code.

For configuration, maybe it would be useful to let users provide a function, as it could get fairly complex with flags for every use case. For example:

import upload // promise, takes fileData, resolves to upload dest
import uploadFromUrl // promise, takes url resolves to upload dest
import convertDataURI // takes datauri, converts to blob data

// config - upload everything
function handleImage(type, data, url) {
  switch (type) {
    case 'file':
      return upload(data);
    case 'data-img':
      return data ? upload(data) : upload(convertDataURI(url));
    case 'url':
      if (url.match(/^https:\/\/mydomain.com/) return url;
      else return uploadFromUrl(url);
    case 'webkit-fake-image':
      throw new Error('Image upload not supported in Safari.')
  }
}

// config - no uploads, only include small data-img
function handleImage(type, data, url) {
  switch (type) {
    case 'file':
      return null;
    case 'data-img':
      if (url.length < 10000) return url;
      else return null;
    case 'url':
      return url;
  }
}

cc @pjasiun @wwalc

For configuration, maybe it would be useful to let users provide a function, as it could get fairly complex with flags for every use case. For example:

It'd actually need to be simpler because we can only expose one upload() function and should work with File instances, URLs and base64. That's because all these uploads need to go through our FileRepository.

So, a shouldUpload( type, data, url ) -> {Boolean} function should be enough. However, we may also need to extend UploadAdapter so it has a method for uploading files from external websites.


What worries me still is handling all of these in the clipboardInput's listener which you tried to patch. Without analysing all the possible input scenarios I'm not able to tell now how to handle them and whether we don't need something more.

Since a complete solution to this ticket is a longer-term project, I don't want to block https://github.com/ckeditor/ckeditor5-image/pull/212 with these discussions. However, it still requires some work to be merged.

For sure what we have now is an MVP for the image upload, there is much more to be handled.

4 years ago I did a research for CKEditor 4 and I learn that that paste and drop handling is very complex. You can:

  • drop file,
  • paste file,
  • paste image (for instance from MS Paint),
  • drop HTML with image,
  • paste HTML with image,
  • drop HTML image (only image without anything else),
  • paste content from MS Word with an image.

Each browser has (had) some quirks, and you can get a very strange content. It was very often that there were not common patterns, for instance, the same case, on the same browser may work differently depending on the action you did (drop gives you different results than paste), on the OS (different format on Mac and on Windows), etc.

As @Reinmar said, you can get:

  • HTML with base64-encoded images,
  • HTML with images which src attributes point to an external website,
  • file in dataTransfer.files,
  • file in dataTransfer.items,
  • might be more.

For sure sometimes you get mixed content: <img> in HTML and file in dataTransfer.

I am not sure if it should be split between multiple plugins since cases might be mixed. I am not sure about the PR above, because it covers only a small part of cases and I do not know if it will not break other. I am not sure how extensible API we will be able to provide. What I am sure is that we need to do a deep research to learn all cases we need to handle, and then we will be able to design a solution.

Just a note that adding one particular host/domain to the whitelist is not enough as many times companies have/use multiple domains/hosts.

Anyone heavily interested in this feature: please add 馃憤 to the first post.

It'd actually need to be simpler because we can only expose one upload() function and should work with File instances, URLs and base64. That's because all these uploads need to go through our FileRepository.

So, a shouldUpload( type, data, url ) -> {Boolean} function should be enough. However, we may also need to extend UploadAdapter so it has a method for uploading files from external websites.

In general to correctly handle the use case mentioned in the original report there would need to be the possibility to use at least two different upload functions depending on the context. The first upload function already handles uploading local images (also base64-encoded etc.).

The second function, to handle <img> elements with src pointing to external URLs, would have to work differently. Such upload function would need to connect to the server side endpoint and ask it to upload the image (providing just the URL of it) and return back the correct url. It has to be done this way due to CORS restrictions.

In theory, uploading images coming from other domains could happen at a different stage, for example when saving the document in the CMS database, together with other post-processing operations like XSS sanitization. This way the system would be more bulletproof as always someone may intercept the POST request and inject an image coming from external URL anyway.

On the other side, the sooner the user receives a feedback that an image could not be uploaded from external URL the better. There may be situations in which the server side script to fetch images will not work, for example in case of copying an image from a website where only logged in users can see resources or from some intranet domain. Taking this into consideration, making an attempt to make "a local copy" of an image as soon as it gets pasted into the editor makes more sense.

Does it make sens for you to make first step with sth simple like bool shouldUpload(type, data, url) and then _eventually_ extend it?

_(I just don't trust persistence of any external resources so prefer to have all images on my side)_

I am afraid that simple bool shouldUpload(type, data, url) is not enough. To make it works we need to be able to handle upload in at least 3 format:

  • File,
  • Base64 data,
  • URL to the external image.

We need to recognize and handle all cases properly to make this work, what makes this ticket quite complicated.

Lazy upload image:

  1. Query all image tags
  2. Check the "src"
  3. Upload images if not uploaded

@taocz, I didexactly what you suggested:

(async() => {
    let files = [];
    for (let img of images) {
        let response = await fetch(img.src);
        let blob = await response.blob();
        files.push(new File([blob], 'image', response));
        console.log(response);
    }
    editor.execute('imageUpload', { file: files } );
})();
Was this page helpful?
0 / 5 - 0 ratings

Related issues

Reinmar picture Reinmar  路  3Comments

benjismith picture benjismith  路  3Comments

pandora-iuz picture pandora-iuz  路  3Comments

devaptas picture devaptas  路  3Comments

wwalc picture wwalc  路  3Comments