Slate: Problems with the new Html deserializer

Created on 21 Jul 2017 · 12 Comments · Source: ianstormtaylor/slate

Do you want to request a feature or report a bug?

A bug.

What's the current behavior?

The new Html deserializer doesn't work with the default parseHtml function. You can reproduce the errors by pasting some HTML into the Paste HTML example.

First you have this error:
Uncaught TypeError: _this.parseHtml is not a function

That can be worked around by explicitly passing null to the Html constructor like this, but I think it should be fixed to also use the default parser when the option is undefined:

const serializer = new Html({
    rules: RULES,
    parseHtml: null
});

Then I created this fiddle to show the next error: https://jsfiddle.net/6u1e543z/

Uncaught TypeError: elements.filter is not a function

This happens because elements is a NodeList, not an Array, so filter is not defined on it.

I don't know if there are more errors; I couldn't get past this one.

What's the expected behavior?

The Html serializer should work as before once the deserialize methods are updated.

All 12 comments

I can't get the html deserializer to work either. The error I'm getting is slightly different:

TypeError: Argument 1 ('child') to Node.removeChild must be an instance of Node

I'm using the latest version of this package.

I think you should update your deserializer functions as shown in this commit: https://github.com/ianstormtaylor/slate/commit/4bbf7487ea827bffa5a253fd0649190944dfb4ee

Basically el.children became el.childNodes, but there's a catch for links too.
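As a rough illustration of the link catch: the two parsers expose attributes differently. parse5 element nodes carry an `attrs` array of `{ name, value }` pairs, while DOM elements from the native DOMParser expose `getAttribute()`. The helper below is a hypothetical sketch (the name `readAttribute` is not part of Slate) that reads an attribute from either shape; the two sample objects are mocks, not real parser output.

```javascript
// Hypothetical helper, not part of Slate: read an attribute from either a
// DOM element (getAttribute) or a parse5 node (attrs array).
function readAttribute(el, name) {
  if (typeof el.getAttribute === 'function') {
    return el.getAttribute(name);
  }
  const attr = (el.attrs || []).find((a) => a.name === name);
  return attr ? attr.value : null;
}

// Mock of a parse5-style node.
const parse5Like = {
  tagName: 'a',
  attrs: [{ name: 'href', value: 'https://example.com' }],
};

// Mock of a DOM-style element.
const domLike = {
  tagName: 'A',
  getAttribute: (name) => (name === 'href' ? 'https://example.com' : null),
};

console.log(readAttribute(parse5Like, 'href'));
console.log(readAttribute(domLike, 'href'));
```

A rule's deserialize function could then use `readAttribute(el, 'href')` instead of depending on one parser's node shape.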

Here are my HTML serializer rules:

const BLOCK_TAGS = {
    p: 'paragraph',
    ul: 'bulleted-list',
    ol: 'numbered-list',
    li: 'list-item',
    h3: 'heading-three',
};

const INLINE_TAGS = {
    a: 'link'
};

// Add a dictionary of mark tags.
const MARK_TAGS = {
    em: 'italic',
    strong: 'bold',
    u: 'underlined',
};

const rules = [
    {
        deserialize: function(el, next) {
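            // DOM elements report upper-case tag names ('P'), so normalize before the lookup.
            const type = BLOCK_TAGS[(el.tagName || '').toLowerCase()];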
            console.log(el);
            if (!type) { return; }
            return {
                kind: 'block',
                type: type,
                nodes: next(el.childNodes)
            };
        },
        serialize: function(object, children) {
            if (object.kind != 'block') { return; }
            console.log(object);
            switch (object.type) {
                case 'numbered-list':
                    return <ol>{children}</ol>;
                case 'bulleted-list':
                    return <ul>{children}</ul>;
                case 'list-item':
                    return <li>{children}</li>;
                case 'paragraph':
                    return <p>{children}</p>;
                case 'heading-three':
                    return <h3>{children}</h3>;
                case 'link':
                    return <a>{children}</a>;
            }
        }
    },
    // Add a new rule that handles marks...
    {
        deserialize: function(el, next) {
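            // Normalize the tag name so the lookup works with both parse5 and DOM nodes.
            const type = MARK_TAGS[(el.tagName || '').toLowerCase()];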
            if (!type) { return; }
            return {
                kind: 'mark',
                type: type,
                nodes: next(el.childNodes)
            };
        },
        serialize: function(object, children) {
            if (object.kind != 'mark') { return; }
            switch (object.type) {
                case 'bold':
                    return <strong>{children}</strong>;
                case 'italic':
                    return <em>{children}</em>;
                case 'underlined': // must match the 'underlined' type in MARK_TAGS
                    return <u>{children}</u>;
            }
        }
    },
    {
        deserialize: function (el, next) {
            // The INLINE_TAGS lookup already rejects anything that is not a link.
            const type = INLINE_TAGS[(el.tagName || '').toLowerCase()];

            if (!type) {
                return;
            }
            return {
                kind: 'inline',
                type: type,
                nodes: next(el.childNodes),
                data: {
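                    // With the native DOMParser (parseHtml: null), attributes are read via
                    // getAttribute; el.attrs only exists on parse5 nodes.
                    href: el.getAttribute('href')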
                }
            };
        },
        serialize: function (object, children) {

            if (object.kind != 'inline') {
                return;
            }

            switch (object.type) {
                case 'link':
                    return <a href={object.data.get('href')}>{children}</a>;
            }
        }
    },
];

const html = new Html({rules, parseHtml: null});

And this is the example I'm using:

this.state = {
    state: html.deserialize('<p>In addition to block nodes</p>')
};

Initializing the state with the Raw serializer works fine.

Here's the actual error I'm getting:

elements.filter is not a function. (In 'elements.filter(_this.cruftNewline)', 'elements.filter' is undefined)

That's the error I'm getting too. For now I've reverted to 0.20.x.

Hey, sorry for the troubles, looks like this was introduced in my recent PR.

Sounds like there are two issues. First, this line should be changed to if (typeof options.parseHtml === 'function') to correctly account for it being undefined (or something else). This wasn't caught by the tests because the tests don't run in a browser environment so they always provide parse5.parseFragment to that option -- perhaps we should have some browser tests to better catch things like that.
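A minimal sketch of the first fix, assuming the constructor receives an options object and has some built-in parser to fall back on. `resolveParseHtml`, `fallback`, and `custom` are hypothetical names for illustration, not Slate's actual internals.

```javascript
// Hypothetical helper: choose which HTML parser the serializer should use.
// `defaultParseHtml` stands in for the built-in DOMParser-based parser.
function resolveParseHtml(options, defaultParseHtml) {
  const opts = options || {};
  // Accept a caller-supplied parser only when it is actually callable;
  // undefined, null, or any other value falls back to the default.
  return typeof opts.parseHtml === 'function' ? opts.parseHtml : defaultParseHtml;
}

// Stand-in parsers so the selection logic can be exercised outside a browser.
const fallback = (html) => ({ source: 'default', html });
const custom = (html) => ({ source: 'custom', html });

console.log(resolveParseHtml({}, fallback)('<p>a</p>').source);
console.log(resolveParseHtml({ parseHtml: null }, fallback)('<p>a</p>').source);
console.log(resolveParseHtml({ parseHtml: custom }, fallback)('<p>a</p>').source);
```

With a check like this, omitting the option and passing null behave the same way, so the `parseHtml: null` workaround is no longer needed.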

Second, the childNodes produced by the native DOMParser is not an array (parse5 does seem to produce an array). This can be fixed by using Array.from(childNodes) to ensure we're dealing with an array before using array operators. Also would have likely been caught by a real browser test.
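To illustrate the second fix outside a browser, the sketch below uses a plain array-like object as a stand-in for a NodeList. It has indexed entries and a length but no array methods. `isNotCruftNewline` and `filterChildNodes` are hypothetical names, and the mock nodes are simplified.

```javascript
// Array-like stand-in for a NodeList: indexed entries plus length, no filter().
const nodeListLike = {
  0: { nodeName: '#text', value: '\n' },
  1: { nodeName: 'p' },
  length: 2,
};

// Hypothetical predicate: keep everything except bare newline text nodes.
function isNotCruftNewline(node) {
  return !(node.nodeName === '#text' && node.value === '\n');
}

// Array.from accepts both real arrays and array-likes, so the same code path
// works whether the parser returns an array or a NodeList.
function filterChildNodes(childNodes) {
  return Array.from(childNodes).filter(isNotCruftNewline);
}

console.log(typeof nodeListLike.filter);
console.log(filterChildNodes(nodeListLike).length);
```

Calling `nodeListLike.filter(...)` directly would throw the same kind of TypeError reported above, while the `Array.from` version returns a real array in both cases.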

I'll create a PR to implement these fixes. cc @ianstormtaylor

@schneidmaster I just submitted a PR, please have a look.

Note: I simply test for options.parseHtml to be truthy.

@schneidmaster update - closing my PR, just realized that those incompatible methods need to work with whatever parser is passed in as well as with the DOMParser API.

@erquhart Sorry, just seeing this. I still see a couple issues in testing your branch & a couple spec failures. Opened a PR with my fix, specs pass and I can't find any problems in the "Paste HTML" example locally.

@ianstormtaylor when you get a moment, can you check out my PR above? Also, I'm about to leave town for the weekend, so I may not be responsive until Monday.

I've got the same issue. The typeof check also doesn't work for an HTML NodeList; it breaks. Neither the native DOMParser nor parse5 works for me. Another issue with parse5 is that it requires Buffer support and some native Node.js modules, which would make the bundle much bigger.

For people arriving here one year later: I was running into the same issue in a test environment with Docker + Karma + PhantomJS. PhantomJS didn't have access to the DOMParser API, so it would fail. I tried adding JSDOM.fragment as suggested in the docs, but it wasn't working. The workaround was to move away from PhantomJS to a jsdom environment through https://github.com/badeball/karma-jsdom-launcher. I no longer needed to specify a parseHtml prop, and bye-bye to the deprecated PhantomJS!
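For reference, a Karma configuration along these lines might look like the sketch below. Only the `jsdom` browser entry relates to the launcher mentioned above; the framework and file glob are placeholder assumptions, not taken from the commenter's setup.

```javascript
// karma.conf.js (sketch; frameworks and files are placeholder assumptions)
module.exports = function (config) {
  config.set({
    frameworks: ['mocha'],    // assumption: swap in whichever framework the project uses
    files: ['test/**/*.js'],  // hypothetical test file glob
    browsers: ['jsdom'],      // launcher provided by karma-jsdom-launcher, replacing PhantomJS
  });
};
```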
