Lit-html: Unicode characters inside style of a html template are not recognized

Created on 31 May 2018 · 15 Comments · Source: Polymer/lit-html

The following JS file returns "undefined" for the policy.

import {html} from '@polymer/lit-element/lit-element.js';

export const policy = html`
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">

<head>
    <meta content="text/html; charset=UTF-8" http-equiv="content-type"></meta>
    <style type="text/css">
        .list>li:before {
            content: "\0025a0  "
        }
    </style>
    <title>Test unicode in lit-html </title>
</head>

<body>
    <ul class="list">
        <li> list item 1</li>
        <li> list item 2</li>
    </ul>
</body>

</html>`

Medium Bug

phani1kumar

👍2

All 15 comments

I think I'll need more information here. The html template tag should never return undefined - it always returns a TemplateResult.

Does the value really change depending on the unicode escape sequence?

justinfagnani on 31 May 2018

I haven't investigated this whatsoever, but this could be related to how unicode escape sequences are handled by tagged template literals:

bgotink on 1 Jun 2018

this could be related to how unicode escape sequences are handled by tagged template literals

Correct. This is because \0 falls into weird octal escape rules. The correct way is to use either hex escape (\x00), or unicode escape (\u0000).

jridgewell on 1 Jun 2018
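
A minimal sketch of the difference, assuming a hypothetical tag named inspect that only exposes what the engine hands to a tag function (it is not part of lit-html). The octal-looking escape has no cooked value, while the Unicode escape cooks to the character itself, which CSS accepts literally inside a content string:

```javascript
// Hypothetical tag, for illustration only: returns the cooked and raw strings.
const inspect = (strings) => ({ cooked: [...strings], raw: [...strings.raw] });

// "\0" followed by a digit is not a valid escape in a template literal,
// so the cooked string is undefined.
const octal = inspect`content: "\0025a0"`;

// "\u25a0" is a valid escape, so the cooked string contains U+25A0 itself.
const unicode = inspect`content: "\u25a0"`;

console.log(octal.cooked[0]);   // undefined
console.log(octal.raw[0]);      // content: "\0025a0"
console.log(unicode.cooked[0]); // content: "■"
```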

This works. Escape the backslash with another '\'.

content: "\\0025a0";

RJ77 on 5 Jul 2018

👍2
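
The doubled backslash suggested above can be checked with a small tag function. This is only a sketch: cooked is a hypothetical tag that returns the first cooked string, standing in for any tag that reads cooked strings.

```javascript
// Hypothetical tag that returns the first cooked string unchanged.
const cooked = (strings) => strings[0];

// "\\" is a valid escape for a single backslash, so the cooked string
// carries the CSS escape \0025a0 through intact.
const css = cooked`content: "\\0025a0";`;

console.log(css); // content: "\0025a0";
```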

Although template tag html only uses the raw strings, older JS engines still needed to compute the cooked strings.

See Stage 1 Draft / July 26, 2016 Template Literal Revision

Remove the restriction on escape sequences.

Lifting the restriction raises the question of how to handle cooked template values that contain illegal escape sequences. Currently, cooked template values are supposed to replace escape sequences with the "Unicode code point represented by the escape sequence" but this can't happen if the escape sequence is not valid.

You can see this in

function f(strings) {
  console.log(`cooked=${ JSON.stringify(strings) }`);
  console.log(`raw=${ JSON.stringify(strings.raw) }`);
}

function g() {
  "use strict";
  f`\01`;
}

g();

In older JS engines you'd get an error message like the one you describe but in newer ones you get:

cooked=[null]
raw=["\\01"]

mikesamuel on 4 Feb 2019

Although template tag html only uses the raw strings

html doesn't use raw strings. For invalid escape sequences, the item this.strings[i] is undefined. This will cause a reference error here but if you don't use interpolation, you'll get 'undefined' because of this line.

RunDevelopment on 21 May 2019
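
The 'undefined' result can be reproduced in miniature. The tag below, naiveHtml, is a hypothetical stand-in that concatenates cooked strings; it is not lit-html's source, only an illustration of the same failure mode when there is no interpolation.

```javascript
// Hypothetical stand-in for a tag that builds markup from the cooked strings.
function naiveHtml(strings, ...values) {
  let out = '';
  strings.forEach((s, i) => {
    out += s; // an undefined cooked string is coerced to the text "undefined"
    if (i < values.length) out += String(values[i]);
  });
  return out;
}

// The invalid escape makes the only cooked string undefined.
const result = naiveHtml`<style>li:before { content: "\0025a0" }</style>`;

console.log(result); // undefined
```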

html doesn't use raw strings.

Wow! Thanks for pointing that out.

IMO, it really should.
I would expect

html`<style>li.inline:after { content: "\2c" }</style>`

to correspond to that HTML and not fail because \2 is octal.

mikesamuel on 21 May 2019

IMO, it really should.

I agree.
But in the meantime we can build our own raw-string-version of html:

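import {html} from 'lit-html';

const raw_html = (strings, ...values) => {
    // Pass the raw strings where html expects the cooked ones.
    const newStrings = [...strings.raw];
    newStrings.raw = strings.raw;
    // Return the result so callers can actually render it.
    return html(newStrings, ...values);
};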

RunDevelopment on 21 May 2019

😄1
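
The raw_html idea above builds a new strings array on every call. If the wrapped tag keys any caching on the identity of that array (an assumption here, not something stated in this thread), it may be worth reusing one converted array per call site. A possible sketch, where withRawStrings and recordStrings are hypothetical names and recordStrings merely stands in for html:

```javascript
// One converted array per template call site, keyed by the original strings object.
const rawStringsCache = new WeakMap();

// Hypothetical helper: wraps any tag so it receives the raw strings as its strings.
function withRawStrings(tag) {
  return (strings, ...values) => {
    let rawStrings = rawStringsCache.get(strings);
    if (rawStrings === undefined) {
      rawStrings = [...strings.raw];
      rawStrings.raw = strings.raw;
      rawStringsCache.set(strings, rawStrings);
    }
    return tag(rawStrings, ...values);
  };
}

// Stand-in tag for the demo; with lit-html you would pass its html tag instead.
const recordStrings = (strings, ...values) => ({ strings, values });
const rawTag = withRawStrings(recordStrings);

const render = (n) => rawTag`<style>li:after { content: "\2c" }</style>${n}`;
const first = render(1);
const second = render(2);

console.log(first.strings[0]);                 // <style>li:after { content: "\2c" }</style>
console.log(first.strings === second.strings); // true
```

The WeakMap lets the converted arrays be collected together with the template objects they were derived from.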

We tried using raw strings for Polymer 3's template strings, but hit issues with developers expecting JavaScript escape sequences to work properly. For instance, they expected this to work:

html`<pre>a\nb</pre>`

And output "a" and "b" on separate lines.

justinfagnani on 22 May 2019

@justinfagnani Yeah, but it seems easier to do

html`<pre>a${'\n'}b</pre>`

to embed newlines than to have to remember to double-escape in

html`<script>alert('\n')</script>`

mikesamuel on 22 May 2019
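
The two cases can be compared with the built-in String.raw tag, used here only as a convenient raw-string tag and not as a substitute for html:

```javascript
// Interpolating '\n': the value is an ordinary JavaScript string, so it is a
// real newline even though the tag works on raw strings.
const pre = String.raw`<pre>a${'\n'}b</pre>`;

// Writing \n in the template text: the raw string keeps the backslash and the "n",
// so the embedded script still sees the escape sequence.
const script = String.raw`<script>alert('\n')</script>`;

console.log(pre.includes('\n'));     // true
console.log(script.includes('\\n')); // true
console.log(script.includes('\n'));  // false
```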

@mikesamuel Are you sure? That's a JavaScript string embedded in another JavaScript string. We always have to escape in that case. I think it's more unfamiliar to developers to say that _in this case_ you don't have to double escape.

I also hope that devs are almost never putting JavaScript inside their lit-html templates. It doesn't do much anyway.

I think reasonable engineers can disagree here on the basic point of whether lit-html templates are JavaScript strings or HTML. Given that we have other deviations from plain HTML, I tend to think of them as JavaScript strings still, that contain markup.

justinfagnani on 22 May 2019

That's a JavaScript string embedded in another JavaScript string. We always have to escape in that case.

String.raw`<script>alert('\n')</script>`;

That's what I always do when I have to handle code in JS because you can just write code like you always do.

Scripts inside templates aren't the only problem. The original reason for this issue was simple CSS.
I also had issues when the template contained Java code (embedded in a code tag for display).

Given that we have other deviations from plain HTML

@justinfagnani Are these documented somewhere?

RunDevelopment on 22 May 2019

I filed an issue on eslint-plugin-lit to warn on illegal escape sequences. cc @43081j

justinfagnani on 24 May 2019
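
A lint rule catches this before the code runs. For comparison, the same condition can also be detected at run time, because an invalid escape shows up as an undefined cooked string. The wrapper below, rejectInvalidEscapes, is a hypothetical sketch and says nothing about how eslint-plugin-lit works; String.raw is used only as a convenient tag to wrap.

```javascript
// Hypothetical wrapper: throws when any cooked string is undefined, which is how
// a tagged template reports an invalid escape sequence to its tag function.
function rejectInvalidEscapes(tag) {
  return (strings, ...values) => {
    const bad = strings.findIndex((s) => s === undefined);
    if (bad !== -1) {
      throw new SyntaxError(`Invalid escape sequence in template text: ${strings.raw[bad]}`);
    }
    return tag(strings, ...values);
  };
}

const strictRaw = rejectInvalidEscapes(String.raw);

let message = null;
try {
  strictRaw`content: "\0025a0"`;
} catch (err) {
  message = err.message;
}

console.log(message); // Invalid escape sequence in template text: content: "\0025a0"
```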

Consider supporting a special placeholder for Unicode escapes in the lit-element environment.

zzzgit on 6 Sep 2019

@justinfagnani We could probably pursue a change to spec to remove the octal reservations from https://tc39.es/ecma262/#prod-NotEscapeSequence. It's a simple enough change, and has a clear cut use case.

jridgewell on 6 Sep 2019
