Ecma262: \8 in sloppy non-template strings remains unspecified

Created on 9 Jun 2020  ·  9 Comments  ·  Source: tc39/ecma262

Currently, JSC bans \8 (and \9) in templates and strict strings but SM/V8/XS/Ch all allow it.

At first I thought this was simply due to a lack of tests, so I created https://github.com/tc39/test262/pull/2654, but it turns out this has never been specified at all (though it was planned for ES7).

I still view the tests above as the goal, because:

  • We definitely should not leave this unspecified.
  • 11.8.4 and 11.8.6 are clear about disallowing B.1.2 for templates and strict strings.
  • B.1.1 has a non-octal production, so it makes sense for B.1.2 to follow suit.

Assuming no one is deeply opposed to this, I'll make a PR and present it at the next meeting.

All 9 comments

I don't like calling things like this "unspecified". There are source texts which must be accepted by a conforming implementation, source texts which must be rejected by a conforming implementation (due to Forbidden Extensions or Early Errors, maybe other reasons), and source texts which may be accepted by a conforming implementation but are not required to be.

The example of \8 and \9 in strings falls into the last case. It can be considered an allowed language extension. If it is necessary for a web browser to implement this language extension to be compatible with web content, we should require it in Annex B. If it is not, we should consider adding it to the Forbidden Extensions because it is terrible.

Hmm, sorry, I wasn't seeing an issue with having "unspecified" and "non-forbidden" be synonymous.

I don't think there's a web compatibility concern for templates and strict strings, but are you saying that we should try to ban these even in sloppy non-template strings? I figured it'd be the least surprising to have non-octals allowed just when legacy octals are allowed, but we could certainly consider going further.

Yeah, if we can get rid of them, we should. I don't care about consistency with numbers. I don't know if implementations are going to be willing to go through the effort of figuring out whether these are necessary for web compatibility or not, though.

To clarify: the unspecified behavior discussion is only relevant for non-template strings, right?

11.8.6 does say that "A conforming implementation must not use the extended definition of EscapeSequence described in B.1.2 when parsing a TemplateCharacter." This should mean that Annex B discussion isn't relevant for template strings and

`\8`

should be an error due to the NotEscapeSequence behavior.

FWIW, Hermes allows \8 in strict string literals, but specifically rejects \8 in untagged template literals due to this.

@avp Yes I believe that is correct.

edit: To be very clear, I just mean untagged templates. Tagged templates have much more freedom with what can follow \.
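One way to check how a particular engine treats these cases is to probe each source text at runtime instead of writing the literals directly, since a rejected literal would stop the whole script from parsing. The sketch below is illustrative only: `probe` is a hypothetical helper that is not part of the thread, and it reports what the host engine does, not what the spec requires.

```javascript
// Hypothetical helper (not from the thread): compile and run `body` as a
// function body, reporting a SyntaxError instead of letting it propagate.
function probe(body) {
  try {
    return { accepted: true, value: new Function(body)() };
  } catch (e) {
    if (e instanceof SyntaxError) return { accepted: false, value: undefined };
    throw e;
  }
}

// Untagged template: per 11.8.6 this is expected to be rejected.
const untagged = probe("return `\\8`;");

// Strict string literal: engines disagreed on this when the issue was filed.
const strictString = probe("'use strict'; return '\\8';");

// Tagged template: the tag still receives the raw text of the escape.
const tagged = probe("return (s => ({ cooked: s[0], raw: s.raw[0] }))`\\8`;");

console.log({ untagged, strictString, tagged });
```

The `accepted` flags for the untagged template and the strict string are deliberately left without an expected value here, because they are exactly what differs between engines and versions.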

B.1.2 says “The syntax and semantics of 11.8.4 is extended as follows, etc.”, except that it is not _extended_, it is _patched_.

So, while you’re at it, you could make the spec less schizophrenic, i.e., merge non-annex-b and annex-b grammars and semantics, and add early error rules for non-strict mode.

Hmm, so upon investigating further, there are two points worth noting:

  1. Currently all engines agree that '\8' is '8' and not '\x08' in sloppy mode, in spite of no spec:
```
λ eshost -sx "print('\7'.charCodeAt(0), '\8'.charCodeAt(0));"
#### ch, jsc, sm, v8, xs
7 56
```

  2. From a cursory look at some HTTP Archive data, @syg and I noticed a number of pages using `\9` when `document.write`ing CSS into a page. 🤔 For instance, `background: #000\9;` or `width: auto\9;`.
     Now, `\x09` is the same as `\t`... but given (1), `\9` is the same as just `9`. So it seems like these pages are rendering in spite of malformed CSS, but then they'd fail to render at all if `'\9'` were illegal.

It's hard to know for sure just how many pages would break due to (2), but I think it may be wisest to just spec (1) as web reality. 😓
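The eshost result above can be reproduced in a single engine along these lines. This is a sketch rather than part of the original investigation: the source text is compiled through `new Function`, which is sloppy by default, so the surrounding script still parses under strict or module semantics.

```javascript
// Compile the body in sloppy mode; Function-constructor code does not
// inherit strictness from the caller.
const codes = new Function(
  "return ['\\7', '\\8', '\\9', '\\x08', '\\x09'].map(s => s.charCodeAt(0));"
)();

console.log(codes);
// '\7' is a legacy octal escape (U+0007); '\8' and '\9' come back as the
// plain digits, distinct from '\x08' and '\x09'.
```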

I can elaborate on the Web usage of \9 in CSS: it’s a “CSS hack” that targets IE9 and older. See https://mathiasbynens.be/notes/safe-css-hacks#css-hacks for details.

```
.foo {
  color: red;
  color: green\9; /* IE9 and older */
}
```

The intention of the document.write examples you found was likely to write code like that to the document, except they forgot to escape the \ character in the string literal.
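To make the missing-escape point concrete, here is a small sketch comparing the two spellings. It assumes an engine that accepts `\9` in sloppy strings, as all engines in the eshost run earlier in the thread do. The CSS text is taken from the example above, and `new Function` is used only so the sloppy literal can be evaluated from any context.

```javascript
// What the pages likely wrote: the backslash is consumed by the string literal.
const unescaped = new Function("return 'color: green\\9;';")();

// What they likely meant: escape the backslash so it reaches the CSS parser.
const escaped = 'color: green\\9;';

console.log(unescaped); // color: green9;
console.log(escaped);   // color: green\9;
```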

Wow.
