Ecma262: [[SourceText]] type problem

Created on 27 Feb 2019 · 16 Comments · Source: tc39/ecma262

As of PR #697, each ECMAScript function object has a [[SourceText]] internal slot. Table 27 says that its type is String. However, most of the steps where it is set are of the form:

Set F.[[SourceText]] to the source text matched by |Nonterminal|.

and source text is "a sequence of [Unicode] code points", which is not a String.

Presumably, [[SourceText]] should get the result of UTF-16 encoding the source text.

(Alternatively, you could defer the UTF-16 encoding to the point where [[SourceText]] is used, in Function.prototype.toString. So the type of [[SourceText]] would be something like "Unicode code points". But then you'd have to UTF-16 decode _sourceText_ in CreateDynamicFunction, which seems silly.)
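To make the mismatch concrete, here is a minimal JavaScript sketch (not spec text). It assumes a hypothetical source fragment containing a code point outside the Basic Multilingual Plane, and compares the number of code points in it with the number of code units in the corresponding String.

```javascript
// Hypothetical source fragment; U+1F600 lies outside the Basic Multilingual Plane.
const fragment = "() => '\u{1F600}'";

// Iterating a String yields one element per code point.
const codePointCount = [...fragment].length;

// The length property counts UTF-16 code units instead.
const codeUnitCount = fragment.length;

console.log(codePointCount, codeUnitCount);
```

The two counts differ by one here, because the single astral code point occupies two code units once it is held in a String.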

needs editorial changes · spec bug

jmdyck

All 16 comments

What if we changed the definition of "source text matched by" to include UTF-16 encoding?

devsnek on 27 Feb 2019

That would pretty much conflict with other uses of the term "source text".

jmdyck on 27 Feb 2019

cc @michaelficarra

ljharb on 27 Feb 2019

I would continue to keep the span of code points in [[SourceText]], UTF-16 encode [[SourceText]] in Function.prototype.toString, and WTF-16 decode the sourceText string built in CreateDynamicFunction.
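As a rough illustration of the decoding half of this proposal, the sketch below turns a String back into code points while keeping unpaired surrogates as they are. The helper name `decodeToCodePoints` is hypothetical and only models the idea in JavaScript; it is not the spec's wording for CreateDynamicFunction.

```javascript
// Hypothetical helper: decode a String into code points, WTF-16 style.
// Valid surrogate pairs become one code point; lone surrogates are kept as-is.
function decodeToCodePoints(str) {
  // The String iterator already splits on code point boundaries and
  // yields an unpaired surrogate as its own one-unit element.
  return Array.from(str, (ch) => ch.codePointAt(0));
}

// "a", an astral code point, then a lone high surrogate.
const decoded = decodeToCodePoints("a\u{1F600}\uD800");
console.log(decoded.map((cp) => cp.toString(16)));
```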

michaelficarra on 27 Feb 2019

@michaelficarra: Interesting. Why do you prefer that alternative?

jmdyck on 27 Feb 2019

Actual UTF-16 encoding won't work because source text can contain unpaired surrogates, but using UTF16Encoding would be fine.
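The distinction can be sketched in JavaScript. Strict UTF-16 is defined only over Unicode scalar values, so a surrogate code point has no valid encoding there, whereas the spec's UTF16Encoding simply passes any code point up to 0xFFFF through as a single code unit. The function below is an approximation written for illustration (the name `utf16EncodingSketch` is made up), not the spec algorithm verbatim.

```javascript
// Sketch of UTF16Encoding for a single code point in the range 0 to 0x10FFFF.
// Returns an array of one or two code units.
function utf16EncodingSketch(cp) {
  // Anything in the BMP, including surrogate code points, maps to itself.
  if (cp <= 0xffff) return [cp];
  // Astral code points are split into a high and a low surrogate.
  const offset = cp - 0x10000;
  const high = Math.floor(offset / 0x400) + 0xd800;
  const low = (offset % 0x400) + 0xdc00;
  return [high, low];
}

const loneSurrogateUnits = utf16EncodingSketch(0xd800);
const astralUnits = utf16EncodingSketch(0x1f600);
console.log(loneSurrogateUnits, astralUnits);
```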

gibson042 on 2 Mar 2019


Right, but note that you can't just pass the whole source text to UTF16Encoding, because it only takes a single code point.

If we take @michaelficarra's preferred approach ([[SourceText]] is code points), encoding only happens in one spot, so it would be enough to use the phrasing that occurs in a couple of other places: "the String whose code units are the UTF16Encoding of each code point of [some source text]".

But if we take the other approach ([[SourceText]] is a String), then encoding happens in 26 spots, so it might be worth defining an operation for that phrasing. But then you end up saying:

Set F.[[SourceText]] to ThatOperation(the source text matched by |Nonterminal|).

which is a bit clunky.

Instead, it might be better to define an operation that takes the Parse Node as the argument, so you get:

Set F.[[SourceText]] to Whatever(|Nonterminal|).

(I have no good suggestion for the name of either operation.)
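For illustration only, an operation of the first kind (one that takes the whole source text) might look like the JavaScript sketch below. The name `codePointsToString` and the sample input are assumptions, since the thread does not settle on a name. The sketch relies on `String.fromCodePoint`, which accepts surrogate code points and encodes each argument separately.

```javascript
// Hypothetical operation: build the String whose code units are the
// UTF-16 encoding of each code point of some source text.
function codePointsToString(codePoints) {
  // Encode one code point at a time and concatenate the results.
  return codePoints.map((cp) => String.fromCodePoint(cp)).join("");
}

// Made-up source text: "x=>", an astral code point, and a lone surrogate.
const sourceCodePoints = [0x78, 0x3d, 0x3e, 0x1f600, 0xd800];
const asString = codePointsToString(sourceCodePoints);
console.log(asString.length);
```

With an operation like this in hand, each of the 26 spots could call it once rather than repeating the longer phrasing.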

jmdyck on 2 Mar 2019


If I can get an editorial decision on which way this should go, I'll prepare a PR.

jmdyck on 23 May 2019

https://github.com/tc39/ecma262/issues/1458#issuecomment-467969794 seems simplest to me, and I’d generally prefer to defer to @michaelficarra for toString questions anyways.

ljharb on 23 May 2019

@michaelficarra: I'm curious about the "Type(func.[[SourceText]]) is String" check in Function.prototype.toString. For what cases were you expecting the test to fail?

jmdyck on 23 May 2019

@jmdyck It's there to prevent Function.prototype.toString from returning non-string values in the event that the host decides to store a non-string in the [[SourceText]] slot.

michaelficarra on 24 May 2019

"store a non-string in the [[SourceText]] slot."

That would be non-compliant behavior for an ordinary function, so you're talking about an exotic function that elects to have a [[SourceText]] slot, right?

jmdyck on 25 May 2019

Any exotic object, yes.

michaelficarra on 25 May 2019


Also, while we're in the neighborhood, I noticed that async arrow functions don't get their [[SourceText]] set. Is there a reason that wasn't added in #697?

jmdyck on 25 May 2019

That’s likely an oversight. A PR to fix that would be great.

ljharb on 25 May 2019

Done.

jmdyck on 26 May 2019
