Symbols cannot be used as WeakMap keys because they are primitives. However, unlike other primitives, they are unforgeable and unique, so they should be completely safe for use as WeakMap keys.
Global symbols (Symbol.for, Symbol.keyFor) are not unique, and as primitives, can't ever be collected - I assume that since these kinds of symbols can't be WeakMap keys, and because it would be confusing for some symbols to be usable as WeakMap keys but not others, that it makes the most sense to disallow all symbols as WeakMap keys?
Hi Jordan, yes, I went down the same chain of reasoning.
On Tue, May 15, 2018 at 11:59 AM, Jordan Harband wrote:
Global symbols (Symbol.for, Symbol.keyFor) are not unique, and as
primitives, can't ever be collected - I assume that since these kinds of
symbols can't be WeakMap keys, and because it would be confusing for some
symbols to be usable as WeakMap keys but not others, that it makes the most
sense to disallow all symbols as WeakMap keys?
--
Cheers,
--MarkM
i would lean towards allowing global symbols since they aren't the only things that could prevent collection. any value used as a weakmap key can be attached places such that it won't be collected. beyond that, Symbol() symbols should definitely be allowed.
I don't see a significant difference between a registry-Symbol and an object that's stored on the global. Both will never be collected, and thus will prevent a weak value from being collected.
But if we really do think this is a footgun, then we can just disallow registry-Symbols from being used (those for whom Symbol.keyFor returns a non-undefined value). The registry can't be manipulated by the user; Symbols are inserted into it by the UA immediately upon creation and never removed, so "in the registry" is effectively a constant quality of the Symbol itself, and can be relied on.
It just seems silly that {} can be a key but Symbol() can't, given that they can serve similar purposes.
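A minimal sketch of the check described above, assuming registry membership is detected purely through Symbol.keyFor. The helper name isRegisteredSymbol is hypothetical and not part of any proposal:

```javascript
// Hypothetical helper: true only for symbols created via Symbol.for().
function isRegisteredSymbol(value) {
  // Symbol.keyFor throws on non-symbols, so check the type first.
  return typeof value === 'symbol' && Symbol.keyFor(value) !== undefined;
}

console.log(isRegisteredSymbol(Symbol.for('app.key'))); // true
console.log(isRegisteredSymbol(Symbol('app.key')));     // false
// Well-known symbols are shared across realms but are not in the registry.
console.log(isRegisteredSymbol(Symbol.iterator));       // false
```

Note that this check does not catch well-known symbols such as Symbol.iterator, which are also cross-realm.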
You can always prevent collection; the difference is that with objects you can always allow collection by dropping all refs to the object - or by having the realm itself collected.
With global symbols - which are cross-realm - it would prevent ever collecting it.
@ljharb If a WeakMap is reaped it does not prevent collecting values with eternal keys since that weak reference to the value is removed, even if the key itself is strongly held still.
@erights On a tangent that might come back to this, is there any reason having keys mismatch between Weak collections and WeakRef might be problematic? I know for WeakRef it would never fire a finalizer for a Symbol from the SymbolRegistry if it were allowed as a key (but it would strongly keep the finalizer/holdings alive I think?).
With global symbols - which are cross-realm - it would prevent ever collecting it.
Isn't the same true of the outermost global object?
@gibson042 if the realm is collected, the global object for it could also be (assuming it was a key in a WeakMap from a different realm)
Right, but I was referring to the global object of the outermost realm (though I suppose such a statement reads as vacuously true). Still, the set of well-known Symbols shared across realms is necessarily bounded and small, and their uncollectability shouldn't be a concern. Or am I missing something?
I'm not sure this talk about keys really matters if the WeakMap being reaped allows what it is holding to be reaped? Even if I create a map using:
let map = new WeakMap();
map.set(Symbol.iterator, BIG_OBJECT); // Symbol.iterator is shared between all realms as well
map = null;
BIG_OBJECT can still be collected since the map can be collected.
One thing that comes to mind here is that this doesn't seem to be polyfillable without leaking. Are there any concerns around that?
@loganfsmyth isn't that a concern with any Weak collection polyfill?
@bmeck traditionally the polyfills assign a "hidden" property on the key that holds the value. that can't be done with a Symbol.
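To illustrate the point, here is a rough sketch of the "hidden property" technique, under the assumption that values are stashed on the key under a per-map property name. The class name HiddenPropertyWeakMap and the slot naming are made up for this example, and real polyfills differ in detail:

```javascript
// Sketch of the "hidden property" polyfill technique; not a complete WeakMap.
let nextMapId = 0;

class HiddenPropertyWeakMap {
  constructor() {
    // Hypothetical per-map property name used to stash values on keys.
    this._slot = `__hiddenWeakMap_${nextMapId++}`;
  }

  set(key, value) {
    // Only objects and functions can carry an own property.
    if (Object(key) !== key) {
      throw new TypeError('Invalid value used as weak map key');
    }
    Object.defineProperty(key, this._slot, {
      value,
      enumerable: false,
      configurable: true,
      writable: true,
    });
    return this;
  }

  get(key) {
    if (Object(key) !== key) return undefined;
    return Object.prototype.hasOwnProperty.call(key, this._slot)
      ? key[this._slot]
      : undefined;
  }
}

const hidden = new HiddenPropertyWeakMap();
const objKey = {};
hidden.set(objKey, 'stored on the key itself');
console.log(hidden.get(objKey)); // 'stored on the key itself'
```

Because the value lives on the key, it goes away with the key. A Symbol is a primitive and has nowhere to hold such a property, so a polyfill would have to fall back to a strongly held table. This sketch also fails for non-extensible objects.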
In a recent project, I resorted to implementing a class called UniversalWeakMap, which maintains a WeakMap and a Map instance property, called this._weakMap and this._strongMap, and simply stores any non-reference keys (which can't be stored in this._weakMap) in the this._strongMap instead, so that it can expose a uniform get/set interface that works for any kind of key. Of course the non-reference keys are strongly held (including Symbols, unfortunately), but the whole UniversalWeakMap object could potentially be garbage collected.
Note: for this particular project, I didn't need to match the shared WeakMap/Map interface exactly, which is why my UniversalWeakMap doesn't have methods like has and delete, though they would be easy to implement.
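A minimal sketch of what such a class might look like, based only on the description above. The _mapFor helper is an assumption of this sketch, not necessarily how the original was written:

```javascript
class UniversalWeakMap {
  constructor() {
    this._weakMap = new WeakMap(); // objects and functions, weakly held
    this._strongMap = new Map();   // everything else, strongly held
  }

  // Hypothetical helper: choose the backing map by key type.
  _mapFor(key) {
    const isReference =
      (typeof key === 'object' && key !== null) || typeof key === 'function';
    return isReference ? this._weakMap : this._strongMap;
  }

  get(key) {
    return this._mapFor(key).get(key);
  }

  set(key, value) {
    this._mapFor(key).set(key, value);
    return this;
  }
}

const cache = new UniversalWeakMap();
const objectKey = {};
const symbolKey = Symbol('id');
cache.set(objectKey, 'weak').set(symbolKey, 'strong').set('text', 'also strong');
console.log(cache.get(objectKey), cache.get(symbolKey), cache.get('text'));
// weak strong also strong
```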
I mention this example as evidence that it would be genuinely useful to have fewer restrictions on what you can put in a WeakMap, as long as you don't mind the drawbacks (in particular, you don't care about iterability).
To the specific question of whether Symbol keys should be allowed in WeakMaps, that would certainly be more convenient than the status quo. However, I would go a step further, and recommend that any kind of key should be allowed in a WeakMap. None of the keys (weak or strong) should be iterable, obviously. Those keys that can be garbage collected should disappear from the WeakMap whenever they become unreachable, and those keys that can't be collected (either because they're a kind of value that can never be collected, or they just never happen to become unreachable) should simply remain in the WeakMap until they are explicitly removed by the program.
In either case, a native WeakMap implementation will not keep any keys from being collected that could otherwise be collected, so I don't see any potential for confusion or memory leaks—in a native implementation, at least, which is why this would need to be standardized rather than just polyfilled.
forgeable values can't be done because we want to prevent observing gc
@devsnek Forgeable Symbol values by definition can't be collected, since they have to be === if you forge them again later, so there's nothing to observe?
Observing GC isn’t a concern; WeakRefs are coming, either via JS or via WASM, so it will be observable.
Especially once we have WeakRefs and thus observable GC (but even right now), I honestly don't see the benefit (for developer expectations or memory usage or any other reason) for refusing to store non-reference keys in a WeakMap.
Observable gc absolutely is a concern. That's why we separate weakref from
weakmap and put weakref on the System object.
However, I agree it is not relevant to this thread. Non ref keys would
never be collected so there's nothing to observe.
On Thu, May 24, 2018, 6:56 PM Jordan Harband wrote:
Observing GC isn’t a concern; WeakRefs are coming, either via JS or via
WASM, so it will be observable.
@erights thanks for clarifying.
In that case, can anyone elaborate on why WeakMaps can't accept all values as keys? (@erights, @allenwb?)
Personally, I expect that if I put a thing in a WeakMap then the corresponding value can someday be GC'd. For example, I have tons of code which does stuff like
const cache = new WeakMap;
function process(data) {
if (cache.has(data)) {
return cache.get(data);
}
const res = expensiveAlgorithm(data);
cache.set(data, res);
return res;
}
with the expectation that this caching is acceptable because the caller can always choose to drop data if they want res to stop taking up memory.
I really don't think it's a good idea to mix strong and weak holding in the same data structure. Yes, I know that sometimes the global object ends up being effectively strongly held, but that's an extremely minor and extremely edge-y edge-case.
Sorry to sidetrack, but what do you mean by (un)forgeable values?
@woess forgeability refers to the ability to create a value that has the identity of another value without having access to the original value. two examples of this are numbers and strings. i can create 1 and have it be identical to this other 1 over here, without having access to the original 1. as with strings i can create "hello" in one place and "hello" in another place and they are identical.
on the other hand we have things like objects and symbols which are unforgeable. if i have Symbol() in some place and Symbol() in another place they will not be identical. The only way to have identity is to grab that original symbol. The same goes with objects ({} === {} is false)
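A small sketch of the distinction, using only strict equality. The variable names are just for illustration:

```javascript
// Forgeable: an identical value can be recreated from scratch.
console.log(1 === 1);                  // true
console.log('hello' === 'hel' + 'lo'); // true

// Unforgeable: every evaluation creates a fresh identity.
console.log(Symbol() === Symbol());    // false
console.log({} === {});                // false

// The only way to get the same identity is to hold the original.
const original = Symbol('token');
const alias = original;
console.log(alias === original);           // true
console.log(Symbol('token') === original); // false
```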
@devsnek Good explanation of unforgeable. However, only unnamed symbols --- those created by the Symbol() expression you showed --- are unforgeable. Named symbols --- those created by Symbol.for(str) --- are not unforgeable, creating the dilemma at issue in this thread.
@erights I don't think that is true, the GlobalSymbolRegistry is returning the exact same Symbol rather than creating new primitive values in https://tc39.github.io/ecma262/#sec-symbol.for . Those things are reachable but not forgeable. I cannot recreate them without access to that registry.
Access to the registry is not deniable. Given that everyone has implicit access to the registry, they can obtain access to any named symbol given only knowledge of the string. IOW, access to a named symbol is "knowledge limited" rather than "access limited".
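As a sketch of "knowledge limited" access: two pieces of code that share no references, only the string, end up with the same symbol. The function names partyA and partyB and the key 'shared.name' are hypothetical:

```javascript
// Two parties that share no references, only knowledge of the string.
function partyA() {
  return Symbol.for('shared.name');
}
function partyB() {
  return Symbol.for('shared' + '.name');
}

console.log(partyA() === partyB());              // true
console.log(Symbol.keyFor(partyA()));            // 'shared.name'
// An unnamed symbol with the same description is still a different value.
console.log(Symbol('shared.name') === partyA()); // false
```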
We were very careful to design the semantics of the registry so that it would not be a global communications channel. Given the way the semantics of the registry are stated, this safety property is hard to see. A better way to describe its semantics is that a named symbol is a value, without identity, that wraps a string. All the equality comparison operators, given two named symbols, judge them to be equal iff their wrapped strings are equal. This account is not observably different and need not hypothesize any registry or any other form of shared state. In this account, it is obvious there is no global communications channel.
@erights https://tc39.github.io/ecma262/#sec-samevaluenonnumber appears to compare that they are the same value, not based upon any internal string that I can tell. It could be implemented as potentially multiple values being checked by some internal string that appear to act as a single value, but that is not what the spec appears to be saying. Do you think this is a bug in the specification?
@erights it seems like you could deny access to the registry by replacing it with one that wasn’t truly using the global registry, thus ensuring that global symbols created in the restricted realm weren’t available to other realms?
Please don't special-case registered symbols. It is already possible to add keys to a WeakMap that will not be collected from it, whether because such collection is strictly impossible (e.g., the outermost global object or the WeakMap itself) or because the keys happen to remain reachable (e.g., built-in or user-defined globals). Considering consistency, complexity, and cognitive burden, the criterion pretty much has to be based on type and supports only two possibilities: 1) current behavior of requiring keys to be non-primitive, or 2) the proposed behavior of requiring keys to be of a type for which SameValueNonNumber compares by reference (matching user experience of strict equality comparison). It's not worth adding symbols if they come with a global registry exception.
And if symbols _are_ permitted as WeakMap keys, they should be allowed in WeakSets as well.
I don't think it's a good idea to allow any values shared across realms be used as WeakMap keys.
I'd also like to know if there's any use case that would justify allowing (unique) symbols. So far this part of the question has not been addressed at all.
Just to throw my hat in the ring, I'm currently building a system that uses unique symbols as property keys in order to avoid naming conflicts, as I have a lot of code generated internally at runtime in response to arbitrary changes to ontological context. When calculating hashes of different data structures, symbols are a challenge to hash because though it is easy to associate them with a numeric value, if a symbol's associated ontology is no longer in use, there's no way for me to ensure that a cached hash for that symbol is also cleaned up. You can box a symbol within an object of course, but that won't work as a WeakMap key because I have to keep a reference to the object somewhere in order to use it for lookup purposes.
In a nutshell, allowing symbols to be used as WeakMap keys provides a way to associate them with metadata that can be dropped when the symbol reference is orphaned.
With respect to the argument "...but global symbols will never be cleaned up" - this is true anyway for objects isn't it? For example:
weakmap.set(Date, 'This will never be cleaned up');
weakmap.set(document, '...nor will this');
weakmap.set(WeakMap, '...or this');
Date = null; WeakMap = null; weakmap.constructor = null; might be enough to clean up two of those, but it’s certainly true that there’s objects you wouldn’t be able to get rid of. weakmap.set(weakmap, true) presumably never would either.
weakmap.set(weakmap, true) presumably never would either.
Why not?
@erights ah, i guess it would, scratch that.
How would a weakmap.set(weakmap, true) item ever be collected _from_ the map, as opposed to _with_ the map? Likewise for use of %ObjectPrototype% or %FunctionPrototype% as keys, or the outermost global object.
I believe the concern here is that realms can be collected but the well known symbols never will be. I don't understand the concern but also not very familiar with implementations.
I believe the concern here is that realms can be collected but the well known symbols never will be.
yes.
I don't understand the concern but also not very familiar with implementations.
It's not an implementation issue but a semantic one.
A key can be unobservably collected from a WeakMap only if the key can never again be looked up, because that key can never again exist. A well known or registered symbol might always exist again, and therefore might be looked up at some point in the future.
It is true that a fresh unregistered symbol can be known to never exist again, which is why we could consider allowing only these as keys in a WeakMap. However, since these are otherwise so similar to well known or registered symbols, we decided against it.
Thus, we have the simple rule that
Object(k) === k iff k can be a key in a WeakMap
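Expressed as a predicate, the rule as stated in this thread looks like the sketch below. The name canBeWeakMapKey is made up here, and the function only encodes the rule. It does not probe what a given engine's WeakMap actually accepts:

```javascript
// The rule as stated above: k can be a key iff boxing it is a no-op.
function canBeWeakMapKey(k) {
  return Object(k) === k;
}

console.log(canBeWeakMapKey({}));               // true
console.log(canBeWeakMapKey(() => {}));         // true
console.log(canBeWeakMapKey(Symbol('fresh')));  // false
console.log(canBeWeakMapKey(Symbol.for('reg'))); // false
console.log(canBeWeakMapKey('string'));         // false
console.log(canBeWeakMapKey(null));             // false
```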
How would a
weakmap.set(weakmap, true) item ever be collected from the map, as opposed to with the map?
Correct. So long as weakmap itself is reachable, it will not be collected as a key from itself. If weakmap is unreachable, then it can be collected as a key from itself, but this is not in any sense an observable statement.
Likewise for use of %ObjectPrototype% or %FunctionPrototype% as keys, or the outermost global object.
None of these are necessarily reachable from weakmap.
There is also an implementation consideration: Disallowing symbols as WeakMap keys increases the design alternatives space for JS engines and garbage collectors.
The existence of weak maps has a direct impact upon garbage collection and, if not very carefully designed, can have a significantly negative performance impact. Allowing symbols as WeakMap keys could, at the very least, complicate such designs and have an impact on other seemingly unrelated parts of an engine design.
Some possible impacts: Some implementations might decide it makes sense to encode symbols as immediate values within tagged object references or use some common base representation for strings and symbols that is different from that used for objects. These approaches might simplify and hence speed up property lookup. But there need to be affordances in the GC that help it detect when values used as weak keys no longer have any references. A good GC design would probably only want those affordances to apply to values that are actually used as weak keys, as they typically have space and/or time overhead that you would like to avoid for the vast majority of values that are never used as weak keys. If the affordances require some per-value state, this might preclude using immediate values for symbol values (or, flipping things, use of immediate symbol values might preclude the use of certain GC affordances).
To be more concrete: allowing symbol keys in WeakMap might preclude (or significantly complicate) a design from both having immediate symbol values and using an inverted representation of weak maps.
There is also an implementation consideration: Disallowing symbols as WeakMap keys increases the design alternatives space for JS engines and garbage collectors.
Thanks @allenwb I stand corrected.
But implementations should keep in mind the expense of this technique. There are an infinite number of potential fresh unregistered symbol identities. With only an immediate representation, each one would need to be a unique bit pattern, say with a large counter. When this counter reaches its limit, the agent would need to be preemptively terminated, just as it must for OOM, even if there's plenty of free memory at that time. By avoiding heap allocation, in the limit, such termination becomes inevitable.
OTOH, one can imagine a separate immediate-symbol-representation-compactor that runs in this emergency. But this makes the scheme overall more complicated, not less.
I would think that a feature with potential value should generally never be blocked by implementation challenges except where the feature is fundamentally at odds with the goals described by the specification.
As an alternative approach, would there be any value in introducing some sort of immutable object reference/handle as a property of the symbol primitive? Something like this:
const wm = new WeakMap();
const a = Symbol('A');
wm.set(a.ref, { /* metadata */ });
typeof a.ref === 'object'; // true
delete a.ref; // no-op or error
a.ref = {}; // no-op or error
a.ref.foo = 'bar'; // no-op or error? ... or permitted?
// Use case:
for (const p of Object.getOwnPropertySymbols(target)) {
const metadata = wm.get(p.ref);
if (metadata !== void 0) {
// ... reflect
}
}
What other approaches could allow metadata to be discoverable from a symbol reference, and to be garbage collected when the symbol reference is orphaned?
@erights
well, immediately encoded symbols was just one example of how disallowing Symbol weak map keys increases the space of implementation design alternatives.
I'm not sure whether limiting the Symbol space via immediate encoding would be a problem. I haven't worked through the numbers. NaN encoding would give you 52 bits of range. 64-bit tagged object references should give you 60 or more bits of range. The ES specification doesn't guarantee that programs won't run into resource exhaustion situations.
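For a rough sense of the numbers, here is a back-of-the-envelope sketch. The allocation rate of one billion fresh symbols per second is purely an assumption for illustration, since the thread gives no figure:

```javascript
// Hypothetical allocation rate; the thread gives no figure.
const SYMBOLS_PER_SECOND = 1e9;
const SECONDS_PER_DAY = 24 * 60 * 60;

// Seconds until a counter with `bits` bits of range runs out.
function secondsToExhaust(bits, ratePerSecond = SYMBOLS_PER_SECOND) {
  return Number(2n ** BigInt(bits)) / ratePerSecond;
}

const days52 = secondsToExhaust(52) / SECONDS_PER_DAY;
const years60 = secondsToExhaust(60) / (SECONDS_PER_DAY * 365.25);

console.log(days52.toFixed(1));  // 52.1 (days for a 52-bit counter)
console.log(years60.toFixed(1)); // 36.5 (years for a 60-bit counter)
```

At any more realistic rate of symbol creation, the time to exhaustion grows proportionally.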
I would think that a feature with potential value should generally never be blocked by implementation challenges except where the feature is fundamentally at odds with the goals described by the specification.
I would say the exact opposite. Every time a new feature is proposed its impact on current and future implementation needs to be considered and the speculative value of the feature needs to be traded-off against the speculative impact upon implementations. This isn't just about ease of implementation, I'm a strong believer that engine implementors should be earning their salaries. But, sometimes seemingly trivial features have significant non-local impacts on implementation level design that lead to significant performance impacts today or in the future.
@allenwb Fair points to be sure, and I absolutely agree. To rephrase, what I meant was that I believe it is important to exhaust potential approaches to a challenge as much as possible. Often there are good solutions that are not immediately apparent, and they are not arrived at when it matters because exploration of the problem ceased prematurely.
Note: Just looking at this proposal, which perhaps mitigates the metadata issue...?
https://github.com/michaelficarra/proposal-first-class-protocols
But implementations should keep in mind the expense of this technique. There are an infinite number of potential fresh unregistered symbol identities. With only an immediate representation, each one would need to be a unique bit pattern, say with a large counter. When this counter reaches its limit, the agent would need to be preemptively terminated, just as it must for OOM, even if there's plenty of free memory at that time. By avoiding heap allocation, in the limit, such termination becomes inevitable.
I don't understand how this is a concern for Symbols, but not for other random objects. As far as WeakMaps are concerned, there's no difference between let key = {}; wm.set(key, val); and let key = Symbol(); wm.set(key, val);. Is this what you mean by "avoiding heap allocation"?
@tabatkins
I am responding to @allenwb 's suggested possibility of an immediate-only (heapless) representation of symbols. All things which currently are potential weakmap keys (objects and functions) are heap allocated, so within the domain of the garbage collector to recycle their representation, while faithfully maintaining the illusion that there are an infinite number of potential fresh object identities. As long as Symbols are heap allocated, this isn't a problem for them either.
Although it is a problem in theory for an immediate-only representation of infinitely fresh things, @allenwb is right that an internal representation using increasing values from a 128 bit counter is effectively infinite for all engineering purposes.
I recently had the desire to use Symbols as WeakMap keys. I wanted to create a branding mechanism for my custom-made classes, so that in the case of code like
import Mixin from 'lowclass/Mixin'
import Class from 'lowclass'
const FooBrand = Symbol('FooBrand') // <--- THIS
export default Mixin(Base => {
return Class('Foo').extends(Base, ({Protected, Private}) => ({
// ... public props and methods ...
protected: {
// ... protected props and methods ...
},
private: {
// ... private props and methods ...
},
}, FooBrand) // <--- PASSED IN HERE
})
where Protected and Private are helpers for accessing protected or private members in the sense that Private(this).foo is similar to wm.get(this).foo, I could allow for Private and Protected access to work across instances made from the same class where the class is generated multiple times from differing invocations of the mixin.
Basically, passing the FooBrand should turn on "position privates" instead of "lexically scoped privates" (those concepts described in https://github.com/tc39/proposal-class-fields/issues/60).
Inside my Class() implementation, I wanted to use the passed-in symbols as WeakMap keys, but discovered that I can't, so I've switched to doing the following, which isn't as clean though it'll work just fine:
const FooBrand = { brand: 'Foo' } // <--- THIS
// ...
}, FooBrand) // <--- PASSED IN HERE
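A rough sketch of how an object brand could key the private storage, assuming one WeakMap per brand. The names privateStores and privateAccessorFor are hypothetical, and this is not lowclass's actual implementation:

```javascript
// brand object -> WeakMap(instance -> private state)
const privateStores = new WeakMap();

// Hypothetical helper: returns a Private()-style accessor shared by
// every class definition that passes the same brand.
function privateAccessorFor(brand) {
  let store = privateStores.get(brand);
  if (!store) {
    store = new WeakMap();
    privateStores.set(brand, store);
  }
  return (instance) => {
    let state = store.get(instance);
    if (!state) {
      state = {};
      store.set(instance, state);
    }
    return state;
  };
}

const FooBrand = { brand: 'Foo' };
const PrivateA = privateAccessorFor(FooBrand); // first mixin invocation
const PrivateB = privateAccessorFor(FooBrand); // second mixin invocation
const instance = {};
PrivateA(instance).secret = 42;
console.log(PrivateB(instance).secret); // 42
```

With symbols allowed as WeakMap keys, FooBrand could be Symbol('FooBrand') and the rest would stay the same.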
It makes semantic sense to be able to use Symbol('FooBrand') instead of {brand: 'Foo'} in my example. When we see {brand: 'Foo'} in the code, it raises questions like "What does the brand option do? What values can it have?" etc. Although the Symbol version also raises its own questions, I believe those questions are a subset of the ones raised by the object version, and Symbol is more clear that something more "meta" is being achieved with the class definition, without unused data hanging around.
I'm aware that if I do something like weakmap.set(Math, LARGE_OBJECT) that LARGE_OBJECT probably won't be collected, because even if possible, let's face the fact that no one is likely to delete global references to Math or other similar global objects.
I'm okay with (especially unregistered) Symbols as WeakMap keys in order to use my own symbols as keys.
I prefer to let _all_ symbols be treated the same (all usable as WeakMap keys). A developer needs to know that passing a Symbol.for() as key to a WeakMap means the value won't be collected; this is obvious. In general, someone who makes use of utilities like well-known symbols or registered symbols is very likely to be the type of person that will know that using those as WeakMap keys isn't a good idea.
Do implementations not GC symbols themselves when they are no longer referenceable? If Symbols participate equally in the GC I see no reason why they shouldn't participate equally in the WeakMap. But if they do not, then that would seem an adequate argument to me.
Only unregistered symbols can conceptually be gc'ed. The storage for a registered symbol might be gc'ed, but the identity is forever. Hence, weakmaps cannot use symbols as keys.
I'd think a registered symbol could be GC'd as well, as long as nothing else had a reference to the identity (not necessarily just a strong reference)?
Only unregistered symbols can conceptually be gc'ed. The storage for a registered symbol might be gc'ed, but the identity is forever. Hence, weakmaps cannot use symbols as keys.
How is this different from any object which could be stored into some sort of user land provided global registry and hence never gc'ed? It doesn't stop us from using objects as weakmap keys.
Perhaps we can restrict the consideration here to unregistered symbols then for the time being? I think that is the major use case in discussion.
@guybedford see https://github.com/tc39/ecma262/issues/1194#issuecomment-393143921; there seems to be a desire to either accept all symbols, or none.
Thanks for clarifying - my question was specifically asking if the GC treats Symbol as an equal or if as in Allen's comment at https://github.com/tc39/ecma262/issues/1194#issuecomment-418161074 this creates new overhead for the GC to manage or not. Then based on that answer to come back to this problem of registered v unregistered.
Every GC is different, but in general I would think the easiest approach would be for an engine to simply represent Symbol values internally as a special kind of object. With careful design, most GC algorithms would never have to discriminate between symbol values and objects.
Thanks @allenwb. If that is the case, then it might well be worth solving the registered v unregistered concerns here. Implementer feedback would be ideal though I guess on the above.
How is this different from any object which could be stored into some sort of user land provided global registry and hence never gc'ed? It doesn't stop us from using objects as weakmap keys.
Were we to waive the restriction for registered symbols, we should do so as well for all primitives: strings, numbers, booleans, null, undefined. We could allow them as keys in a weakmap just as we allow them as keys in a normal map.
However, there's an expectation that a key in a weakmap, if not otherwise retained by runtime state, can be unobservably collected. Were we to allow, for example, strings as keys in a weakmap, and someone generates a large randomish string that they are confident does not otherwise occur in the runtime state, many programmers would expect a weakmap to collect it. Analogs to weakmap that I am aware of in other languages would collect these --- which they can because they provide no guarantee of unobservability. Instead, for such primitive values, programmers would need to learn that the weakmap acts map-ish, not weakmap-ish.
Objects can generally be unobservably collected because they have fresh identity, and key comparison uses identity. Once an object is gone, it is not coming back.
@erights can you speak to https://github.com/tc39/ecma262/issues/1194#issuecomment-480044819 - it seems like global symbols could still be unobservably collected if nothing retained an identity reference to the symbol.
(non-symbol primitives are different, of course, because there's a syntactic way to reconstruct the identity - symbols do not have that)
How is this different from any object which could be stored into some sort of user land provided global registry and hence never gc'ed? It doesn't stop us from using objects as weakmap keys.
To be pedantic: The global symbol registry is the only possible global registry in JavaScript. Everything else is created as part of some realm and can be gc'ed if the realm as a whole becomes unreachable. This is important.
The global symbol registry exists in the spec only because it cannot and need not exist in practice. We carefully designed it so that it does not provide any communications channel, or stateful coupling of any kind, between subgraphs that are otherwise isolated. It has always bothered me that the spec describes it as if it is a universally shared mutable table, when in fact it implies no new mutable state whatsoever. Here's an alternate account which is observably equivalent, but omits any registry or mutation of any kind. It does not create any misleading appearance of connectivity between otherwise isolated realms:
An unregistered symbol is exactly as it is now.
A registered symbol is detectably distinct to the internal algorithms from an unregistered symbol. We could do this with another internal slot, or we could encode it in the type. I'm indifferent between these but let's try the second:
Symbol internally has two subtypes, UnregisteredSymbol and RegisteredSymbol.
The Symbol constructor at https://www.ecma-international.org/ecma-262/9.0/index.html#sec-symbol-description is identical except for this renaming:
19.4.1.1 Symbol ( [ description ] )
When Symbol is called with optional argument description, the following steps are taken:
1. If NewTarget is not undefined, throw a TypeError exception.
2. If description is undefined, let descString be undefined.
3. Else, let descString be ? ToString(description).
4. Return a new unique UnregisteredSymbol value whose [[Description]] value
is descString.
The Symbol.for operation would be similar to the Symbol constructor, except that it only allows string descriptions, and that it makes a RegisteredSymbol rather than an UnregisteredSymbol:
19.4.2.2 Symbol.for ( description )
When Symbol.for is called with argument description, the following steps are taken:
1. Let descString be ? ToString(description).
2. Return a new unique RegisteredSymbol value whose [[Description]] value
is descString.
We need a corresponding adjustment for Symbol.keyFor
19.4.2.6 Symbol.keyFor ( sym )
When Symbol.keyFor is called with argument sym it performs the following steps:
1. If Type(sym) is not Symbol, throw a TypeError exception.
2. If Type(sym) is UnregisteredSymbol, return undefined.
3. Assert: sym is a RegisteredSymbol.
4. Return sym.[[Description]]
Finally we get to the punchline. Assuming all the equality comparison operators bottom out in SameValueNonNumber https://www.ecma-international.org/ecma-262/9.0/index.html#sec-samevaluenonnumber , we split its step 7 as follows:
7. If Type(x) is UnregisteredSymbol, then
a. If x and y are both the same Symbol value, return true;
otherwise, return false.
8. If Type(x) is RegisteredSymbol, then
a. If x.[[Description]] and y.[[Description]] are exactly the same sequence
of code units (same length and same code units at corresponding indices),
return true; otherwise, return false.
IOW, UnregisteredSymbols are fresh. RegisteredSymbols, like strings, compare only based on their contents. Those contents are their [[Description]] strings, whose contents are then compared.
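The registry-free account can be modeled in plain JavaScript, as a sketch only. The classes stand in for the two internal subtypes, and makeSymbol, symbolFor, symbolKeyFor and sameSymbolValue are made-up names for the corresponding operations:

```javascript
// Model of the two internal subtypes; no shared table anywhere.
class UnregisteredSymbol {
  constructor(description) {
    this.description =
      description === undefined ? undefined : String(description);
  }
}
class RegisteredSymbol {
  constructor(description) {
    this.description = String(description);
  }
}

const makeSymbol = (description) => new UnregisteredSymbol(description);
const symbolFor = (description) => new RegisteredSymbol(description);

function symbolKeyFor(sym) {
  if (sym instanceof UnregisteredSymbol) return undefined;
  if (sym instanceof RegisteredSymbol) return sym.description;
  throw new TypeError('not a symbol');
}

// The split step 7/8 of SameValueNonNumber.
function sameSymbolValue(x, y) {
  if (x instanceof UnregisteredSymbol) return x === y; // fresh identity
  if (x instanceof RegisteredSymbol) {
    // Compared by contents, like strings.
    return y instanceof RegisteredSymbol && x.description === y.description;
  }
  throw new TypeError('not a symbol');
}

console.log(sameSymbolValue(symbolFor('a'), symbolFor('a')));   // true
console.log(sameSymbolValue(makeSymbol('a'), makeSymbol('a'))); // false
```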
@ljharb asks:
can you speak to #1194 (comment) - it seems like global symbols could still be unobservably collected if nothing retained an identity reference to the symbol.
(non-symbol primitives are different, of course, because there's a syntactic way to reconstruct the identity - symbols do not have that)
Let's take strings first. It doesn't matter whether there's a syntactic way to reconstruct them. What matters is that they can be reconstructed, whether statically or at runtime. Likewise, a registered symbol can always be reconstructed at runtime.
This is best demonstrated by falsifying the counter-example you might have in mind:
const wm = new WeakMap();
let houdini = Symbol.for('houd' + 'ini');
wm.set(houdini, 'harry');
houdini = null;
// The houdini registered symbol is only reachable as a key from the weakmap.
// Possible gc.
console.log(wm.get(Symbol.for('hou' + 'dini')));
This might print 'harry'. Therefore it must print 'harry'. Registered symbols are no more collectable than strings.
@erights
I think what we see in this thread and in other discussions about symbol keys in WeakMaps is that most developers have different intuitions about symbol values than they have about other primitive values. In particular, they see the "identity" of primitives such as numbers, strings, and booleans as being linked to some intrinsic value semantics of the entities rather than to the point of origin (creation) of the entity. The identities of symbols and objects are defined by their point of origin. In other words, objects and symbols must be explicitly instantiated (by some agent) while values of the other primitive types have a perpetual existence.
Because objects and symbols are both instantiated devs assume a symbol that becomes unreachable ceases to exist just like an object would. Hence, the intuition that it should be possible to use a symbol as a key of a WeakMap.
In reality, if a symbol isn't globally registered it is, for most purposes, replaceable with an object created as Object.freeze(Object.create(null)). So, a dev who wants to use a symbol as a WeakMap key could simply switch to using an object in place of the symbol. But that pretty quickly turns into an argument that you never need to use a symbol unless it is going to be globally registered. So why do we have Symbols? They're really a convenience provided because devs probably wouldn't routinely code Object.freeze(Object.create(null)). But that equivalence between symbols and objects leads devs to (I think reasonably) expect that symbols should be usable as WeakMap keys.
I agree that this expectation is the source of the confusion. The reason we introduced symbols anyway was to have a new namespace for property names, which Object.freeze(Object.create(null)) can't do.
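The two roles can be contrasted in a short sketch. The names `tag`, `token`, and `record` are made up for illustration; the only behavior assumed is standard property-key conversion and WeakMap's acceptance of object keys.

```javascript
const tag = Symbol('tag');
const token = Object.freeze(Object.create(null));

// A symbol is a property key that cannot collide with any string key.
const record = { tag: 'string-keyed' };
record[tag] = 'symbol-keyed';
console.log(Object.keys(record)); // only the string key 'tag' is listed
console.log(record[tag]); // 'symbol-keyed'

// A frozen null-prototype object cannot be converted to a property key:
// it has no toString or valueOf, so the conversion throws.
let propertyKeyError = null;
try {
  record[token] = 'never stored';
} catch (err) {
  propertyKeyError = err;
}
console.log(propertyKeyError instanceof TypeError); // true

// The same object is, however, a valid WeakMap key.
const wm = new WeakMap();
wm.set(token, 'harry');
console.log(wm.get(token)); // 'harry'
```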
Thus, no matter what we do, we will cause a surprise. Rather than follow the principle of least surprise, I'll go ahead and coin a term for what we've been doing instead: The principle of least damaging surprise. We've often said "static errors are better than dynamic errors" for this reason. Likewise, reliable and earlier errors are better than unreliable or later errors. We have a choice among three surprises:
Besides all the other arguments made above, note that the last of these surprises is the one most likely to be caught at development time, rather than mysteriously misbehaving in production deployment.
@erights
Registered symbols are no more collectable than strings.
True. But an unregistered symbol is just as collectable as an object. And I strongly suspect that most usages of symbols are unregistered.
What risk (or attack surface, if you prefer) is created by allowing registered symbols to be used as weakmap keys? The only one I can think of is potential resource exhaustion caused by permanent retention of large weakmapped values. But that sort of resource exhaustion potential always exists.
Then why not allow all values as weakmap keys? IOW, the first surprise choice above.
Then why not allow all values as weakmap keys?
Because using a string or number as a WeakMap key is almost certainly a bug while using a symbol almost certainly isn't. Now this conjecture is based upon my belief that globally registered symbols are rarely used and when they are used they are unlikely to be used as WeakMap keys. It would be good if we had some usage data. It is certainly possible that some devs are registering lots of symbols that don't need to be.
But the third choice (status quo) seems reasonable except for the intuition problem. That's how we arrived at the current design in the first place. I think we need more examples of real use cases such as in https://github.com/tc39/ecma262/issues/1194#issuecomment-417983496 .
Then why not allow all values as weakmap keys? IOW, the first surprise choice above.
Because, while symbols can be permanent, they're usually not (I share the same suspicion as @allenwb). Strings, numbers, and all the other primitives are always permanent.
Going back to your example in https://github.com/tc39/ecma262/issues/1194#issuecomment-480097587:
const registry = [];
Object.for = function(description) {
for (const { name, value } of registry) {
if (name === description) return value;
}
const value = Object.freeze(Object.create(null));
registry.push({ name: description, value });
return value;
}
console.log(wm.get(Object.for('hou' + 'dini')));
Besides the spec level details that I just can't recreate, I see this as exactly analogous to Symbol.for. And just because I _can make_ permanent objects doesn't mean that we should prevent them from being used as keys. I feel the same for symbols.
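For reference, here is a self-contained variant of that example that runs on its own. It is a sketch rather than the code above: `objectFor` and `objectRegistry` are hypothetical names, a Map replaces the array scan, and nothing is installed on `Object`.

```javascript
// Hypothetical stand-in for the Object.for example above.
const objectRegistry = new Map();

function objectFor(description) {
  let value = objectRegistry.get(description);
  if (value === undefined) {
    value = Object.freeze(Object.create(null));
    objectRegistry.set(description, value);
  }
  return value;
}

const weak = new WeakMap();
let houdiniKey = objectFor('houd' + 'ini');
weak.set(houdiniKey, 'harry');
houdiniKey = null;

// The registry still holds the key strongly, so the entry cannot be
// collected and the key can be reconstructed from its description.
console.log(weak.get(objectFor('hou' + 'dini'))); // 'harry'
```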
WeakMaps should accept any unforgeable key. Providing a global registry doesn't violate forgeability. The symbol returned by Symbol.for wasn't forged, it was the string value passed as an argument that was forged.
You cannot make permanent objects in JavaScript. Your registry above would not generally outlive the realm it was created in. By contrast, values are truly permanent.
A symbol returned by Symbol.for is forged in exactly the same sense that a string returned by string concatenation is forged.
I can install the exact same Object.for function instance in every realm. Again, I can't replicate all spec details in sample code, but imagine this is a permanent registry. The only difference between Symbol.for and my Object.for is that one is already provided by the language.
Using that, I still wouldn't argue that every object key should be permanently stored in WeakMaps. Your footgun was using the registry to begin with, not storing the permanent key in the map.
You cannot make permanent objects in JavaScript.
If you have a way to create a permanent realm you can make permanent objects.
@erights am I correct in assuming that your primary concern is that registered symbols created by a transient realm could today be deallocated when the realm goes away, if the symbol is otherwise unreferenced after collecting the realm? Allowing registered keys in weakmaps would preclude that.
@erights am I correct in assuming that your primary concern is that registered symbols created by a transient realm, today could be deallocated when the realm goes away if the symbol is otherwise unreferenced after collecting the realm.
No. Unless there's some equivalence I'm not seeing. My message above with the three surprises really does express my genuine concern: We have a choice between unpleasant surprises. The status quo seems less unpleasant than the alternatives raised here.
We have a choice among three surprises:
- Allow all values, including numbers, strings, symbols, to be used as weakmap keys and never collected. This will cause huge memory or out-of-memory conditions when programmers thought they were using weakmaps correctly --- patterns that would have been correct in other languages.
- Allow only unregistered symbols, creating a surprise, often at runtime, when we reject a registered symbol. I do not know of any other context in the language where one kind of symbol is accepted and another rejected.
- The status quo, in which it is surprising that unregistered symbols are rejected.
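To make the second option concrete, here is a rough sketch of the check it implies. The predicate name `canBeWeakKey` is an assumption for illustration only, and it relies on `Symbol.keyFor` returning undefined for every symbol that is not in the registry.

```javascript
// Hypothetical predicate for the "only unregistered symbols" option.
function canBeWeakKey(value) {
  if (typeof value === 'symbol') {
    // Registered symbols are the ones Symbol.keyFor can map back to a string.
    return Symbol.keyFor(value) === undefined;
  }
  return (typeof value === 'object' && value !== null) || typeof value === 'function';
}

console.log(canBeWeakKey({}));                   // true
console.log(canBeWeakKey(Symbol('local')));      // true
console.log(canBeWeakKey(Symbol.for('shared'))); // false
console.log(canBeWeakKey('string'));             // false
console.log(canBeWeakKey(Symbol.iterator));      // true
```

Note that well-known symbols such as Symbol.iterator pass this check because they are not in the registry, even though they are never collected either.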
I'd be curious to see how engines which do not back symbols by gc based objects, and instead use potentially copy based tagged types, feel about all this before jumping to conclusions about options based upon treating implementation of symbols as object based.
On Thu, Apr 4, 2019, 10:21 PM Richard Gibson wrote:
We have a choice among three surprises:
- Allow all values, including numbers, strings, symbols, to be used as weakmap keys and never collected. This will cause huge memory or out-of-memory conditions when programmers thought they were using weakmaps correctly --- patterns that would have been correct in other languages.
- Allow only unregistered symbols, creating a surprise, often at runtime, when we reject a registered symbol. I do not know of any other context in the language where one kind of symbol is accepted and another rejected.
- The status quo, in which it is surprising that unregistered symbols are rejected.
- Allow all values for which SameValueNonNumber compares by reference, corresponding to author perception of non-forgeability and drawing no distinction between registered and unregistered symbols.
quick breakdown:
v8 (HeapObject), chakra (RecyclableObject), jsc (JSCell), and sm (gc::TenuredCell) all represent symbols as heap objects. moddable doesn't have as clear an inheritance system but i couldn't find any evidence that their symbols are collected, they appear to just be flipping a flag on a stack struct.
p.s. thank you mark for that new mental model on registered symbols, it has made registered symbols in my engine quite a bit faster to work with.
To recap:
Does this seem correct? If we're at an impasse here I will close this.
@devsnek Could you (or anyone else) address the fear about memory leaks given that anyone can create objects at global scope and use them as WeakMap keys? i.e. this is a fear that is not inherently tied to symbols, and is neither more nor less difficult to avoid as an issue whether one is dealing with global objects or with global symbols. If you write bad code, you're gonna have a bad time no matter what you do.
@axefrog is perfectly correct.
@axefrog
Could you (or anyone else) address the fear about memory leaks given that anyone can create objects at global scope and use them as WeakMap keys?
The global scope is per realm. A realm can be gc'ed.
@devsnek, we are indeed at that impasse. I am fine with you closing this.
Well, I’d personally go with allowing all symbols, since the WeakMap and all its properties will be gc’ed once the realm is closed.
That said, a global object can be gc’ed if it’s assigned to mutable properties and variables and all of them get deleted or cleared, whereas a registered symbol won’t be gc’ed even if all references to it in all realms were to be cleared, and you’d need to gc the WeakMap itself.
If there were a separate primitive type where all instances were unforgeable, would it be acceptable to use that as a WeakMap key?
@littledan
Is that a question about making a new type specifically so that it can be held in weak collections? If so, that might be better in a different issue. Maintaining the 3 designs above written about by @erights would place some constraints on things, but I do think that a non-forgeable primitive type would need to have a clear value. Currently you can use Object.freeze({__proto__:null}) in many cases where a WeakMap key needs to be a reference type without exposing mutability on that key.
Yes, I agree that primitive types need to meet a very high bar to be added to the language. The value would specifically be for this purpose--so that contexts which can only hold primitives may be used to reference something via a WeakMap (c.f., RefCollections).
Symbol _is_ that type. Symbols are not forgeable, although there is a built-in (but deniable) cross-realm facility for mapping forgeable strings to Symbol instances.
@gibson042 I'd prefer to use Symbol for this too, but it sounds like others in this thread are saying, Symbol cannot be used largely because some of them are forgeable.
I'd rather not have this than introduce a new primitive.
Perhaps we should go down the path of categorizing keys based upon forgeability. It is mentioned above that this is a surprise, but if the surprise is roughly the same level as the current status quo and it allows for more use cases, I think we should take a serious look. That is, I think we should revisit the ability to use non-forgeable Symbols as keys if the alternative is a new primitive.
Two possible ways that we could go about this weakening:
Symbol.create() or something) which could be used as WeakMap keys (so you know what to dereference/so that the semantics of existing Symbols is completely unchanged). This could also solve the representation problem: only these new symbols would need a representation which permits this kind of tracking.
I think both of those options have been discussed above to some degree, under the idea that there shouldn't be some symbols that can be weakmap keys and some that can't.
@devsnek Yes, I understand that this is a disadvantage; how should we weigh the cost of that against the cost of not having some kind of capability in this area? (Making a separate, parallel type was another way of avoiding the mismatch.) It feels to me, overall, like this is a solvable problem, and we just have various tradeoffs that we can make about the solution.
@littledan i think it's less about specific tradeoffs and more that we're in deadlock (some people don't want all symbols, some people don't want only specific kinds of symbols)
Lots of times, you can get through deadlock by considering the whole space, the value of the proposal overall, and weighing it against the cost of various alternatives. We've gotten through various controversial/deadlocked things at TC39 before this way.
Overall, I think Symbols as WeakMap keys would be a very useful base for being able to reference objects from primitives (in the context of the Records and Tuples proposal), by indirecting through the WeakMap.
The implication chain goes like this:
- === does deep comparison
- === is a reliable operation, so it's not something that Proxy or Symbols or operator overloading could trap
- === semantics which are not based on object identity
- Symbols as WeakMap keys seems simpler than adding a "box" type, even though we don't need to use these things as property keys.
Such a system would preserve our typical invariants protecting membranes/ocap systems, since object operations are used to access the WeakMap. While it would be ergonomically nice and composition-promoting to have a single built-in mapping from Symbols (or some other primitive) to objects, this wouldn't meet ocap goals, since it would constitute built-in shared state, providing a cross-compartment communication channel for multiple compartments sharing the same frozen Realm containing all of TC39's unmodified primordials.
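The indirection described here might look roughly like the following sketch. Everything in it is hypothetical: `makeRefTable`, `ref`, and `deref` are invented names, a frozen plain object stands in for a Record, and the helper falls back to a strongly-held Map when the engine rejects symbols as WeakMap keys, in which case entries are never collected.

```javascript
// Hypothetical helper: hand out symbols that stand for objects.
function makeRefTable() {
  let table;
  try {
    table = new WeakMap();
    table.set(Symbol('probe'), {}); // throws where symbols are rejected as keys
  } catch (err) {
    table = new Map(); // fallback: strong references, nothing is collected
  }
  return {
    // Returns a fresh unregistered symbol that refers to obj.
    ref(obj) {
      const sym = Symbol('ref');
      table.set(sym, obj);
      return sym;
    },
    // Resolves a symbol back to its object; requires access to the table.
    deref(sym) {
      return table.get(sym);
    },
  };
}

const refs = makeRefTable();
const element = { tagName: 'div' };
const handle = refs.ref(element);

// Stand-in for a Record: it holds only primitives, including the symbol.
const record = Object.freeze({ id: 1, node: handle });
console.log(refs.deref(record.node) === element); // true
```

Because dereferencing requires the table object itself, there is no built-in shared mapping; whoever lacks the table cannot turn the symbol back into the object.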
The open questions I see are:
Symbol.iterator are similar to primordials like Object.prototype and Array.prototype, and registered Symbol.for() symbols are similar to properties of the global object. Just because these will stay alive doesn't mean we disallow them as WeakMap keys.
I hope we can work out a solution here among the available options.
I plan to propose "Symbols as WeakMap keys" for Stage 1 at the June 2020 TC39 meeting. By going through the stage process, I hope we can develop consensus on answers to these questions little by little.