Hi, Kyle. I have a question about this line in ch5.md:
> "Inheritance" implies a copy operation, [...]
This seems inaccurate to me. Creating objects doesn't trigger a copy operation, at least in the OO languages I'm familiar with. (I'm thinking about the implementation, so I might be missing the point!)
While the content around this line does a great job of explaining how JS does things, I think it gives readers a false impression of how it works in other languages. (In fact that is why I'm here—I ran into a programming teacher who read the page and came out with a good understanding of prototype chains but no understanding of what exactly is different in C# or Python.)
The intent of this statement is more about the concept of "inheritance" (mostly as the notion comes from C++ or Java) than the implementation of it in any specific language. The reason I use Java/C++ as the "standard" definition of classes and inheritance is that by and large, most people that I encounter that come to JS from some other OO language do come from that lineage. So how classes work in those languages dominates the thinking.
Certainly other OO languages (Smalltalk, Ruby, etc.) have different behaviors, so Java/C++ don't "own" the sole definition of inheritance. But IME they have a strong influence on what people expect (and don't expect) out of a class system.
One of the most common metaphors -- speaking only from my experience in CS degrees and from my reading of CS literature -- used to describe instantiation is the blueprint/building. In that case, the "relationship" between blueprint and building is fixed/static. That is, if you build a building, and then alter the blueprint (add a door), it doesn't retroactively change the physical building. Moreover, if you change the building (knock down a wall), it doesn't modify the blueprint. So in essence, the characteristics were copied from blueprint to building. There's no "live link" between them.
Also IME, one of the most common metaphors for parent-child class inheritance is genetic DNA, which again is fixed/static in its very nature. My son got a copy of my DNA when he was created, so our "relationship" was more a copy and not a live linkage. If he breaks his leg, my leg isn't broken, and if I lose my hair, (hopefully) he doesn't lose his.
In Java and C++, inheritance is essentially static. There is no "retroactive inheritance" where if a class is changed after instantiation, the instances (or derived classes) are affected. I think that's why these metaphors are often used in CS education centered around these languages -- it certainly was for both times I went through a CS program.
Because the JS prototype chain is not a copy operation, either conceptually or physically, it often leads to a common confusion (or surprise, at the very least) that it seems to have this "retroactive inheritance" (similar to Ruby or Python). I would also label this behavior "dynamic inheritance" as compared to the "fixed/static" kind of inheritance we see in Java and C++.
In my observations, many in JS (and surrounding dynamic languages) have dealt with the unwanted surprise of retroactive inheritance when using the prototype chain by instead mimicking the static nature of Java/C++ inheritance as literally copying properties from one object to another. This seems to be one, but certainly not the only, motivation for the "mixin" pattern, for example.
To summarize:
That's the motivation behind the statement and sections you cite.
I regret that it misleads people into thinking that all inheritance is that way. That was not my intent. I have plans to significantly revise the text in the second edition to try to make these distinctions more clear.
I should say, I do think "dynamic inheritance" (aka prototypal inheritance, etc.) systems would be much better served by a different term than "inheritance", such as "delegation", to more clearly indicate the live-link nature as opposed to the fixed/static nature.
Thanks for the very thorough explanation. It's great to see your thinking on this.
I agree, the blueprint-building analogy is quite common (for better or worse), and even more importantly, the idea of a one-time transfer is present in the everyday meaning of the word inheritance. These are misleading analogies, and it makes sense to address them. But the emphasis on "copying" unfortunately reinforces a different misleading part of the same analogy. So: yes to debunking. More of that! :)
Up to a point, all of these languages work the same. In C++, Java, C#, Python, Ruby, Smalltalk, and JS, each object has a link to some metadata that includes its methods and other class info (whether it's called a "vtable", "class object", "prototype", or "hidden class"). What's different about JS is that everything's exposed—in a way that's simple enough to be worth learning and using.
JS programmers reading your book are investing the time to learn prototypes. They might as well get the reward of knowing that other languages are doing something similar—just behind the scenes.
Closing, since I've put in my 2 cents twice now :)
@jorendorff Just as a point of curiosity, since it seems like you're more knowledgeable than I am: In C++, if a chain of class inheritance has no virtual methods, are all the methods still maintained separately (in their different levels) in a vtable or flattened out to a single definition?
I believe I recall a CS professor suggesting that the vtable was only maintained for virtual methods, and otherwise non-virtual methods (actually, pointers to them) were, more or less, actually physically copied down to the instance, for optimization purposes. But it's entirely possible that: my recollection/understanding is wrong, the professor was wrong, or this was the case at one point in C++'s history but isn't anymore, etc.
Always happy to talk about this stuff.
If a C++ class hierarchy has no virtual methods, then it has no vtables. Information about its methods, then, simply isn't available at run time at all. If you call a method on one of those classes, the compiler has to decide which method's being called at compile time. At run time it'll be too late to decide.
```cpp
struct Cow {
    void talk();
};

int main() {
    Cow daisy;     // C++ knows this is a Cow object...
    daisy.talk();  // ...so this calls Cow::talk()
    return 0;
}
```
On my computer, if I compile that (putting the body of Cow::talk() in a separate file, to keep the compiler from optimizing things away), the compiled code for main() includes this instruction:
-> 0x100000a5c <+12>: callq 0x100000a70 ; Cow::talk at cowtalk.cpp:7
The 0x100000a70 part is the address of the method Cow::talk(); callq is the instruction that does function calls on x86_64. You can see that the method address isn't part of the object; it's embedded in the machine code.
> I believe I recall a CS professor suggesting that the vtable was only maintained for virtual methods, and otherwise non-virtual methods (actually, pointers to them) were, more or less, actually physically copied down to the instance, for optimization purposes.
The first part is true: only virtual methods have vtable entries. But methods never took up space in C++ objects. I'm sure about this, because C++ programmers are freakishly dependent on objects being cheap. And they are: sizeof(Cow) in the above program is 1. That is, it's 1 byte, since there's nothing in it that requires any memory: no fields, no base classes, no virtual methods.
Thanks for that fantastic extra detail. Really appreciate it!