Hi,
You Don't Know JS is definitely a great resource for learning how JavaScript works. Thanks so much for sharing it.
While learning about the compilation phase and how hoisting works, I got a little confused. I would appreciate it if you could shed some light on it.
In Chapter 1: What is Scope?, there is a paragraph about the compiler:
It may be self-evident, or it may be surprising, depending on your level of interaction with various languages, but despite the fact that JavaScript falls under the general category of "dynamic" or "interpreted" languages, it is in fact a compiled language. It is not compiled well in advance, as are many traditionally-compiled languages, nor are the results of compilation portable among various distributed systems.
When you said "it is in fact a compiled language", does it mean:
(1) Most JavaScript engines have a compiler inside
or
(2) Every JavaScript engine must have a compiler inside
I think it's the first one, because we can write a JavaScript interpreter without a compiler inside, so a compiler is not mandatory for a JavaScript engine.
And in Chapter 4: Hoisting, there is a paragraph about how hoisting works:
To answer this question, we need to refer back to Chapter 1, and our discussion of compilers. Recall that the Engine actually will compile your JavaScript code before it interprets it. Part of the compilation phase was to find and associate all declarations with their appropriate scopes.
When you explain how hoisting works in JavaScript, you are actually talking about how it works in a specific (or the most common) JavaScript engine, right? This is related to my first question, because we can imagine a JavaScript engine without a compiler, in which case there is no compilation phase at all.
So when you use the term "compilation phase", you have already assumed that the JavaScript engine has a compilation phase.
If you look at the ECMAScript spec (I took the 3rd edition as an example rather than the latest version, just to keep things simple), there is a paragraph about how hoisting works (10.1.3 Variable Instantiation):
Every execution context has associated with it a variable object. Variables and functions declared in the source text are added as properties of the variable object. For function code, parameters are added as properties of the variable object.
Which object is used as the variable object and what attributes are used for the properties depends on the type of code, but the remainder of the behaviour is generic. On entering an execution context, the properties are bound to the variable object in the following order:
There are two phases for processing the execution context code:
The ECMAScript spec says nothing about compilation, which makes sense to me, because it should not describe any implementation details.
To sum up, when we are talking about how hoisting works, can we say that there are two answers from different perspectives?
From the language spec perspective, on entering the execution context, the variable object is created, and each variable declaration creates a property on it that is initialized with undefined. That's why we can access the variable during the code execution phase.
From the implementation perspective, for common JavaScript engines, all declarations, both variables and functions, are processed first, which explains how hoisting works.
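For concreteness, here is a small sketch of the observable behaviour that both perspectives are trying to explain. The names (demo, count, greet) are made up for illustration and do not come from the book or the spec.

```javascript
// Hypothetical example: `count` and `greet` are declared at the bottom of the
// function, yet both are already known on its first line.
function demo() {
  const before = count;     // no ReferenceError: the binding exists, value is undefined
  const greeting = greet(); // a function declaration is fully usable before its line

  var count = 1;
  function greet() {
    return "hi";
  }

  return { before, greeting, after: count };
}

const demoResult = demo();
console.log(demoResult); // { before: undefined, greeting: 'hi', after: 1 }
```

Whichever way it is described, the result is the same: the declarations are in place before the first statement of the function runs.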
Thanks.
When you said "it is in fact a compiled language", does it mean:
(1) Most JavaScript engines have a compiler inside
or
(2) Every JavaScript engine must have a compiler inside
I think it's the first one, because we can write a JavaScript interpreter without a compiler inside, so a compiler is not mandatory for a JavaScript engine.
The spec requires early errors (not just syntax errors, but others) to be reported, which requires parsing the entire program before executing any of it. Parsing is the second of the three steps of classical compilation (tokenizing/lexing, parsing, and code generation). The result of parsing is generally an abstract syntax tree (AST).
The spec does not require that the final step (code generation) be done in the strictest sense, but I can't imagine any sensible approach (performant, etc.) that would get all the way to an AST and then go back to the original JS source and interpret it line by line. That would be silly at best.
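A minimal sketch of what early errors look like in practice. It uses `new Function` only as a convenient way to hand the engine a separate piece of source text; the `log` array and the source string are made up for illustration.

```javascript
// Hypothetical example: the source below has a valid first statement and an
// early error (a duplicate `let` declaration) after it.
const log = [];
let caught = null;

try {
  const fn = new Function(
    "log",
    "log.push('first statement ran'); let a; let a;"
  );
  fn(log); // never reached: the error is raised while the source is processed
} catch (err) {
  caught = err;
}

console.log(caught instanceof SyntaxError); // true
console.log(log.length);                    // 0
```

If the source were executed statement by statement, the `log.push(...)` call would have run before the duplicate declaration was reached. It does not, which suggests the whole body was processed before any of it executed.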
All engines take that AST and then turn it into instructions that can actually do stuff on the system. Those instructions are almost always some sort of binary intermediate representation (IR) of the program, which you could roughly think of as a byte code. That IR is handed off to some part of the engine that can "interpret" those instructions into machine instructions. That part is usually thought of as a JS virtual machine (and in some cases, literally called that).
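To make the "AST, then instructions, then something that runs the instructions" idea less abstract, here is a toy sketch. It is not how any real engine is implemented: the AST shape, the instruction names, and the functions generate and run are all invented for this example.

```javascript
// Hypothetical toy AST for the expression (2 + 3) * 4.
const ast = {
  type: "mul",
  left: {
    type: "add",
    left: { type: "num", value: 2 },
    right: { type: "num", value: 3 },
  },
  right: { type: "num", value: 4 },
};

// Toy "code generation": flatten the tree into a linear list of stack instructions.
function generate(node, out = []) {
  if (node.type === "num") {
    out.push(["push", node.value]);
  } else {
    generate(node.left, out);
    generate(node.right, out);
    out.push([node.type]); // "add" or "mul"
  }
  return out;
}

// Toy "virtual machine": runs the instructions without looking at the tree or the source.
function run(code) {
  const stack = [];
  for (const [op, arg] of code) {
    if (op === "push") {
      stack.push(arg);
    } else {
      const right = stack.pop();
      const left = stack.pop();
      stack.push(op === "add" ? left + right : left * right);
    }
  }
  return stack.pop();
}

const bytecode = generate(ast);
const result = run(bytecode);
console.log(bytecode.map((instr) => instr.join(" ")).join("; "));
// push 2; push 3; add; push 4; mul
console.log(result); // 20
```

The point of the sketch is only the shape of the pipeline: once the instructions exist, the part that executes them never needs the original text again.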
When you take all of this into account, and consider the spirit of what's happening, JS is far closer in spirit and in practice to compiled (in the traditional sense) languages than to interpreted (in the traditional sense) languages.
JS is a compiled language.
The ECMAScript spec says nothing about compilation, which makes sense to me, because it should not describe any implementation details.
Everything you've cited is indeed related to the execution phase, which happens after the first phase (processing, compilation, whatever you want to call it) has occurred. None of that stuff changes what happens during compilation, nor implies whether compilation happens or not. It only implies that there must in fact be a processing phase before execution. We can quibble about what to call that phase, but it's undebatable that this phase must happen.
All the stuff about "entering execution contexts" and "adding to variable object" and all that... that's all entirely consistent with steps for execution of any representation of code; none of that requires that the execution is literal line-by-line interpretation of actual JS source code. It's entirely plausible for all of that stuff to apply to executing/interpreting byte-code-ish representations that were code-generated after JS was parsed.
In fact, there's a pretty strong implication of that processing having happened, even in that language cited, that it already knows about all the variables in the scope, no matter where they've been declared, even if on the last line of a scope.
By contrast, nowhere in that wording do you see it adding variables to the scope whenever it encounters var statements, as if processing line by line. You see basically "add all of them at once". Ask yourself... how would a JS engine know what variables are in the scope if it hadn't first parsed and processed that scope?
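A short sketch of that question in code. The names label and readLabel are made up for illustration.

```javascript
// Hypothetical example: the inner `var label` sits on the last line before the
// return, yet it already shadows the outer `label` on the function's first line.
var label = "outer";

function readLabel() {
  const seen = label; // undefined, not "outer"
  var label = "inner";
  return seen;
}

const seenValue = readLabel();
console.log(seenValue); // undefined
```

If the scope were built up one `var` statement at a time, the first line of readLabel would still see "outer". It sees undefined, so the inner declaration must have been found before that line ran.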
It just doesn't stand to reason to look at what JS is doing and conclude that this is all "more like interpretation" than it is like "compilation". The latter is the only sensible conclusion, IMO.
Thanks so much for the prompt reply, it's clear.