Njs: WIP: import module (part 1)

Created on 19 Jan 2019  ·  99Comments  ·  Source: nginx/njs

Hi @xeioex @igorsysoev

I try to start VM on compilation phase. Below is my understand.

  1. Dsl (njs) is a tool to help us write c-module. It also has two phases: config and runtime.
    At the configuration phase, we prepare all the stuff, especially define module/function (In c module and function in js).
    At the runtime phase, these different requests will find their corresponding module/function and execute.

  2. So in NJS, the most important thing is defining the function (it takes effect by njs_vm_start()).
    When a complete js code is executed, the side effect is global scopes values are set, it doesn't affect the lambda functions. We can start VM then clone new VM. It's safe to copy all the lambda function from the initial VM to new VM since the byte code only record the index of variables/values with different scopes.
    vm: njs_vm_start(), save global scopes.
    nvm: copy global scopes values, njs_vm_call().

  3. If we can do it, we can make the request process faster and the 'require' feature seems easier to implement.

This's a huge change, does it make sense?

diff -r 0813395557b6 nginx/ngx_http_js_module.c
--- a/nginx/ngx_http_js_module.c    Fri Jan 18 15:28:17 2019 +0300
+++ b/nginx/ngx_http_js_module.c    Wed Jan 16 14:11:13 2019 +0800
@@ -873,7 +873,6 @@ static ngx_int_t
 ngx_http_js_init_vm(ngx_http_request_t *r)
 {
     nxt_int_t                 rc;
-    nxt_str_t                 exception;
     ngx_http_js_ctx_t        *ctx;
     ngx_pool_cleanup_t       *cln;
     ngx_http_js_main_conf_t  *jmcf;
@@ -913,15 +912,6 @@ ngx_http_js_init_vm(ngx_http_request_t *
     cln->handler = ngx_http_js_cleanup_ctx;
     cln->data = ctx;

-    if (njs_vm_start(ctx->vm) == NJS_ERROR) {
-        njs_vm_retval_to_ext_string(ctx->vm, &exception);
-
-        ngx_log_error(NGX_LOG_ERR, r->connection->log, 0,
-                      "js exception: %*s", exception.length, exception.start);
-
-        return NGX_ERROR;
-    }
-
     rc = njs_vm_external_create(ctx->vm, njs_value_arg(&ctx->args[0]),
                                 jmcf->req_proto, r);
     if (rc != NXT_OK) {
@@ -2394,6 +2384,7 @@ ngx_http_js_include(ngx_conf_t *cf, ngx_
     ngx_memzero(&options, sizeof(njs_vm_opt_t));

     options.backtrace = 1;
+    options.init = 1;
     options.ops = &ngx_http_js_ops;

     jmcf->vm = njs_vm_create(&options);
@@ -2443,6 +2434,14 @@ ngx_http_js_include(ngx_conf_t *cf, ngx_
         return NGX_CONF_ERROR;
     }

+    if (njs_vm_start(jmcf->vm) == NJS_ERROR) {
+        nxt_str_t  exception;
+        njs_vm_retval_to_ext_string(jmcf->vm, &exception);
+        ngx_conf_log_error(NGX_LOG_EMERG, cf, ngx_errno,
+                           "js exception: %*s", exception.length, exception.start);
+        return NGX_CONF_ERROR;
+    }
+
     return NGX_CONF_OK;
 }

diff -r 0813395557b6 njs/njs.c
--- a/njs/njs.c Fri Jan 18 15:28:17 2019 +0300
+++ b/njs/njs.c Wed Jan 16 14:11:13 2019 +0800
@@ -368,6 +368,10 @@ njs_vm_clone(njs_vm_t *vm, njs_external_
         nvm->global_scope = vm->global_scope;
         nvm->scope_size = vm->scope_size;

+        memcpy(vm->global_scope,
+               (u_char *) vm->scopes[NJS_SCOPE_GLOBAL] + NJS_INDEX_GLOBAL_OFFSET,
+               vm->scope_size);
+
         nvm->debug = vm->debug;

         ret = njs_vm_init(nvm);
TEST_NGINX_BINARY=/usr/local/nginx/sbin/nginx prove js*
js2.t ................... ok
js_async.t .............. ok
js_internal_redirect.t .. ok
js_request_body.t ....... ok
js_return.t ............. ok
js_subrequests.t ........ skipped: JSON::PP not installed
js.t .................... ok
All tests successful.
Files=7, Tests=88,  3 wallclock secs ( 0.04 usr  0.01 sys +  0.46 cusr  0.13 csys =  0.64 CPU)
Result: PASS

https://github.com/nginx/njs/issues/85

feature

Most helpful comment

@hongzhidao,

Talking about moving njs_vm_start() to config phase. While it is a good thing to have (if we can solve all the issues), I think it not is essential for module support. That is an optimization, which we can do later.

njs_vm_relink_globals()

I thought about something like this. IMHO, this is a brittle way (a lot of things can go wrong). Moreover, it is not much faster than the current way (imagine relinking and copying large js objects). Also, I do not like the idea, we have to make a copy of global values for each new vm.

Here is an idea, I came up recently:
I think we can simplify the task here. To solve the problem with global objects, we can introduce the following optimization semantic: whenever a value declared as a constant, we try compute its value at the compilation phase. For example:
const a = {a:1, b:1};

currently it is created in runtime with the following code:

00000 OBJECT            60300000CDC0
00024 PROPERTY SET      60300000CD60 60300000CDC0 60300000CD90
00064 PROPERTY SET      60300000CD00 60300000CDC0 60300000CD30

for values declared as consts, and containing only literals (const a = {a: new Date()} is not supported, for example), we are replacing the code, with the njs_object value, where all the properties (configurable:false, writable:false) are in shared_hash. In such a case, it is safe to use the same njs_object across all the vm (because it is immutable). Any other values, are created at run-time (as it is now).
The advantage: it is faster, because no deepcopy and relinking is required. it seems simpler.

Again, moving njs_vm_start() to config phase, make a difference ONLY if you have a lot of global code, otherwise the difference is negligible. Currently, I see only one reason for global code: large object or array literals.

All 99 comments

@hongzhidao

This's a huge change, does it make sense?

I thought about it. It would be a great thing to have, but it seems to me, it will not work (at least, it is not that easy to do). If we did not have global values (except for function definitions), your suggestion would work.

Take a look at the following example:
Here we expect to get the same reply for each new request, but global_array will be the same for each request and results of push() will accumulate.

var global_array = [];

function content(r) {
 global_array.push(1);
 r.return(200, JSON.stringify(global_array));
}

If we make njs_vm_start() at the config phase, and copy global values. We cannot share them.
Probably, we can workaround it (using some form of copy-on-write mechanism), but it not easy also.
Ideas:
1) njs object structure has two hashes, normal and shared. After njs_vm_start() in the config phase, we can move the content of global object into shared_hash. Any changes at runtime will affect only hash structure.

2) js objects, make chains, so a global object from parent vm, can be topmost immutable object, while any changing in function will affect a local object.

If we can do it, we can make the request process faster and the 'require' feature seems easier to implement.

BTW, If you have only function definitions, njs_vm_start() would cost you almost nothing (take a look at njs disassemble output). On the other hand, if you have large objects literals, defined on the global level, njs_vm_start() would build them again and again for each request (see).

What version of nginx tests are you using?

I clone the latest version from http://hg.nginx.org/nginx-tests, I'll check it again. (It works well, I've tried it again.)
Do you also agree that it's a good idea if we can run VM on configuration phase without issues?

var global_array = [];

function content(r) {
 global_array.push(1);
 r.return(200, JSON.stringify(global_array));
}

I think it's not hard to solve. The important thing is how we process global values after executing.
Do you think it's common that developers need to write side effect code in global scope?
If not, it's no any difference to start VM on compilation or runtime phase.
But anyway, I prefer to start VM on compilation if we can solve the problems you mentioned.

Do you think it's common that developers need to write side effect code in global scope?

I am not sure. Probably it is not common. But, at least, I know people who want to define large object literals (as a built-in table for routing, for example). So, we definitely need a way to ensure it works somehow.

Do you also agree that it's a good idea if we can run VM on configuration phase without issues?

I think it is a good idea to try. But, before doing it, we need to understand clearly what limitations we are imposing. would not be these limitations too restrictive?

@hongzhidao

I clone the latest version from http://hg.nginx.org/nginx-tests, I'll check it again. (It works well, I've tried it again.)

I am sorry, I did not copy the last part of the patch. Tests pass. But these tests are too simple:)
There is just one test for global objects.

var global_array = [];

function content(r) {
 global_array.push(1);
 r.return(200, JSON.stringify(global_array));
}
$ curl http://localhost:8000/global
[1] 
$ curl http://localhost:8000/global
[1,1] 
$ curl http://localhost:8000/global
[1,1,1] 
$ curl http://localhost:8000/global
[1,1,1,1]

Do you think it's common that developers need to write side effect code in global scope?

I recalled, currently we use global variables as a way to share data between js_content handler and nginx njs variable handlers (see).

Dsl (njs) is a tool to help us write c-module. It also has two phases: config and runtime.
At the configuration phase, we prepare all the stuff, especially define module/function (In c module and function in js).

BTW. The idea for the future we discussed with @arut: The ability to register njs hooks for nginx phases.
Something like (definitely this code should be executed at configure phase):

nginx.register('access', myfunc);
nginx.register('preread', myfunc2);

Hi @xeioex

Take a look, it's a draft.
https://gist.github.com/hongzhidao/a42149acc5d0c1411aab25dda1b51354

There are some thoughts and questions.

  1. We should save the global values after njs_vm_start() at configuration phase.
    The vm->global_values, vm->global_size keep the global values like follow.
    [global values] + [complete arrays associated in global values] + [complete objects associated in global values]...
    It's a continuous memory block that will be conveniently copied.
    But we need relink the pointer at runtime, so we need to calculate the offset based on the start of the memory block.
  2. Relink the value's pointer whose type is a reference such as array, object. And we should be able to process the following case.
    var a = {}
    var b = {}
    a.x = b
    b.y = a
  3. 3.

Ideas:
njs object structure has two hashes, normal and shared. After njs_vm_start() in the config phase, we can move the content of global object into shared_hash. Any changes at runtime will affect only hash structure.
js objects, make chains, so a global object from parent vm, can be topmost immutable object, while any changing in function will affect a local object.

Good suggestion, but t may affect the result, such as

var g = [1, 2]
function foo() { g.push(3) }

I'm not sure.

  1. Igor suggests, at least to begin with, not to run modules code during the compilation phase, only compile it.

Does he mean, keep the current way: compile at configuration, start at runtime, including the required module body? This way can avoid the problems introduced the new way we discussed.

  1. Do you think it's possible developers can avoid performance problem by using correct way? Such as write code (not function) in global scope as few as possible.

  2. 6.

BTW. The idea for the future we discussed with @arut: The ability to register njs hook for nginx phases.
Something like (definitely this code should be executed at configure phase):
nginx.register('access', myfunc);
nginx.register('preread', myfunc2);

Seems good.
Can you give a complete example? I don't know where the codes (nginx.register) are placed.

@hongzhidao,

Talking about moving njs_vm_start() to config phase. While it is a good thing to have (if we can solve all the issues), I think it not is essential for module support. That is an optimization, which we can do later.

njs_vm_relink_globals()

I thought about something like this. IMHO, this is a brittle way (a lot of things can go wrong). Moreover, it is not much faster than the current way (imagine relinking and copying large js objects). Also, I do not like the idea, we have to make a copy of global values for each new vm.

Here is an idea, I came up recently:
I think we can simplify the task here. To solve the problem with global objects, we can introduce the following optimization semantic: whenever a value declared as a constant, we try compute its value at the compilation phase. For example:
const a = {a:1, b:1};

currently it is created in runtime with the following code:

00000 OBJECT            60300000CDC0
00024 PROPERTY SET      60300000CD60 60300000CDC0 60300000CD90
00064 PROPERTY SET      60300000CD00 60300000CDC0 60300000CD30

for values declared as consts, and containing only literals (const a = {a: new Date()} is not supported, for example), we are replacing the code, with the njs_object value, where all the properties (configurable:false, writable:false) are in shared_hash. In such a case, it is safe to use the same njs_object across all the vm (because it is immutable). Any other values, are created at run-time (as it is now).
The advantage: it is faster, because no deepcopy and relinking is required. it seems simpler.

Again, moving njs_vm_start() to config phase, make a difference ONLY if you have a lot of global code, otherwise the difference is negligible. Currently, I see only one reason for global code: large object or array literals.

@hongzhidao

Does he mean, keep the current way: compile at configuration, start at runtime, including the required module body? This way can avoid the problems introduced the new way we discussed.

yes.

Do you think it's possible developers can avoid performance problem by using correct way? Such as write code (not function) in global scope as few as possible.

I think so. At least it is a way to start. Also we can write a guide on how to write efficient njs code.

Currently, I see only one reason for global code: large object or array literals.

Add: These data is different from code, they are config data, many developers store them in 3rd service.

So your suggestion is split it into two parts?

  1. Keep the current way: compile at configuration, start at runtime, including the required module body? As to Igor's suggesion.
  2. For optimization, will add const keyword to create immutable objects, it's like vm->shared->values_hash?

Add: These data is different from code, they are config data, many developers store them in 3rd service.

yes, most of the time. But, imagine you want fastest routing (going to 3rd service can be slow).

Keep the current way: compile at configuration, start at runtime, including the required module body? As to Igor's suggesion.

exactly. I see it this way: require() becomes a call of anonymous, precompiled function (where a module's body resides), the only difference is: the result of the call is cached.

For optimization, will add const keyword to create immutable objects, it's like vm->shared->values_hash?

yes.

exactly. I see it this way: require() becomes a call of anonymous, precompiled function (where a module's body resides), the only difference is: the result of the call is cached.

As I mentioned before, the required body is an anonymous function. Even more the native module is a contructor function. require('fs') == new Fs(), but it works well now.

@xeioex

------ f.js -----------
var _m = {}

var g = []

_m.foo = function()
{
    g.push(1)
    console.log(JSON.stringify(g))
}

return _m
------ end of f.js -------

var m = require('f.js')
m.foo()

Perform as bellow code.

var f = (function() {
    var _m = {}

    var g = []

    _m.foo = function()
    {
        g.push(1)
        console.log(JSON.stringify(g))
    }

    return _m
})

var m = f()
m.foo()

I'll do it in this way, what do you think?

Hi @hongzhidao !

// var m = require('./m');
// ||
// var m = require('./m.js');

var mdef = (function(exports, require, module, __filename, __dirname) {
// - m.js
    var private0 = {};
    var private1 = [1,2,3];

    exports.pubFn0 = function() {
        return private0;
    };

    exports.pubFn1 = function() {
        return private1;
    };

    exports.pubVar = Array(100).fill(null);
// - m.js
});

var _module = {
    exports: {}
};

mdef(_module.exports, require, _module, './m.js', './');
return _module.exports;

see also https://nodejs.org/api/all.html#modules_exports_shortcut

@drsm Thank you.
Do you prefer to use a module like nodejs?
The purpose of a module is to help developers organize code, I'm used to the style of lua I have shown. It's simple.
So currently I want to implement it as minimal as possible. Then we can add more like nodejs base on it.
Because require is not semantic in JS. How about making it clear and simple?

@hongzhidao
I think is better to follow the POLA principle here :), as we mimic nodejs api.
But we can patch it later of cause...

Why not implement ES6+ import instead of non-standard require?

Why not to implement ES6+ import instead of non-standard require?

I like it. @xeioex, take a look again.

Why not to implement ES6+ import instead of non-standard require?
@hongzhidao @VBart

http://exploringjs.com/es6/ch_modules.html
While ES6 support is a big plus and we ultimately would need them. I am not sure we need to start with it now.
1) The require() way does not require changes to the parser.
2) when we understand how to do require() modules, it will be much easier to add support for import.
The main question now, how to integrate modules with njs.
3) require() interface is already present in njs.
4) if we do not want to promote outdated API (require()) we can avoid publishing it for a while.
5) I hope, that with require() support we will be able to load existing node.js modules (at least, simple ones).

To me, require() and import ways are quite similar, the difference is mainly syntactic.

@hongzhidao
We want to mimic the node.js module format, see:
http://www.tutorialsteacher.com/nodejs/nodejs-module-exports
https://nodejs.org/api/modules.html#modules_exports_shortcut

To illustrate the behavior, imagine this hypothetical implementation of require(), which is quite similar to what is actually done by require():

function require(/* ... */) {
  const module = { exports: {} };
  ((module, exports) => {
    // Module code here. In this example, define a function.
    function someFunc() {}
    exports = someFunc;
    // At this point, exports is no longer a shortcut to module.exports, and
    // this module will still export an empty default object.
    module.exports = someFunc;
    // At this point, the module will now export someFunc, instead of the
    // default object.
  })(module, module.exports);
  return module.exports;
}

Perform as bellow code.
do you want to embed modules' body directly into main block (js_include file)?
Many modules make require() for other modules. We have to avoid inserting multiple instances of the same module. Probably we have to scan the modules.

I see it this way:
at parsing phase, when require token is encountered with 'file.js' argument.
2) open the specified file
2) wrap in around with

(function () { 
    var module = { exports: {} }; 
    (function (module, exports) {
    // Module code here
 })(module, module.exports);
 return module.exports;})

3) here is the part when I am unsure:
compile the code using a separate parser. as output here we want njs_function_lambda_t
4) store the lambda for 'file.js' name in module_hash

at runtime:
1) if require('file.js') is called for the first time, call njs_function_lambda_t and get/cache the result.

My point, we definitely need to support one of the standard existing JS ways to load modules (require() or import). Not something homegrown, otherwise people will be confused. I prefer require(), because, IMHO, it will simpler to start with.

@hongzhidao

The purpose of a module is to help developers organize code, I'm used to the style of lua I have shown. It's simple.

Also it is a way to reuse existing code, which many people prefer. The ability to load existing modules is a huge boon.

I have some thoughts:

  1. require() is easy to use although it's non-standard, and it used to import native function. There's no other way be able to import native functions: fs, crypto.
  2. The standard way import, export is also simple. I guess it'll be welcome.
  3. The maximum value introducing nodejs style is we can integrated nodejs modules, but I'm not sure whether we can achieve it (integrated nodejs modules). I'm worried it may make njs too hard to use.

BTW, I agree the details you described how to implement. It's clear and I think so too.

Just 2-cents:

  • Two of the main features of require in Node.js are: it can load modules dynamically and it scans node_modules directory. That's not the case with njs. So, what's currently implemented, as well as what's proposed to be implemented is somewhat custom.
  • The Node.js team is working on import support: https://nodejs.org/api/esm.html

@hongzhidao

The standard way import, export is also simple. I guess it'll be welcome.

I am OK with import way.

require() is easy to use although it's non-standard, and it used to import native function. There's no other way be able to import native functions: fs, crypto.

as stated in, builtin modules can be loaded as

import { readFile } from 'fs'

@hongzhidao

I think we can start with the following pattern:

const MY_CONST = ···;
function myFunc() {
    ···
}

export { MY_CONST, myFunc };

There is a piece of information I get.

Reference: http://exploringjs.com/es6/ch_modules.html

  1. Each module is a piece of code that is executed once it is loaded.
  2. Imports and exports must be at the top level
  3. import and export statements have no dynamic parts (no variables etc. are allowed).

Try in chrome:

// main.js
console.log('start main');
import './lib.js';
console.log('end of main');
// lib.js
console.log('start lib');
import { baz } from './lib2.js'
baz();
var a = 3;
function foo() {
    console.log('foo in lib, a is ' + a);
}
export { foo }
console.log('end of lib');



md5-94933eaf526696194de0b1b977346c31



// lib2.js
console.log('start lib2');
import { foo } from './lib.js'
foo();
var a = 5;
function baz() {
    console.log('baz in lib2, a is ' + a);
}
export { baz }
console.log('end of lib2');



md5-94933eaf526696194de0b1b977346c31



Result:
start lib2
foo in lib, a is undefined
end of lib2
start lib
baz in lib2, a is 5
end of lib
start main
end of main



md5-2f3dcba5e1e5735235132457b8de6f0d



njs_parser_scope_begin(NJS_SCOPE_MODULE);
njs_parser_scope_end(...);

Parse export keyword:
The export information will be managed by the current scope.
Parse import keyword:
It will behave like njs_parser_variable_reference(vm, parser, node, NJS_REFERENCE);

Finally, we should move this scope to the topmost scope (NJS_SCOPE_GLOBAL) by 'insert queue head'
Then we generate code.

  1. At runtime phase.
    We run the modules first, then run the main script.

Welcome to suggest.

@hongzhidao ,

njs_parser_scope_begin(NJS_SCOPE_MODULE);
The export information will be managed by the current scope.
It will behave like njs_parser_variable_reference(vm, parser, node, NJS_REFERENCE);

I like it, its seems better.

we only need to introduce new scope with NJS_SCOPE_MODULE,

it should be top level scope like NJS_SCOPE_GLOBAL? If so, probably we do not need NJS_SCOPE_MODULE.

In general, I support the suggested way.

OK, I try to implement it first.
And I want to make it simpler by limiting import position.
In the document, 'import' is allowed to used in any module, but I think it's better to put it only in the main script. Thus can avoid cyclic dependencies. Such as:

//main.js
import  'lib.js';
import  'lib2.js';
...
// lib.js
// import is not allowed here
...

@hongzhidao ,

We can start with the limited case, where import is only available in main. But, I think, it should not be much harder to do (we can simply do not support cyclic dependencies).

@xeioex @VBart @drsm

njs_parser_scope_begin(NJS_SCOPE_MODULE);
The export information will be managed by the current scope.
It will behave like njs_parser_variable_reference(vm, parser, node, NJS_REFERENCE);

here is the part when I am unsure:
compile the code using a separate parser. as output here we want njs_function_lambda_t

it should be top-level scope like NJS_SCOPE_GLOBAL? If so, probably we do not need NJS_SCOPE_MODULE.

What we consider are the same.

I have some thoughts, see the following example:

// main.js
import  'a.js';
import  'b.js';

The final result is:

struct njs_parser_s {
    njs_lexer_t                     *lexer;
    njs_parser_node_t               *node;
    njs_parser_scope_t              *scope;
    njs_parser_t                    *parent;
+   nxt_queue_t                     modules; (i)
};
struct njs_parser_scope_s {
+   njs_parser_node_t               *node; (ii)

    nxt_queue_link_t                link;
    nxt_queue_t                     nested;
...



md5-05ecfe173b59649f8bf64d524cdfb7a7



-njs_parser_node_t *
+nxt_int_t
 njs_parser(njs_vm_t *vm, njs_parser_t *parser, njs_parser_t *prev)
 {
     njs_ret_t           ret;
@@ -145,7 +145,9 @@ njs_parser(njs_vm_t *vm, njs_parser_t *p
     node->token = NJS_TOKEN_END;
     node->scope = parser->scope;

-    return node;
+    parser->scope = node;
+
+    return NXT_OK;
 }



md5-94933eaf526696194de0b1b977346c31



    In njs_parser_function_declaration

    parser = njs_parser_function_create(vm, parser);
    if (nxt_slow_path(parser == NULL)) {
        return NJS _TOKEN_ERROR;
    }

    token = njs_parser_function_lambda(vm, parser, function->u.lambda, token);



md5-024a26edb812ca4684780e0778daf86d



njs_generate_scope(njs_vm_t *vm, njs_generator_t *generator, njs_parser_node_t *node);
vs
njs_generate_scope(njs_vm_t *vm, njs_generator_t *generator, njs_parser_scope_t *scope);



md5-dcf6f5b9863a07350efa39fe23812537



loop njs_generate_scope(parser->modules);
njs_generate_scope(parser->scope);

But at runtime phase, it may behave like as usual.

I think the most important thing is introducing nodes in njs_parser_scope_t;

@hongzhidao

It seems we need to refactor njs_parser first, I want to eliminate BLOCK and SHIM,
https://gist.github.com/hongzhidao/c796541819ba6e6ea78d9bf2a1290073

while I like the proposed patch because it simplifies the code. I do not think we should eliminate neither BLOCK nor SHIM scopes. Because, logically they make sense.

1) BLOCK scope is a stub for let and const declarations (let and const has block scope). Tests pass because let and const are not implemented yet.

2) SHIM is a special scope for at least two things:

function expressions:

(function f(args) { 
/* f should be available only here (for recursion, for example), not in the enclosing scopes */ 
})

catch blocks:

try {
...
} catch (e 
/* e is declared in SHIM scope, 
it should not be available in the enclosing scopes, 
it is also different from let e below */) 
{
let e = 1;
}

While it is possible to implement SHIM var using other ways (special flags for a variable, for example), it complicates logic somewhere else.

@hongzhidao

It'll not be required to created new parser, since each scope has nodes, and I think generator will become eaiser.
I think the most important thing is introducing nodes in njs_parser_scope_t;

I am positive about this idea, and about simplification of the code in general.

OK, we keep the two scopes BLOCK and SHIM.
It's too early to change the code before 'let' and 'const' are implemented.
So I'll make njs_parser_scope_t has its own nodes first.

@xeioex

So I'll make njs_parser_scope_t has its own nodes first.

https://gist.github.com/hongzhidao/f9efbe7b0bf939756024675854dd89ce

@hongzhidao

Thanks! Committed with minor changes.

@xeioex

Simplified demo:

import foobaz {
    var a = 1;
    var b = 1;
    var c = 1;
}

import other {
    var a = 1;
    var b = 1;
    var c = 1;
    var d = 1;
}

-d

00000 MOVE              0012 21AF250
00032 MOVE              0022 21AF250
00064 MOVE              0032 21AF250
00096 STOP              21AF260
00000 MOVE              0013 21AF250
00032 MOVE              0023 21AF250
00064 MOVE              0033 21AF250
00096 MOVE              0043 21AF250
00128 STOP              21AF260
00000 STOP              21AF260
typedef enum {
    NJS_SCOPE_ABSOLUTE = 0,
    NJS_SCOPE_GLOBAL = 1,
+   NJS_SCOPE_MODULE = 2,
    NJS_SCOPE_CALLEE_ARGUMENTS = 3,
} njs_scope_t;

In njs_parser_scope_begin

     scope->type = type;

+    if (type == NJS_SCOPE_GLOBAL) {
+        type += NJS_INDEX_GLOBAL_OFFSET;
+
+        parser->nmodules = 0;
+        nxt_queue_init(&parser->modules);
+    }
+
+    if (type == NJS_SCOPE_MODULE) {
+        type += parser->nmodules + sizeof(njs_value_t);
+        parser->nmodules++;
+    }
+
+    scope->next_index[0] = type;
+    scope->next_index[1] = 0;
+
     if (type == NJS_SCOPE_FUNCTION) {
-        scope->next_index[0] = type;
-        scope->next_index[1] = NJS_SCOPE_CLOSURE + nesting
-                               + sizeof(njs_value_t);
-
-    } else {
-        if (type == NJS_SCOPE_GLOBAL) {
-            type += NJS_INDEX_GLOBAL_OFFSET;
-        }
-
-        scope->next_index[0] = type;
-        scope->next_index[1] = 0;
+        scope->next_index[1] += NJS_SCOPE_CLOSURE + nesting
+                                + sizeof(njs_value_t);
     }

In njs_vm_init

    vm->scopes[NJS_SCOPE_GLOBAL] = (njs_value_t *) values;
    memcpy(values + NJS_INDEX_GLOBAL_OFFSET, vm->global_scope,
           vm->scope_size);

ADD:
    i = 0;

    for (lnk = nxt_queue_first(&vm->modules);
         lnk != nxt_queue_tail(&vm->modules);
         lnk = nxt_queue_next(lnk))
    {
        generator = nxt_queue_link_data(lnk, njs_generator_t, link);

        values = nxt_mem_cache_align(vm->mem_cache_pool, sizeof(njs_value_t),
                                     generator->scope_size + sizeof(njs_value_t));

        if (nxt_slow_path(values == NULL)) {
            return NXT_ERROR;
        }

        memcpy(values, generator->local_scope, generator->scope_size);

        vm->scopes[NJS_SCOPE_MODULE + i] = (njs_value_t *) values;

        i++;
    }

This is the first draft. I don't want to put MODULE and GLOBAL together, though it seems next_index[1] of global scope can be used. It's similar the closure, and I also don't want to regard module as an anonymous function, it's not clear enough now.

As usual, I need to refactor code first, such as 'scope_size, global_scope, etc'. It becomes kind of confusing after introducing module.

@hongzhidao

As usual, I need to refactor code first, such as 'scope_size, global_scope, etc'. It becomes kind of confusing after introducing module.

Agree. With refactored and clean code it is much easier to introduce new features.

vm->scopes[NJS_SCOPE_MODULE + i] = (njs_value_t *) values;
It's similar the closure, and I also don't want to regard module as an anonymous function

While you way of allocating separate scope for each new module may work, it looks limited to me.
We have only 4 bits for scope in njs indexes. 9-10 are already used, we have only 6 values left.

don't want to regard module as an anonymous function

why do you think anonymous functions are not applicable here?

njs_value_t lifetime is permanent now.
If we start only with the following syntax:
```js
lib.js
...
export { MY_CONST, myFunc }

main:
import * as lib from 'lib';
`` I think, we can transform: export { MY_CONST, myFunc }`
into njs_object creation + return statement. AFAIK, we do not have to care about scope type for variables inside module, they only have to be local (closures are not local, not every variables should be available outside the module).

import * as lib from 'lib' into var lib = load_module('lib') instructions.
Where load_module() is internal function, which finds already precompiled lambdas, and calls them.

@hongzhidao

Some more thoughts:
1) main code will only work with var lib index, which is global index.
2) assignment instruction would move njs_value_t of the export object into that index.
3) lib code is completely independent from main code; main does not care which scope types are used inside lib code.

The situation with closures is different. You can have several layers of enclosed closures, each of them has its own scope, and each has direct access to variables of closures above.

I think, we can transform:
export { MY_CONST, myFunc }
into njs_object creation + return statement. AFAIK, we do not have to care about scope type for variables inside module, they only have to be local (closures are not local, not every variables should be available outside the module).

Good idea. I'll consider regarding the module as an anonymous function.

  1. Do you mean variables outside function is not allowed?
// lib.js
var a;
function foo() {};
  1. Can modules access global variables?

@hongzhidao

  1. Do you mean variables outside function is not allowed?

No, they are allowed.

// lib.js
var a=1, b=10;
function foo() {return b++;};
export {a, foo};

My idea, we have to transform it internally into something like ->

// lib.js
(function (/* no args */) {
var a = 1, b = 10; // b is local to lib
function foo() {return b++;}; // foo() is closure
return {a:a, foo:foo};
})

```js
// main
import * as m from 'lib';
m.foo() // -> 10
m.foo() // -> 11

->

```js
// pseudocode, to be done in C
var __modules = {lib: 
{ lambda: (function (/* no args */) { var a=1, b=10; function foo() {return b++;}; return {a:a, foo:foo}; }), 
  value: undefined }
}

function load_module(name) {
    var module = __modules[name];

    if (!module.value) {
        module.value =  module.lambda();
    }
    return module.value;
}
// main
var m = load_module('lib')
m.foo() // -> 10
m.foo() // -> 11

I see a problem with 16.3.5 requirement with my way (scalar values like number will be copied, not 'referenced'), but it seems minor, and we can solve it later (by using for example NJS_OBJECT_(BOOLEAN|NUMBER|STRING) value types).

  1. Can modules access global variables?

According to this table modules have no access to global this (== global variables are in fact should reside in that object (which is another topic to fix in njs))

While parsing/compiling module as a function, we should enforce it, using special flag in parser for example.

@xeioex

Refactor: Introduced njs_scope_next_index().
https://gist.github.com/hongzhidao/f064ac6732215f7233af4a650d73bb6f

See the demo first.

// main.js
import * as m1 from '/tmp/foo.js';
import * as m2 from '/tmp/foo.js';
import * as m3 from '/tmp/baz.js';
import * as m4 from '/tmp/baz.js';

m1.foo();
m1.foo();
m2.foo();
m2.foo();
m3.foo();
m4.foo();
// foo.js
var a = 10;
function foo() {
    a++;
    console.log('lib1 foo: ' + a);
    return;
}
export {foo: foo}



md5-94933eaf526696194de0b1b977346c31



// baz.js
var a = 10;
function foo() {
    a++;
    console.log('lib2 foo: ' + a);
}
export {foo: foo}



md5-2ba92fb47fd6baeec2016ddb662cf52c



'lib1 foo: 11'
'lib1 foo: 12'
'lib1 foo: 13'
'lib1 foo: 14'
'lib2 foo: 11'
'lib2 foo: 12'



md5-6a3c7b7cd731d6a3113704c9c191e8f8



import * as m4 from 'lib.js';  // Support
import { foo, baz } from 'lib.js';  // Not support

export { foo, baz }; // Support
export function foo()  {} // Not support



md5-72a54744e975217e86fefb6e1a1a48de



var a = 1, b = 2;
var c = {a, b}; // var c = {a: a, b: b}



md5-356e92e74f0c18350995fda9092f8a81



// pseudocode, to be done in C
var __modules = {lib: 
{ lambda: (function (/* no args */) { var a=1, b=10; function foo() {return b++;}; return {a:a, foo:foo}; }), 
  value: undefined }
}

function load_module(name) {
    var module = __modules[name];

    if (!module.value) {
        module.value =  module.lambda();
    }
    return module.value;
}
// main
var m = load_module('lib')
m.foo() // -> 10
m.foo() // -> 11



md5-94933eaf526696194de0b1b977346c31



// pseudocode, to be done in C
var __modules = {lib: (function (/* no args */) { var a=1, b=10; function foo() {return b++;}; return {a:a, foo:foo}; })()
}

function load_module(name) {
    var value = __modules[name];
    return value;
}
// main
var m = load_module('lib')
m.foo() // -> 10
m.foo() // -> 11



md5-002650d529ed0d6090a35c5bb0707c51



typedef struct {
    nxt_str_t                   name;
    njs_object_t                object;
/* !!!
 * I can't find a better way to generate index, it happens at parser now.
 * See njs_module_add in njs_module.c line 301.
 */ 
    njs_index_t                 index;    
} njs_module_t;

#define NJS_MODULE_BODY_START   "(function() {"
#define NJS_MODULE_BODY_END     "})()"



md5-b3f45454773d5877260961f83ad45f4b



import . from 'lib'?
import . from './lib'?
import . from 'lib.js'?
import . from './lib.js'?
import . from '/.../lib'?
  1. Native has not been supported yet, I'll try to support.

@hongzhidao Great job! Looks promising.

njs_module_body()

I think,
1) we can expect only normal files (not stdin pipes). So, we do not need realloc(), only malloc(). Because we always know the file size.

2) to avoid copying it again in njs_module_wrap_body() we can insert NJS_MODULE_BODY_START right here.

if (parser->mode == NJS_PARSER_MODE_SCRIPT || scope->nesting != 0) {
https://gist.github.com/xeioex/8cc4ff79ae5cff4cb1ae10be08f02372

BTW, it seems to me it is not necessary. So, we can easily support nested imports.

import * as baz from '/tmp/baz.js'

var a = 10;
function foo() {
        a++;
            console.log('lib1 foo: ' + a);
                return;
}
export {foo: foo, baz:baz.foo}
>> import * as lib from '/tmp/foo.js'
undefined
>> lib.foo()
'lib1 foo: 11'
undefined
>> lib.baz()
'lib2 foo: 11'
undefined
>> 

It's better to only use 'namespace', that we can make restrictions for unifying style.

Agree.

We need think of supporting shorthand property names

Agree.

How to find the module file, which type do you like the most?

That is a big question. I have to discuss it with colleagues.
I personally prefer the shortest version
import . from 'lib'

Error information, I prefer to add the file name before token line such as xxx in foo.baz:3.

Agree.

I will look deeper into the code tomorrow.

Yep, if we regard modules as calling expression, it's easy to support nested import.
What I want to do better is how to deal with the index/value of module.
It's calculated at 'phaser' (see njs_module_add in njs_module.c line 301).
Ideally, it's calculated at 'generator'. And the following is not graceful.

static nxt_int_t
njs_generate_module(njs_vm_t *vm, njs_generator_t *generator,
    njs_parser_node_t *node)
{
    nxt_int_t           ret;
    njs_module_t        *module;

    module = (njs_module_t *) node->index; (i)

    node->index = module->index; (ii)

    if (node->left != NULL) {
        ret = njs_generator(vm, generator, node->left);
        if (nxt_slow_path(ret != NXT_OK)) {
            return ret;
        }
    }

    return NXT_OK;
}

DONE
parser:

    node = njs_parser_node_alloc(vm);
    if (nxt_slow_path(node == NULL)) {
        return NJS_TOKEN_ERROR;
    }

    node->token = NJS_TOKEN_MODULE;
    node->left = parser->node;
    node->index = module->index;

    if (parser->node) {
        parser->node->dest = node;
    }

generator:

static nxt_int_t
njs_generate_module(njs_vm_t *vm, njs_generator_t *generator,
    njs_parser_node_t *node)
{
    nxt_int_t          ret;

    if (node->left != NULL) {
        ret = njs_generator(vm, generator, node->left);
        if (nxt_slow_path(ret != NXT_OK)) {
            return ret;
        }
    }

    return NXT_OK;
}

But it'll be better if we can eliminate NJS_TOKEN_MODULE, or it's fine to introduce this token.

  • we can expect only normal files (not stdin pipes). So, we do not need realloc(), only malloc(). Because we always know the file size.
  • to avoid copying it again in njs_module_wrap_body() we can insert NJS_MODULE_BODY_START right here.

Introduced njs_module_file_t.

@hongzhidao

typedef struct {
    int             fd;
    char            *file;
    ssize_t         body_size; // < I am not sure that we need a special field for it
                                           // body_size = length - nxt_length(NJS_MODULE_BODY_START)
                                           //                     - nxt_length(NJS_MODULE_BODY_END)

    char            *wrap_start; // < I am not sure that we need a special field for it
    char            *wrap_end;  // < I am not sure that we need a special field for it

    size_t          length;
    u_char          *start;
} njs_module_file_t;

Why not simply allocate a single malloc() block and copy module prefix and suffix?
We can replace fields with macros, like

#define njs_module_body(mf) (mf->start + nxt_length(NJS_MODULE_BODY_START))

Why not simply allocate a single malloc() block and copy module prefix and suffix?

What's the size value passed to malloc()?
Is nxt_length(NJS_MODULE_BODY_START) + module_body_size + nxt_length(NJS_MODULE_BODY_END)
or module_body_size?

@xeioex simplified the njs_module_file_t, take a look again.

What's the size value passed to malloc()?

nxt_length(NJS_MODULE_BODY_START) + module_body_size + nxt_length(NJS_MODULE_BODY_END)

I think, we can tolerate duplication of NJS_MODULE_BODY_START, because it is small.
But it simplifies the code.

@hongzhidao

njs_module_size()

I think, it is better to merge it with njs_module_wrap_body() + njs_module_body() and name as njs_module_open().
So njs_module_file_t.fd is also not needed.

char *file;

IMHO, it is better to make it nxt_str_t.

size_t length;
u_char *start;

nxt_str_t body; , also.

@xeioex Refactor: remove njs_module_file_t.

https://gist.github.com/hongzhidao/00160a9661c072b98c11a50cb781a6d6#file-module-01-patch-L269

To simplify a bit,

goto fail;

...

fail:

    if (body->start != NULL) {
        free(body->start);
    }

    close(fd);

    return NXT_ERROR;

https://gist.github.com/hongzhidao/00160a9661c072b98c11a50cb781a6d6#file-module-01-patch-L273

What would happen if you try to open a directory? Probably, here we need a syntax error? What nodejs does here?

What would happen if you try to open a directory? Probably, here we need a syntax error?
It'll cause an error while reading.

SyntaxError: failed to read file: '/tmp' (Is a directory) in 3

What nodejs does here?

It caused the same error.

Error: Cannot find module 'http2'
Error: Cannot find module '/tmp'

How is this?

SyntaxError: Cannot find module '/tmp/foo2.js' in 3
njs_parser_syntax_error(vm, parser,
                                "Cannot find module '%.*s'",
                                (int) name->length, name->start);

(I don't like this format)
Error: Cannot find module '/tmp/foo2.js'
njs_error(vm, parser,
               "Cannot find module '%.*s'",
               (int) name->length, name->start);

I'll unify the error information, it'll be processed later.
At first, the most important thing is to make the code run passed and make sure the design of implementation is OK, sorry for that I didn't mention the initial intention.

I'll unify the error information, it'll be processed later.

OK. It is a separate topic.

design of implementation is OK

I am going to look into design deeper today.

@hongzhidao

How to find the module file, which type do you like the most?

spec says:

It refers to those modules via module specifiers, strings that are either:
Relative paths ('../model/user'): these paths are interpreted relatively to the location of the importing module. The file extension .js can usually be omitted.
Absolute paths ('/lib/js/helpers'): point directly to the file of the module to be imported.

I suggest,
1) Through njs_vm_opt_t njs gets a list of path prefixes. njs has to check them in order (from first to last).
2) if 'module specifiers' is absolute (starts from '/'), it is used as is.
3) otherwise: try every combination of 'prefix' + 'module specifiers' + ('.js')
4) nginx modules need js_modules directive (similar to perl_modules).
5) njs CLI needs command line option (-m '/prefix1', -m '/prefix2') (and possible NJS_PATH env variable support).
6) parent and cloned VMs share the list in njs_vm_shared_t.

@hongzhidao

It is also a good time to add auto tests.
1) add support for modules paths to njs cli.
2) add some tests to njs_expect_test.exp

@hongzhidao

In general, I am OK with the current design. It is simple enough, but it is a good start and we improve it gradually.

@xeioex

  1. See the draft of processing error.
    https://gist.github.com/hongzhidao/597084054c50a3c13a91d2874e9daaaf

I reference to PHP, it's clear enough. What do you think?

// 1.php
<?php
require_once("/tmp/2.php");
// 2.php
<?php
ff aa cc



md5-94933eaf526696194de0b1b977346c31



> php /tmp/1.php
PHP Parse error:  syntax error, unexpected T_STRING in /tmp/2.php on line 2



md5-279757d2ce3a66674b50789b6e874295



-    njs_syntax_error(vm, "%s in %u", buf, parser->lexer->line);
+    njs_syntax_error(vm, "%s on line %u", buf, parser->lexer->line);
  1. Do you think it's time to introduce customized sprintf like nxt_sprintf in UNIT?
  2. Actually, I really want to rename nxt_mem_cache_pool as nxt_mp, I prefer the style in UNIT, it's a litter too long now.

@hongzhidao

I would prefer the concise syntax:
file.name:<line>

The same as lua, for example,

2017/10/05 12:49:37 [notice] 24894#24894: start worker process 24896
2017/10/05 12:50:53 [error] 24895#24895: *1 failed to run balancer_by_lua*: balancer_by_lua:2: module 'dnsutil' not found:
        no field package.preload['dnsutil']
  ...

or gcc

njs/njs_parser.c: In function ‘njs_parser_escape_string_create’:
njs/njs_parser.c:2576:5: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘uint64_t’
     uint64_t      u;
     ^
njs/njs_parser.c:2607:21: error: ‘hex_length’ undeclared (first use in this function)
                     hex_length = 4;

Moreover, we already have this format, but without filenames so far,

2017/10/12 16:22:42 [error] 4326#4326: *1 js exception: SyntaxError: Unexpected end of input at position 0
at JSON.parse
at bar (:33)
at exception_var (:37)
at main

No problem, I didn't notice it.

  1. Use the format '%s:%u'?
-    njs_syntax_error(vm, "%s in %u", buf, parser->lexer->line);
+    njs_syntax_error(vm, "%s in %s:%u", buf, parser->lexer->file, parser->lexer->line);
  1. The structure 'njs_lexer_t' need a field 'file' (char */nxt_str_t) to indicate the file infomation?
    I think it's better to put it in njs_lexer_t than in njs_parser_t.

Do you think it's time to introduce customized sprintf like nxt_sprintf in UNIT?

it is.

Actually, I really want to rename nxt_mem_cache_pool as nxt_mp, I prefer the style in UNIT, it's a litter too long now.

Agree.

Use the format '%s:%u'?

yes.

I think it's better to put it in njs_lexer_t than in njs_parser_t.

agree.

Actually, I really want to rename nxt_mem_cache_pool as nxt_mp, I prefer the style in UNIT, it's a litter too long now.

I want to do the thing first, it's simple and make the code clean.

@xeioex , I hope you like it :)
https://gist.github.com/hongzhidao/e2c41b448754797ef0d8b22b8dd8134d

Actually, I really want to rename nxt_mem_cache_pool as nxt_mp, I prefer the style in UNIT, it's a litter too long now.

Will do.

Great job, we do the same thing, ah.

    1.
-const nxt_mem_proto_t  njs_vm_mem_cache_pool_proto = {
+const nxt_mem_proto_t  njs_vm_mp_proto = {
     njs_alloc,
  1. replace mcp as mp.
  2. rename vm->pool as vm->mem_pool like in UNIT.
struct nxt_runtime_s {
    nxt_mp_t               *mem_pool;

I should notice your reply first. At that time, I was enjoying in refactoring the same thing. But anyway, you can review the patch I uploaded, it's kind of different with you.
https://gist.github.com/hongzhidao/e2c41b448754797ef0d8b22b8dd8134d

@hongzhidao
1, 2 - DONE

rename vm->pool as vm->mem_pool like in UNIT

don't you think vm->pool is a bit shorter? And more in line with nxt_lvlhsh_query_t.

don't you think vm->pool is a bit shorter?

I personally prefer mem_pool, maybe I'm affected by UNIT.
And it can be well to difference from nginx's pool. (static vs dynamic)

And more in line with nxt_lvlhsh_query_t.

It's different, because the pool of nxt_lvlhsh_query_t is a hook, and it's type is 'void *'.

And it can be well to difference from nginx's pool. (static vs dynamic)

OK. to be in line with UNIT.

I like the way you maintain NJS.
And we can port some code from UNIT, such as nxt_array:

In NJS: NXT_EXPORT void *nxt_array_add(nxt_array_t *array, const nxt_mem_proto_t *proto,
    void *pool);
In UNIT: NXT_EXPORT void *nxt_array_add(nxt_array_t *array);

It's more concise. (I can do it.), but it's also important that keep njs as minimal as possible.

+    mp = nxt_mp_create(&njs_vm_mp_proto, NULL, NULL, 2 * nxt_pagesize(),
+                        128, 512, 16);
+    if (nxt_slow_path(mp == NULL)) {

style? (2 places)

+    mp = nxt_mp_create(&njs_vm_mp_proto, NULL, NULL, 2 * nxt_pagesize(),
+                       128, 512, 16);
+    if (nxt_slow_path(mp == NULL)) {

Style

-    requests = nxt_mem_cache_zalloc(vm->mem_cache_pool,
-                                    nxt_nitems(nxt_test_requests)
-                                    * sizeof(njs_unit_test_req_t));
+    requests = nxt_mp_zalloc(vm->mem_pool, nxt_nitems(nxt_test_requests)
+                                       * sizeof(njs_unit_test_req_t));

In nxt_mp.h/c, I prefer to name mp, it's important. (Reference to UNIT)

-NXT_EXPORT nxt_bool_t nxt_mem_cache_pool_is_empty(nxt_mem_cache_pool_t *pool);
-NXT_EXPORT void nxt_mem_cache_pool_destroy(nxt_mem_cache_pool_t *pool);
+NXT_EXPORT nxt_bool_t nxt_mp_is_empty(nxt_mp_t *pool);
+NXT_EXPORT void nxt_mp_destroy(nxt_mp_t *pool);
-NXT_EXPORT nxt_bool_t nxt_mem_cache_pool_is_empty(nxt_mem_cache_pool_t *pool);
-NXT_EXPORT void nxt_mem_cache_pool_destroy(nxt_mem_cache_pool_t *pool);
+NXT_EXPORT nxt_bool_t nxt_mp_is_empty(nxt_mp_t *mp);
+NXT_EXPORT void nxt_mp_destroy(nxt_mp_t *mp);

@hongzhidao applied.

https://gist.github.com/xeioex/ab35a00b4a6993b38d2d22b053439369

If no major correction is needed I plan to commit in 15 minutes.

Hi @xeioex @hongzhidao !

//! should fail with: SyntaxError: Identifier 'x' has already been declared
import * as x from './module_simple.js';
import * as x from './module_simple.js';

//+ ok
//import * as x from './module_simple.js';
//import * as xx from './module_simple.js';
//console.log(x, 'true:', x === xx);

//! an object expected
import * as y from './module_empty.js';
console.log(y, typeof y);

//! should fail before all above
import * as z from './module_error.js';
$ cat module_simple.js 
var x = 0;
var y = [];
var z = function(...a) {

};

function xyz() {
x = 0;
}

export { x: x, y: y, z: z, zyz: xyz };
$ cat module_empty.js 
$ cat module_error.js 
throw new Error("module_error");

@drsm
Thank you, I'll see it tomorrow.

@hongzhidao committed, thanks!

@xeioex @drsm

The previous design should be changed.

See the demo first.

// main.js
console.log('main start');

import * as m1 from '/tmp/lib1.js';
import * as m2 from '/tmp/lib2.js';

m1.foo();
m1.foo();
m1.foo();
m2.foo();
m2.foo();

console.log('main end');
// lib1.js
console.log('lib1 module');

var a = 10;

function foo()
{
    a++;
    console.log('lib1 foo: ' + a);
}

return {foo: foo}



md5-94933eaf526696194de0b1b977346c31



// lib2.js
console.log('lib2 module');

var a = 10;

function foo()
{
    a++;
    console.log('lib2 foo: ' + a);
}

return {foo: foo}



md5-5cc7c3a5522954723a7eec35c28d560e



'lib1 module'
'lib2 module'
'main start'
'lib1 foo: 11'
'lib1 foo: 12'
'lib1 foo: 13'
'lib2 foo: 11'
'lib2 foo: 12'
'main end'



md5-e24b892e1367f881682bb7c80eb5e612



// lib.js
(function (/* no args */) {
var a = 1, b = 10; // b is local to lib
function foo() {return b++;}; // foo() is closure
return {a:a, foo:foo};
})
// main
import * as m from 'lib';
m.foo() // -> 10
m.foo() // -> 11
->

// pseudocode, to be done in C
var __modules = {lib: 
{ lambda: (function (/* no args */) { var a=1, b=10; function foo() {return b++;}; return {a:a, foo:foo}; }), 
  value: undefined }
}

foreach (__modules as module) {
  module.value =  module.lambda();
}

function load_module(name) {
    var module = __modules[name];
    return module.value;
}
// main
var m = load_module('lib')
m.foo() // -> 10
m.foo() // -> 11



md5-ebe5ad369b38fb40c4ef264d303c1e51



var a;
function foo() {}
export {foo:foo}
export {foo:foo} // Not allowed since foo has been exported.
export {a:a} // export can be used many times and conbime together
// Finally, the function returns {foo:foo, a:a}

Draft. (export has not been processed yet)
https://gist.github.com/hongzhidao/a87aca99920eee77444e2b8c75ccb529

! should fail with: SyntaxError: Identifier 'x' has already been declared

Agree, will be processed later.

! an object expected

Agree, will be processed later.

! should fail before all above
import * as z from './module_error.js';

Be solved by the patch.

@hongzhidao

All the modules should be run before the main script.
Regard modules as lambda functions with permanent index/value.
export is different from the return.

agree.

export has not been processed yet, and it's not good to run modules in njs_vm_start

Spec says Module imports are hoisted (internally moved to the beginning of the current scope). We can try it.

The idea is to process the node tree after parsing, and move nodes related to import to the top.

@hongzhidao

Do you think it's time to introduce customized sprintf like nxt_sprintf in UNIT?

I am on it.

Spec says Module imports are hoisted (internally moved to the beginning of the current scope). We can try it.

The idea is to process the node tree after parsing, and move nodes related to import to the top.

I think the purpose of 'hoist' is to make 'import' be used anywhere. Now we've allocated index (module->index) before parsing. So it's ok to put the nodes related to import after 'NJS_TOKEN_MODULE'.
(But it only processes lambda, import nodes are still generated by order. I'll think of it.)

I updated it again.
https://gist.github.com/hongzhidao/a87aca99920eee77444e2b8c75ccb529

@hongzhidao

and it's not good to run modules in njs_vm_start

I wonder, why not to call njs_vm_module() from njs_vm_start()? Fro usage perspective, it seems too low level.

@hongzhidao

and it's not good to run modules in njs_vm_start

I wonder, why not to call njs_vm_module() from njs_vm_start()?

OK, I'll move it into vm_start, here's new updated patch.
https://gist.github.com/hongzhidao/9b2f7a0a14510dcb9e6a639659201a74

@hongzhidao

https://gist.github.com/hongzhidao/9b2f7a0a14510dcb9e6a639659201a74#file-module-03-patch-L51
vm->current = current;

I think we can move it out of loop.

@xeioex
This code is very similar to njs_vm_call. I'll process it again.
Now, I'm on njs_parser.c, I'm thinking of "Module imports are hoisted".
Ideally, I hope parser->scope->node is always unchanged and introduce njs_parser_statement_add().
This function will push new node related to statement to parser->scope->node.
Thus it's more flexible to change new generated nodes position.
And parser->node is a temporary node recording the result of njs_parser_statement/_chain.

Ideally, I hope parser->scope->node is always unchanged.

Good idea.

@hongzhidao

https://gist.github.com/hongzhidao/9b2f7a0a14510dcb9e6a639659201a74#file-module-03-patch-L51
vm->current = current;

I think we can move it out of loop.

@xeioex Introduced njs_function_continuation(), what do you think of?
https://gist.github.com/hongzhidao/82acf5624516e4a47b3f00b42f2ee40c

@xeioex Introduced njs_function_continuation(), what do you think of?
https://gist.github.com/hongzhidao/82acf5624516e4a47b3f00b42f2ee40c

@hongzhidao
I do not like the name of the new function (IMHO, it is long and confusing).
How about extending njs_vm_call with retval index?

rename njs_vm_call(...) -> njs_vm_invoke(..., retval)
njs/njs.h:
#define njs_vm_call(...) njs_vm_invoke(..., NJS_INDEX_GLOBAL_RETVAL)

define njs_vm_call(...) njs_vm_invoke(..., NJS_INDEX_GLOBAL_RETVAL)

Good, updated.
https://gist.github.com/hongzhidao/9b2f7a0a14510dcb9e6a639659201a74

Hi @xeioex @hongzhidao !
How about an idea to pregenerate code for some type of storage first, and then generate modules code?
something like this (complete pseudo code):
https://gist.github.com/drsm/ccd187fac4e6a5c9b8caa5f2368941b9

@drsm thank you.

  1. Actually, it doesn't matter to the order of generating modules code. Since the result of generating modules codes is to set lambda->start and we've precalculated the index/value of lambda function.
    I think it's ok now.
  2. What I'm not sure is the implementation of 'export', I'll think of your suggestion.
  3. I think it's better to implement 'Module imports are hoisted', I need to deep into njs_parser.c.
  4. The format 'import { clear as c2 }' does not plan to support recently, see:

It's better to only use 'namespace', that we can make restrictions for unifying style.
Agree.

  1. Whether support cyclic dependencies depends on the first version implementation.

We can start with the limited case, where import is only available in main. But, I think, it should not be much harder to do (we can simply do not support cyclic dependencies).

@hongzhidao

What I'm not sure is the implementation of 'export', I'll think of your suggestion.

I see two options here:
1) support, for now, only one export instruction (hoist it down and make return out of it).
2) more advanced version


    1. make an object at start of the module (__export). // NJS_TOKEN_OBJECT, njs_vmcode_object during generation phase


    1. transform export {..} into

      for k,v in {...}: __export[k] = v; // njs_vmcode_property_set

You can find some ideas here:
njs_parser_object()


    1. return __export at the end of the module.

@hongzhidao @xeioex
is there a simple solution for this case?

//1.js
import * as exp from './2.js';

console.log(exp.a);
exp.b();
console.log(exp.a);

//2.js
var a = 0;

function b() {
  a++;
}

//export { a, b };
//export { a as a, b as b };
export { a: a, b: b };

can we transform it into:

var _e = { 2: { a: undefined, b: undefined }};
(function() {
_e[2].a = 0;

function b() {
  _e[2].a++;
}
_e[2].b = b;
})();
console.log(_e[2].a);
_e[2].b();
console.log(_e[2].a);

@hongzhidao @xeioex
is there a simple solution for this case?

//1.js
import * as exp from './2.js';

console.log(exp.a);
exp.b();
console.log(exp.a);

//2.js
var a = 0;

function b() {
  a++;
}

//export { a, b };
//export { a as a, b as b };
export { a: a, b: b };

can we transform it into:

var _e = { 2: { a: undefined, b: undefined }};
(function() {
_e[2].a = 0;

function b() {
  _e[2].a++;
}
_e[2].b = b;
})();
console.log(_e[2].a);
_e[2].b();
console.log(_e[2].a);

See this and each module should be isolate.

// lib.js
(function (/* no args */) {
var a = 1, b = 10; // b is local to lib
function foo() {return b++;}; // foo() is closure
return {a:a, foo:foo};
})
// main
import * as m from 'lib';
m.foo() // -> 10
m.foo() // -> 11
->

// pseudocode, to be done in C
var __modules = {lib: 
{ lambda: (function (/* no args */) { var a=1, b=10; function foo() {return b++;}; return {a:a, foo:foo}; }), 
  value: undefined }
}

foreach (__modules as module) {
  module.value =  module.lambda();
}

function load_module(name) {
    var module = __modules[name];
    return module.value;
}
// main
var m = load_module('lib')
m.foo() // -> 10
m.foo() // -> 11

@hongzhidao thank you.

See this and each module should be isolate.

ok, so as for now we can export only constants...

@hongzhidao thank you.

See this and each module should be isolate.

ok, so as for now we can export only constants...

No, variables are allowed.
We'll implement it as stand-way but as minimal as possible. Suggestions are welcome.

It's better to create a new issue and rearrange what's clear and will be implemented.

We'll implement it as stand-way but as minimal as possible.

Agree. In this ticket we are implementing minimal viable modules support. We can extend functionality later.

@hongzhidao

more advanced version

BTW, we can easily avoid (i) and (iii) by extending wrapper
from (function(){/*module*/}) ->

js (function() { var __export = {}; (function(__ex) {/*module*/})(__export); return __export; })
index for '__ex' is known at compile time ((njs_index_t) (1 | NJS_SCOPE_ARGUMENTS)).

So, what is left to do is to transform export {...} expressions.

The discussion is moved to #88.

Was this page helpful?
0 / 5 - 0 ratings