Njs: WIP: import module (part 2)

Created on 31 Jan 2019  路  86Comments  路  Source: nginx/njs

Hi @xeioex

Improved njs_parser_node_alloc().
https://gist.github.com/hongzhidao/ee4e004082e46f44cd2d0b23104c2fed

One place I'm not sure
https://gist.github.com/hongzhidao/ee4e004082e46f44cd2d0b23104c2fed#file-parser-01-patch-L233

diff -r ac9b1c01a9b0 njs/njs_parser.c
--- a/njs/njs_parser.c  Wed Jan 30 18:50:33 2019 +0300
+++ b/njs/njs_parser.c  Tue Jan 29 11:26:34 2019 +0800
@@ -1043,6 +1043,7 @@ njs_parser_switch_statement(njs_vm_t *vm
                         return NJS_TOKEN_ERROR;
                     }

+                    branch->token = NJS_TOKEN_CASE;
                     branch->right = node;

                } else {

https://github.com/nginx/njs/issues/86 @VBart @drsm welcome to here.

feature

All 86 comments

@hongzhidao

Improved njs_parser_node_alloc().
https://gist.github.com/hongzhidao/ee4e004082e46f44cd2d0b23104c2fed

Suggestions:

#define  njs_parser_node_new(vm, parser) njs_parser_node_alloc(vm, parser, NJS_TOKEN_ILLEGAL)

@xeioex
Do you mean this?

@@ -134,7 +136,7 @@ njs_parser(njs_vm_t *vm, njs_parser_t *p
     if (node == NULL) {
         /* Empty string, just semicolons or variables declarations. */

-        node = njs_parser_node_alloc(vm, parser, NJS_TOKEN_DUMMY);
+        node = njs_parser_node_new(vm, parser);
         if (nxt_slow_path(node == NULL)) {
             return NXT_ERROR;
         }

@hongzhidao yes. To get rid of extra special token.

How about?

njs_parser_node_alloc -> njs_parser_node_new
njs_parser_node_new -> njs_parser_default_node

I prefer to eliminate njs_parser_node_alloc.

njs_parser_node_alloc -> njs_parser_node_new

is OK for me.

njs_parser_node_default

I do not have a good alternative too. What do you think of njs_parser_node_new(vm, parser, 0)? It is short, and less confusing than NJS_TOKEN_ILLEGAL.

@hongzhidao

https://gist.github.com/hongzhidao/238a65df6d2245844c8a148d1635b8f0#file-parser-02-patch-L518

what is the point of moving it up?

@xeioex It's my mistake. Can you fix it?

diff -r 100ca8dab64f njs/njs_parser.c
--- a/njs/njs_parser.c  Tue Jan 29 16:30:13 2019 +0800
+++ b/njs/njs_parser.c  Tue Jan 29 16:58:38 2019 +0800
@@ -2076,8 +2076,6 @@ njs_parser_builtin_object(njs_vm_t *vm,
         scope = scope->parent;
     }

-    node->scope = parser->scope;
-
     hash = parser->lexer->key_hash;
     name = &parser->lexer->text;

@@ -2097,6 +2095,8 @@ njs_parser_builtin_object(njs_vm_t *vm,
         return NJS_TOKEN_ERROR;
     }

+    node->scope = parser->scope;
+
     parser->node = node;

     return njs_parser_token(parser);
@@ -2120,8 +2120,6 @@ njs_parser_builtin_function(njs_vm_t *vm
         scope = scope->parent;
     }

-    node->scope = parser->scope;
-
     hash = parser->lexer->key_hash;
     name = &parser->lexer->text;

@@ -2141,6 +2139,8 @@ njs_parser_builtin_function(njs_vm_t *vm
         return NJS_TOKEN_ERROR;
     }

+    node->scope = parser->scope;
+
     parser->node = node;

     return njs_parser_token(parser);

Can you fix it?

I can. Just asking. Maybe there is some deeper meaning:)

@hongzhidao Committed. Thanks!

Can you fix it?

I can. Just asking. Maybe there is some deeper meaning:)

@xeioex
Simplified parser builtin function.
https://gist.github.com/hongzhidao/6e4c7e554b30a729bb958b4d9061cf7e

It's better to remain the node scope unchanged after allocation.

@xeioex

Improved parser: seperate parser->scope->node from parser->node.
https://gist.github.com/hongzhidao/4d63792988f701f4f987b4e8f4146ffc

Now, I'm on njs_parser.c, I'm thinking of "Module imports are hoisted".
Ideally, I hope parser->scope->node is always unchanged and introduce njs_parser_statement_add().
This function will push new node related to statement to parser->scope->node.
Thus it's more flexible to change new generated nodes position.
And parser->node is a temporary node recording the result of njs_parser_statement/_chain.

Then I think it's easy to change any node's position, in order to implement 'Module imports are hoisted'.
We need only add parser->node to scope->node. On the other way, it makes the parser clearer.

@hongzhidao

It's better to remain the node scope unchanged after allocation.
On the other way, it makes the parser clearer.

Agree. In general, we need more invariants in the parser code to make it simpler to change and understand. The current code has a lot of ad hoc places.

@hongzhidao

njs_parser_node_add()

the idea is to update parser->scope->node in a single place?

In general, as I said above, I am positive about making the code more regular and structured.

1) can't we replace njs_parser_statement_chain() completely? I do not like the idea of having two similar functions.
2) https://gist.github.com/hongzhidao/4d63792988f701f4f987b4e8f4146ffc#file-parser-04-patch-L43

the invariant of updating parser->scope->node in a single place is not preserved here.
Probably we need a macro like
#define njs_parser_node_set(parser, node) parser->scope->node = node.

So we update parser->scope->node only in njs_parser_node_add() or in njs_parser_node_set().
What do you think?

the invariant of updating parser->scope->node in a single place is not preserved here.

Yes. I think it's not important, I'll explain later.

define njs_parser_node_set(parser, node) parser->scope->node = node.

Agree

can't we replace njs_parser_statement_chain() completely? I do not like the idea of having two similar functions.

How about adding a parameter 'scope' to determine where to put in? (NULL: parser->node; scope: scope)

static njs_token_t
njs_parser_statement_chain(njs_vm_t *vm, njs_parser_t *parser,
    njs_parser_scope_t *scope, njs_token_t token)

njs_parser_node_add()
the idea is to update parser->scope->node in a single place?

Push new node to parser->scope->node.

My thoughts about parser.

  1. njs_parser_statement() is to parse a complete statement, it's clear now. Especially it sets parser->node to NULL.
  2. njs_parser_statement_chain() is a wrapper and it results in a chain of parser->node, I also like it. In other words, we need a function doing this (resulting in a group of parser nodes).
  3. njs_parser_statement_chain() is to result a temporary node stored in parser->node, we need to add the temporary node (parser->node) to the current scope. This is what the patch does. And the parser->scope->node is not invariant, but we can do it by preallocating its value, I think it's unnecessary now.
  4. Because the parser contains recursion, from the result, I hope each statement which begins from start (not in a block) has complete nodes (node or set nodes) stored in parser->node, then the parser->node is added to the current scope.
    What about?
    Deleted.

@hongzhidao

What about?
How about adding a parameter 'scope' to determine where to put in? (NULL: parser->node; scope: scope)

If we just need two different destinations, why not not use njs_parser_scope_t ** argument?

Refactored parser.

It is much better to see the complete patch (not a patch over another patch).

@xeioex

Updated.
https://gist.github.com/hongzhidao/e98debabd547cfde626f70cef78a78f8

If we just need two different destinations, why not not use njs_parser_scope_t ** argument?

I can't get it but see the patch first.

In general, as I said above, I am positive about making the code more regular and structured.

BTW, how about introducing njs_parser_expression_statement? Move all the statements in default to a single statement.

@hongzhidao

https://gist.github.com/hongzhidao/4d63792988f701f4f987b4e8f4146ffc

I looked at it again. I am not sure we need this change. IMHO, it does not simplify the current code much. Actually, it make it a bit more complex.
1) two very similar functions.
2) scope argument, which is applicable only half of the time.

njs_parser_scope_statement() is to result a temporary node stored in parser->node, we need to add the temporary node (parser->node) to the current scope. This is what the patch does.

Why not to reuse njs_parser_statement_chain()? or I am missing something...

#define njs_parser_scope_statement(..)  do { 
    njs_parser_statement_chain(...);
    parser->scope->node = parser->node;
} while(0)

Sorry for the updated patch.
See this.
https://gist.github.com/hongzhidao/e98debabd547cfde626f70cef78a78f8
Well, Maybe it's better to decide until the implementation of 'import' keyword since the import statement needs to be hoisted. But I still think the patch makes parser easier to understand. See the part.

@@ -739,23 +746,23 @@ njs_parser_function_lambda(njs_vm_t *vm,
-    parent->right = parser->node;
-
-    parser->scope->node = parser->node;
+    parent->right = parser->scope->node;

In another way, maybe you would like it.
First, in njs_parser_scope_begin.
scope->node = njs_parser_node_new(...);
Second, in njs_parser_statement_chain.
add parser->node to current scope.
But it seems unnecessary.

In this case.

// main.js
lib.foo();
import * as lib from 'foo.js';
// foo.js
function foo() {
    console.log('lib foo');
}
export {foo}

Should result in: 'lib foo'
But in the current design, it seems we can't add the node generated by import statement to the current scope node.

@hongzhidao

Well, Maybe it's better to decide until the implementation of 'import' keyword since the import statement needs to be hoisted.

Agree.

Other ideas how to 'hoist' import statements:
1) Why not traverse the parser tree after it is ready and move import statements to the beginning?
2) Or, maybe even better, while building the tree, do not add NJS_TOKEN_IMPORT nodes as usual, but in a separate imports list. After the tree is ready, concatenate them.

hi!
also, hidden imports happen in module redirects.

@xeioex

See the demo first.

Result:

'start lib module'
'end of lib module'
'start main script'
'foo foo: 11'
'foo foo: 12'
'foo foo: 13'
'end of main script'

main.js

console.log('start main script');
lib.foo();
lib.foo();
lib.foo();
import * as lib from '/tmp/foo.js';
import * as lib from '/tmp/foo.js';
import * as lib from '/tmp/foo.js';
console.log('end of main script');

lib.js

console.log('start lib module');
var a = 10;
function foo() {
    a++;
    console.log('foo foo: ' + a);
}
console.log('end of lib module');
return {foo: foo}

Patch. (including we just discussed)
https://gist.github.com/hongzhidao/acfe56ef42828e8b7216ee67e6a804e8

Or, maybe even better, while building the tree, do not add NJS_TOKEN_IMPORT nodes as usual, but in a separate imports list. After the tree is ready, concatenate them.

I think NJS_TOKEN_IMPORT is meaningful, it indicates the import statement exists in the parser tree.

Why not traverse the parser tree after it is ready and move import statements to the beginning?
(plus: in a separate imports list. After the tree is ready, concatenate them.)

Since there may be non-import statements before and after import statements, we need to process both them. But compare to the implementation of the patch, I prefer to make the move action simpler. Only relink the leftmost import node.

@xeioex

About new parameter 'scope' of njs_parser_statement_chain.

static njs_token_t njs_parser_statement_chain(njs_vm_t *vm,
    njs_parser_t *parser, njs_parser_scope_t *scope, njs_token_t token);

In main script parser 'njs_parser' and function/module parser 'njs_parser_function_lambda', both them pass the current scope to 'njs_parser_statement_chain'.
In other parser, especially parse block by 'njs_parser_block_statement', it sets NULL.
So as I said the meaning of parser->node is storing the temporary node generated by 'njs_parser_statement', corresponding to the start of 'njs_parser_statement', it sets parser->node NULL.
Then see the implementation of 'njs_parser_statement_chain'.

static njs_token_t
njs_parser_statement_chain(njs_vm_t *vm, njs_parser_t *parser,
    njs_token_t token)
{
    njs_parser_node_t  *node, *last;
    last = parser->node;
    token = njs_parser_statement(vm, parser, token);  // parser->node = NULL in begining of njs_parser_statement
    ...
            node = njs_parser_node_new(vm, parser, NJS_TOKEN_STATEMENT);
            node->left = last;
            node->right = parser->node;
            parser->node = node;
    ...

It shows the intention of parser->node is 'storing the temporary node generated by 'njs_parser_statement and will be added to the current scope'.

@hongzhidao

improvements except for njs_parser_statement_chain():

https://gist.github.com/hongzhidao/acfe56ef42828e8b7216ee67e6a804e8#file-patch-06-patch-L276

index is already allocated in njs_module_add()

https://gist.github.com/hongzhidao/acfe56ef42828e8b7216ee67e6a804e8#file-patch-06-patch-L34

typo? https://github.com/nginx/njs/commit/4f934dc5e7b9208a09f9ef1f03cce1ebbd4d508b

ret = njs_module_open(vm, parser, name, &body)

I think, now, function is best described as njs_module_read().

https://gist.github.com/hongzhidao/acfe56ef42828e8b7216ee67e6a804e8#file-patch-06-patch-L360

Here and below. After https://github.com/nginx/njs/commit/f237ae4b1a23400541c646825bc9d81f9782dbe3 all quotes in exceptions should be (double quotes ")

https://gist.github.com/hongzhidao/acfe56ef42828e8b7216ee67e6a804e8#file-patch-06-patch-L774

typo?

https://gist.github.com/hongzhidao/acfe56ef42828e8b7216ee67e6a804e8#file-patch-06-patch-L839

Here and below. %V specifier should be used.

index is already allocated in njs_module_add()

Done

typo? 4f934dc

The njs_vm_call is called after njs_vm_start() in nginx module.

1. CLI
njs_vm_start() {
   foreach(modules as module) {
      njs_vm_invoke(module.function);
   }
   njs_vm_interpreter();
}
2. NGINX module
njs_vm_start().
njs_vm_call()/njs_vm_invoke()

So I think it's better to call njs_vm_interpreter() rather than njs_vm_start().

I think, now, function is best described as njs_module_read().

Agree.

Here and below. After f237ae4 all quotes in exceptions should be (double quotes ")

I remove all the error informations happened in njs_module_open and process it in the njs_parser_module.

105     njs_parser_syntax_error(vm, parser,
106                             "Cannot find module \"%V\"", name);

https://gist.github.com/hongzhidao/acfe56ef42828e8b7216ee67e6a804e8#file-patch-06-patch-L774
typo?

Yes.

Here and below. %V specifier should be used.

Done

Finally, See the new patch.
https://gist.github.com/hongzhidao/d4fe1c7bc2da0bf3a6464d278f0c690d

My thoughts.

  1. The patch should be split into 3 patches. (related to njs.c, related to njs_parser.c and related to module implementation)
  2. The key point is the change of parser->node meaning. Do you think it makes sense?
  3. The module implementation is based on the second thought.

Or do you think it's better to discuss on the complete patch, including unit tests? Since your suggestion is so helpful, I submit the patch includes design worthy of discussion more frequent.

@hongzhidao

I want to make sure I understand you correctly. Correct me if I am wrong.

njs_parser_statement_chain() is used 4 times:
1)

parser->scope->node:
  in njs_parser 
  in njs_parser_function_lambda 

we use parser->scope->node here because we want to have in parser->scope->node
the list of top level statements?

2)

parser->node
  in njs_parser_block_statement 
  in njs_parser_switch_statement 

We do not use parser->scope->node here because we want to chain parsed statements to switch or block (current) substatements?

If my understanding is right, I agree with the intention of splitting these two cases (if it helps in moving import statements). But, I do not like the resulting API (IMHO, it is confusing).

Here is my suggestion:

typedef njs_dest_chain_t enum
   NJS_CHAIN_CURRENT - parser->node
   NJS_CHAIN_TOP  - parser->scope->node

static njs_token_t njs_parser_statement_chain(njs_vm_t *vm,
    njs_parser_t *parser, njs_token_t token, njs_dest_chain_t chain);


we use parser->scope->node here because we want to have in parser->scope->node
the list of top level statements?

Yes.
Since the parser->scope->node used to invoke generating code.
In njs_vm_compile, corresponding to njs_parser.

    ret = njs_generate_scope(vm, generator, parser->scope);
    if (nxt_slow_path(ret != NXT_OK)) {
        goto fail;
    }

In njs_generate_function_scope, corresponding to njs_parser_function_lambda.

node = node->right;

    ret = njs_generate_scope(vm, generator, node->scope);

    if (nxt_fast_path(ret == NXT_OK)) {

We do not use parser->scope->node here because we want to chain parsed statements to switch or block substatements?

Can you express it simpler? I'm worried I misunderstand you.
We do not use parser->scope->node here because 'njs_parser_block_statement' and 'njs_parser_switch_statement' are substatements, and they'll be concatenated to parser->node, finally the parser->node is added to the current scope.

Here is my suggestion:
typedef njs_dest_chain_t enum
NJS_CHAIN_CURRENT - parser->node
NJS_CHAIN_TOP - parser->scope->node
static njs_token_t njs_parser_statement_chain(njs_vm_t *vm,
njs_parser_t *parser, njs_token_t token, njs_dest_chain_t chain);

Looks good. See this. (a litter different)

  • Deleted.

https://gist.github.com/hongzhidao/d4fe1c7bc2da0bf3a6464d278f0c690d#file-patch-07-patch-L855

%V. %.*s - is not supported by nxt_sprintf().

The patch should be split into 3 patches. (related to njs.c, related to njs_parser.c and related to module implementation)

Agree. I think njs_vm_invoke() patch is ready to be committed.

The key point is the change of parser->node meaning. Do you think it makes sense?

AFAIK, parser->node always was a temporary node. BTW, What do you think of renaming
parser->node -> parser->current. and parser->scope->node -> parser->scope->top. Maybe later.

I submit the patch includes design worthy of discussion more frequent.

This is a good strategy.

including unit tests

I think, some unit tests are already useful at this point. It will speed up the experiments and refactoring.

parser->node always was a temporary node

Yes :)

parser->scope->node -> parser->scope->top

I like it, it's clear.

parser->node -> parser->current

I prefer to parser->node, it looks like a variable with a temporary value.

https://gist.github.com/hongzhidao/46798e5c5376ae177a940097461d5104
njs_parser_statement_chain(..., njs_parser_node_t **dest);

I like it much more!)

#define njs_parser_chain_current(parser)   ...
#define njs_parser_chain_top(parser)     ...

@xeioex

See this.
https://gist.github.com/hongzhidao/4e24d9131d854d4c334ede568256df52

Suggestions are welcome based on the patch.

define njs_parser_chain_current(parser)

It's ok, I'll think of it later.

https://gist.github.com/hongzhidao/46798e5c5376ae177a940097461d5104

At this point, I think, njs_parser_statement_chain patch is almost ready to be committed.

https://gist.github.com/hongzhidao/46798e5c5376ae177a940097461d5104#file-parser-08-patch-L749

Except, I would leave body variable as is (not rename it).

@hongzhidao

When it is ready, please, share the njs_parser_statement_chain patch.

@xeioex

  1. Improved njs_parser_statement_chain().
    https://gist.github.com/hongzhidao/eadb2df55e21af12a4cb78bcd6511990

  2. Are you on this?

    Agree. I think njs_vm_invoke() patch is ready to be committed.

@hongzhidao

Are you on this?
Agree. I think njs_vm_invoke() patch is ready to be committed.

yes. the patch is welcome.

@xeioex

@hongzhidao

Are you on this?
Agree. I think njs_vm_invoke() patch is ready to be committed.

yes. the patch is welcome.

https://gist.github.com/hongzhidao/612d4d71f1740d22bffcd99f1189911e

I think it's better to process until the implementation of module.

@xeioex

It's time to decide the way of file name added in the error information.
The priority is higher than 'module'.

https://gist.github.com/hongzhidao/307616529029564525960fdd48441dfe#file-parser-error-01-patch-L980

./build/njs /tmp/main.js
SyntaxError: Unexpected token "d" in /tmp/error.js:3

main.js

console.log('start main script');
lib.foo();
lib.foo();
lib.foo();
import * as lib from '/tmp/lib.js';
import * as lib from '/tmp/lib.js';
import * as lib from '/tmp/lib.js';
import * as y from '/tmp/empty.js';
import * as e from '/tmp/error.js';
console.log('end of main script');

error.js

var a, b, c;

a = b, c d

@hongzhidao

Here is my version of njs_vm_invoke patch
https://gist.github.com/xeioex/d266dfa24611ece3ab35e5b88881cfa8

NJS_INDEX_GLOBAL_RETVAL - is not publicly available in njs.h

@hongzhidao

Here is my version of njs_vm_invoke patch
https://gist.github.com/xeioex/d266dfa24611ece3ab35e5b88881cfa8

NJS_INDEX_GLOBAL_RETVAL - is not publicly available in njs.h

No problem.

@hongzhidao

./build/njs /tmp/main.js
SyntaxError: Unexpected token "d" in /tmp/error.js:3

I like the format.

https://gist.github.com/hongzhidao/307616529029564525960fdd48441dfe

njs_vm_compile(njs_vm_t vm, nxt_str_t *source, u_char *start, u_char *end)

file name should be provided through njs_vm_opt_t (as well as search pathes).

njs_parser_error()

to get rid of njs_parser_error_t.

njs_parser_error(njs_vm_t *vm, njs_parser_t *parser, njs_value_type_t type, const char *fmt,
    va_list args)
{
...
njs_exception_error_create(vm, type, ...);
}

njs_parser_syntax_error
njs_parser_ref_error

I think, it is better to make njs_parser_error() public. and replace the functions with macros.

define njs_parser_syntax_error() njs_parser_error(vm, parser, type, fmt, ##__VA_ARGS__)

@xeioex

Patch: https://gist.github.com/hongzhidao/9d0499aa20dade21c5020c415b0244c0

  1. Introduced njs_parser_error().
  2. Improved njs shell, it seems better to refactor it and 'module' together.
  1. How about renaming njs_exception_error_create as njs_error_new?

@hongzhidao

How about renaming njs_exception_error_create as njs_error_new?

+1

char *file;

I would make it nxt_str_t

nxt_str_t source;

Maybe it is better to name it file?

@hongzhidao

https://gist.github.com/hongzhidao/ed5fdc878ffa29dc4b0e7a00b0821dd6
https://gist.github.com/hongzhidao/f19fbd112803ac7d1f2246d7299e98ff

Looks good.
why not reorder patches, and use njs_error_new() in njs_parser_error?

@xeioex

Here's the new patch.
https://gist.github.com/hongzhidao/1c6dbfca40a92fd3efbb31605619eefa

why not reorder patches,

Should reorder.

use njs_error_new() in njs_parser_error

Remove duplicated (fmt, args).

@hongzhidao

Maybe also rename njs_exception_error_create -> njs_error_fmt_new?
And buf[256] -> buf[NXT_MAX_ERROR_STR]

@hongzhidao

BTW, we also need filenames in backtraces

cat line.js 
/*
**/(function(){throw Error();})()
./build/njs line.js 
Error
    at anonymous (:2)
    at main (native)

->

./build/njs line.js 
Error
    at anonymous (line.js:2)
    at main (line.js:2)

Get it.

@xeioex

Patches: reporting filename in generator and runtime errors.
https://gist.github.com/hongzhidao/58d7962d7f7d787b7e7a096c8c4c4fb6

About reporting errors at runtime phase. See the example first.

cat line.js 
/* */
function foo() {
    /* */
    (function() {
    /* */
        throw Error();
    })()
}
/*
 **/
foo();
./build/njs line.js 
Error
    at anonymous (:4)
    at foo (:2)
    at main (native)

Now, In Njs, it doesn't report what line of errors occur, it only reports what function triggers errors.
It needs more work if we want to report what line of errors, such as console.a = 'bang!';

These patches above only resolve reporting filename based on original function.

And there are two confusing places.

  1. In the mode of interactive shell: at anonymous (:4) vs at anonymous (4)
    Do we need to show ':'?
  2. For 'at main (native)', I don't think the main script is a native function.

One more optional patches, but I'm not sure it's ok.
https://gist.github.com/hongzhidao/e5c02cc16daa1c3e0c1b174b63a4f035

  1. Make lexer be a stack variable, parser module will be the same.
  2. Remove the duplicated field 'token_line', but I know it'll ask us to use lexer->line after doing 'next token' as soon as possible.

@hongzhidao

Thank you for your valuable contribution!

Now, In Njs, it doesn't report what line of errors occur, it only reports what function triggers errors.
It needs more work if we want to report what line of errors, such as console.a = 'bang!';

It is a known problem, but not for now, I think.

  1. In the mode of interactive shell: at anonymous (:4) vs at anonymous (4)
    Do we need to show ':'?

For the sake of regularity, I would report it as
at anonymous (shell:4)

2. For 'at main (native)', I don't think the main script is a native function.

Yes. This should be fixed, but it up to you, to fix it or not now.

The original point was, once we have filenames, we should report them in every place we already report line numbers.

https://gist.github.com/hongzhidao/58d7962d7f7d787b7e7a096c8c4c4fb6#file-generator-and-runtime-01-patch-L203

once we have NXT_INT32_T_LEN, I would replace magic 10

https://gist.github.com/hongzhidao/58d7962d7f7d787b7e7a096c8c4c4fb6#file-generator-and-runtime-01-patch-L249

I would replace sprintf with nxt_sprintf()

For the sake of regularity, I would report it as
at anonymous (shell:4)

Agree.

Yes. This should be fixed, but it up to you, to fix it or not now.

To be later.

once we have NXT_INT32_T_LEN, I would replace magic 10
I would replace sprintf with nxt_sprintf()

OK

@xeioex

Fixed njs_builtin_match_native_function().
https://gist.github.com/hongzhidao/ee6b5966118bbaa34de903b9892fccad

In njs_builtin_match_native_function.

len = obj->name.length + string.length + sizeof(".");
vs
len = obj->name.length + string.length + sizeof(".") - 1;

It works well if we use sprintf such as:

p += sprintf((char *) p, "    at %.*s (:%u)\n",

Since the return value of sprintf doesn't include '\0'.

Return value
       Upon successful return, these functions return the number of characters printed (not including the trailing '\0' used to end output  to
       strings).

But it doesn't work if we use nxt_sprintf.

So

Patches: reporting filename in generator and runtime errors.
https://gist.github.com/hongzhidao/58d7962d7f7d787b7e7a096c8c4c4fb6

I'll rewrite the code based on the fixed patch. Please confirm the fixed patch first.

Patches (2): reporting filename in generator and runtime errors.
https://gist.github.com/hongzhidao/755d9a991ef7e0bd12fdc246354aaf20

@hongzhidao

Fixed njs_builtin_match_native_function().
https://gist.github.com/hongzhidao/ee6b5966118bbaa34de903b9892fccad

In general, I like the patch. It needs some improvements, though.

Please, review my version of it:
https://gist.github.com/xeioex/9c879c05d303885a14f65362448b56ae

1) nxt_str_set() can be avoided.
2) njs_external_match() also needs to be fixed.

@xeioex No problem.

@hongzhidao

https://gist.github.com/hongzhidao/755d9a991ef7e0bd12fdc246354aaf20#file-generator-and-runtime-02-patch-L223

I think, it is better not to hardcode it here. Instead, interactive shell (njs_shell.c, njs_interactive_test.c) should set
vm_options->file to "shell".

So, we can expect vm_options->file to be always valid, but maybe empty (by default).

@xeioex

  1. updated
    https://gist.github.com/hongzhidao/755d9a991ef7e0bd12fdc246354aaf20

  2. Now, I keep the vm_options->file of shell/interactive empty.

    Instead, interactive shell (njs_shell.c, njs_interactive_test.c) should set
    vm_options->file to "shell".
    So, we can expect vm_options->file to be always valid, but maybe empty (by default).

diff -r 572451d1c711 njs/njs_shell.c
--- a/njs/njs_shell.c   Tue Feb 05 01:07:45 2019 +0800
+++ b/njs/njs_shell.c   Tue Feb 05 01:12:52 2019 +0800
@@ -217,6 +217,9 @@ main(int argc, char **argv)
     if (opts.file != NULL) {
         vm_options.file.start = (u_char *) opts.file;
         vm_options.file.length = strlen(opts.file);
+
+    } else {
+        vm_options.file = nxt_string_value("shell");
     }

     vm_options.init = !opts.interactive;
diff -r 572451d1c711 njs/test/njs_interactive_test.c
--- a/njs/test/njs_interactive_test.c   Tue Feb 05 01:07:45 2019 +0800
+++ b/njs/test/njs_interactive_test.c   Tue Feb 05 01:12:52 2019 +0800
@@ -281,6 +281,7 @@ njs_interactive_test(nxt_bool_t verbose)

         options.accumulative = 1;
         options.backtrace = 1;
+        options.file =  nxt_string_value("shell");

         vm = njs_vm_create(&options);
         if (vm == NULL) {
  1. In shell, it looks good.
>> 'str'.replace(/t/g, function(m) {return m.a.a})
TypeError: cannot get property "a" of undefined
    at anonymous (shell:1)
    at String.prototype.replace (native)
    at main (native)
  1. In interactive, it looks kind of unclear.
njs_interactive("function f() { return b;
") failed: "SyntaxError: Unexpected end of input in 1" vs "SyntaxError: Unexpected end of input in shell:1"
  1. In interactive, it looks kind of unclear.

I think, we can leave interactive as is (without a file).

SyntaxError: Unexpected end of input in :1

OK

  1. Help review the patch and append the diff of njs_shell.c
    https://gist.github.com/hongzhidao/755d9a991ef7e0bd12fdc246354aaf20

  2. What about this?

    https://gist.github.com/hongzhidao/e5c02cc16daa1c3e0c1b174b63a4f035
    Make lexer be a stack variable, parser module will be the same.
    Remove the duplicated field 'token_line', but I know it'll ask us to use lexer->line after doing 'next token' as soon as possible.

@hongzhidao

https://github.com/nginx/njs/commit/debed7898ce4205b16a39d213c1189a87fc683c0#diff-580361d98052db6f401a368522c6226fR2660

It seems to me,

if (p > end - width) {
        p = end - width;
    }

is not needed here. nxt_vsprintf() handles it (https://github.com/nginx/njs/blob/master/nxt/nxt_sprintf.c#L112).

njs_generate_syntax_error

I think, it is a good idea to make it similar to njs_parser_error().
Improved version: https://gist.github.com/xeioex/f3590c70011e59fcf2ed3fcc9f3da0ca

is not needed here. nxt_vsprintf() handles it

I prefer to reverse it. In the case.
njs_parser_error(..., "2048 length string");
We should ensure this part '" in %V:%uD", &lexer->file, lexer->line' be reserved.

I think, it is a good idea to make it similar to njs_parser_error().

Agree, except the same thing in njs_parser_error.

We should ensure this part '" in %V:%uD", &lexer->file, lexer->line' be reserved.

Oh, I see. Agree.

@hongzhidao

BTW,

I am looking to node.js backtrace format

> ({Boolean(1):1})
SyntaxError: Unexpected number
    at Object.exports.createScript (vm.js:24:10)
    at REPLServer.defaultEval (repl.js:225:25)
    at bound (domain.js:287:14)
    at REPLServer.runBound [as eval] (domain.js:300:12)
    at REPLServer.<anonymous> (repl.js:417:12)
    at emitOne (events.js:82:20)
    at REPLServer.emit (events.js:169:7)
    at REPLServer.Interface._onLine (readline.js:210:10)
    at REPLServer.Interface._line (readline.js:549:8)
    at REPLServer.Interface._ttyWrite (readline.js:826:14)

It seems to me, we should only include file name only, not full path (because it can be quite long).

Good point, I think we can consider it with 'module', we also need to process the absolute/relative path and filename.

@xeioex

Maybe we can make file name shorter by the simple way.

diff -r 4dfc639287f0 njs/njs_shell.c
--- a/njs/njs_shell.c   Tue Feb 05 01:07:45 2019 +0800
+++ b/njs/njs_shell.c   Tue Feb 05 04:02:13 2019 +0800
@@ -70,6 +70,7 @@ static nxt_int_t njs_interactive_shell(n
 static nxt_int_t njs_process_file(njs_opts_t *opts, njs_vm_opt_t *vm_options);
 static nxt_int_t njs_process_script(njs_console_t *console, njs_opts_t *opts,
     const nxt_str_t *script);
+static void njs_file_name(nxt_str_t *dst, char *src);
 static nxt_int_t njs_editline_init(void);
 static char **njs_completion_handler(const char *text, int start, int end);
 static char *njs_completion_generator(const char *text, int state);
@@ -215,8 +216,7 @@ main(int argc, char **argv)
     nxt_memzero(&vm_options, sizeof(njs_vm_opt_t));

     if (opts.file != NULL) {
-        vm_options.file.start = (u_char *) opts.file;
-        vm_options.file.length = strlen(opts.file);
+        njs_file_name(&vm_options.file, opts.file);
     }

     vm_options.init = !opts.interactive;
@@ -640,6 +640,26 @@ njs_process_script(njs_console_t *consol
 }


+static void
+njs_file_name(nxt_str_t *dst, char *src)
+{
+    char  *p;
+    size_t  length;
+
+    length = strlen(src);
+
+    for (p = src + length; p >= src; p--) {
+        if (*p == '/') {
+            p++;
+            break;
+        }
+    }
+
+    dst->start = (u_char *) p;
+    dst->length = length - (p - src);
+}
+
+
 static nxt_int_t
 njs_editline_init(void)
 {
@@ -1071,4 +1091,3 @@ lvlhsh_pool_free(void *pool, void *p, si
 {
     nxt_mp_free(pool, p);
 }
-

@hongzhidao

Agree. Maybe it is even better
1) njs_file_name -> nxt_file_name
2) nxt_file.c (and other file primitives can live here)

@xeioex

@hongzhidao

Agree. Maybe it is even better

  1. njs_file_name -> nxt_file_name
  2. nxt_file.c (and other file primitives can live here)

Improved shell.
https://gist.github.com/hongzhidao/db056082668c3caff25ed2d6b6354871

Now, It's ready to go on 'module' implementation.
Compare to the previous design, I want to use 'export default' instead of 'export'.
See the demo first.

// main.js
import lib from '/foo.js';
lib.foo();
lib.foo();
lib.foo();
// lib.js
function foo() {
    console.log('lib foo');
}
export default {foo}



md5-b67ab6aceaf9e56011afd8d271dc5ddf



'lib foo'
'lib foo'
'lib foo'

Some rules:

  1. We suggest using 'namespace' as the mechanism njs module.
  2. We should follow the standard spec of js, but choice the simple way.
  3. With import, we use the format. import {libname} from "libpath"
  4. With export, we use the format. export default {...}

Some notes on 'export default'.

  1. When we use the format import x from y, the keyword 'export' should be required in modules. So we can think the export statement should be required in modules.
  2. It's not allowed that there are two more export statements in a module.
  3. We limit the value type of export to object. export default {object}
  4. Thus we can regard 'export' as 'return'. (very similar)
  5. Contrary to hoist of 'import'. 'export' will be moved to bottom-most.

@hongzhidao

Some rules:

  1. We suggest using 'namespace' as the mechanism njs module.
  2. We should follow the standard spec of js, but choice the simple way.
  3. With import, we use the format. import {libname} from "libpath"
  4. With export, we use the format. export default {...}

Some notes on 'export default'.

  1. When we use the format import x from y, the keyword 'export' should be required in modules. So we can think the export statement should be required in modules.
  2. It's not allowed that there are two more export statements in a module.
  3. We limit the value type of export to object. export default {object}
  4. Thus we can regard 'export' as 'return'. (very similar)
  5. Contrary to hoist of 'import'. 'export' will be moved to bottom-most.

Agree. import x from y looks much better.

@hongzhidao

One more requirement, that seems important
1) Ability to import other modules from a module (see 16.8.4).

hi @xeioex @hongzhidao
maybe is better to fix results as #90
that will provide a test base to continue

@xeioex

@hongzhidao

One more requirement, that seems important

  1. Ability to import other modules from a module (see 16.8.4).

Yep, modules can't access global variables.

@drsm

hi @xeioex @hongzhidao
maybe is better to fix results as #90
that will provide a test base to continue

Can you add tests to be referenced, it'll be helpful.

@xeioex

./build/njs /tmp/main.js
'lib1 start'
'lib1 end'
'lib3 start'
'lib3 end'
'lib2 start'
'lib3 crypto:b86fc6b051f63d73de262d4c34e3a0a9'
'lib2 end'
'main start'
'main crypto:b86fc6b051f63d73de262d4c34e3a0a9'
'lib1 crypto:b86fc6b051f63d73de262d4c34e3a0a9'
'lib2 crypto:b86fc6b051f63d73de262d4c34e3a0a9'
'main end'
cat /tmp/main.js
console.log('main start');

import lib1 from '/tmp/lib1.js';
import lib2 from '/tmp/lib2.js';

import crypto from 'crypto';
var h = crypto.createHash('md5');
var s = h.update('AB').digest('hex')
console.log('main crypto:' + s);

lib1.test();
lib2.test();

console.log('main end');



md5-dbb88504e429401cc83ba8c6db5ad701



cat /tmp/lib1.js
export default {test: test}

console.log('lib1 start')

function test() {
    var h = crypto.createHash('md5');
    var s = h.update('AB').digest('hex')
    console.log('lib1 crypto:' + s);
}

import crypto from 'crypto';

console.log('lib1 end')



md5-dbb88504e429401cc83ba8c6db5ad701



cat /tmp/lib2.js
console.log('lib2 start');

function test() {
    var h = crypto.createHash('md5');
    var s = h.update('AB').digest('hex')
    console.log('lib2 crypto:' + s);
}

import crypto from 'crypto';
import lib3 from '/tmp/lib3.js';

lib3.test('AB');

console.log('lib2 end');

export default {test: test};



md5-dbb88504e429401cc83ba8c6db5ad701



cat /tmp/lib3.js
console.log('lib3 start')

import crypto from 'crypto';

function test(s) {
    var h = crypto.createHash('md5');
    var s = h.update(s).digest('hex')
    console.log('lib3 crypto:' + s);
}

console.log('lib3 end')

export default {test: test}

Here's the patch.
https://gist.github.com/hongzhidao/c35406a7156f7aa6c9db25efcc93202e

Rules
With import:

  1. Use the format import x from y, including native module. (done)
  2. Import is only allowed in global and module scope. (done)

With export:

  1. Use the format. export default {...} (done)
  2. When we use the format import x from y, the keyword 'export' should be required in modules. So we can think the export statement should be required in modules. (done)
  3. Limit the value type of export to object. export default {object} (done)
  4. Regard 'export' as 'return'. (very similar) (done)
  5. It's not allowed that there are two more export statements in a module. (todo)
  6. Contrary to hoist of 'import'. 'export' will be moved to bottom-most. (todo)
  7. module path search. (todo)

Unit tests: (todo)

Welcome to add.

@hongzhidao

Great work!

I am going to look into it in details today.

Unit tests: (todo)

will do.

+#define njs_vm_call(vm, function, args, nargs)

  • njs_vm_invoke(vm, function, args, nargs, NJS_INDEX_GLOBAL_RETVAL)

will not compile with nginx modules. NJS_INDEX_GLOBAL_RETVAL -is not available in njs.h

2. Import is only allowed in global and module scope. (done)

what about importing non-native modules from a module?

@xeioex

what about importing non-native modules from a module?

It supports now, see the demo /tmp/lib2.js.

will not compile with nginx modules. NJS_INDEX_GLOBAL_RETVAL -is not available in njs.h

My mistake.

I am going to look into it in details today.

I'm on 'export statement', the patch maybe updated later.
I'll introduce two functions: njs_parser_import_hoist(), njs_parser_export_sink().

@xeioex, updated.
https://gist.github.com/hongzhidao/c35406a7156f7aa6c9db25efcc93202e
The function njs_parser_export_sink() is still not concise.

@hongzhidao

https://gist.github.com/hongzhidao/c35406a7156f7aa6c9db25efcc93202e

1) modules_list -> modules

    nxt_array_t              *modules; 
    nxt_lvlhsh_t             modules_hash;

2) njs_module_invoke() -> njs_module_start() or njs_module_load()

3) https://gist.github.com/hongzhidao/c35406a7156f7aa6c9db25efcc93202e#file-njs-module-01-patch-L326

free() -> nxt_mp_free()

4)

can be moved inside njs_module.c. Probably, we can get rid of them completely.

#define NJS_MODULE_BODY_START   "function() {"                                                                                   
#define NJS_MODULE_BODY_END     "}" 

5) njs_module_read

fd = open(file, O_RDONLY);
if (fd == -1) {
return NXT_ERROR;
}

exception is needed here, including strerror() see njs_fs_read_file().

A bunch of small fixes and improvements (including 1-4)
https://gist.github.com/xeioex/11395b51a23b4c8344a86bfcb4f4110d

@xeioex

Updated patch (still including 1-4).
https://gist.github.com/hongzhidao/17867cd5e84fdaad2d50140be94c4d92

I'm not sure the definite way of importing native/non-native modules yet.

  1. How do we deal with 'require()'? Retain or Delete?
  2. Does import keyword support native module?
    If yes, we need a way to difference native and non-native, such as by the name.
import fs from 'fs'; // thow 'Cannot find module...' if not found.
import foo from './foo.js'; // throw 'strerror()...' if open/fstat failed.

function is_native_module(name) {
    if (name[0] == '/' or name[0] == '.') {
       return false;
    }
    return true;
}

BTW, is it an issue?
There are three files left after running 'expect' in '.../njs' director.

diff -r eb099f4faf40 njs/test/njs_expect_test.exp
--- a/njs/test/njs_expect_test.exp  Mon Feb 11 18:15:43 2019 +0300
+++ b/njs/test/njs_expect_test.exp  Sun Feb 10 18:59:56 2019 +0800
@@ -613,3 +613,8 @@ njs_test {
     {"var crypto = require('crypto')\r\n"
      "undefined\r\n"}
 } "-s"
+
+# cleanup
+exec rm -fr njs_test_file
+exec rm -fr njs_test_file2
+exec rm -fr njs_wo_file

@hongzhidao

  1. How do we deal with 'require()'? Retain or Delete?

I think we have to retain it for a while for backward compatibility. Will be removed sometimes in the future.

2. Does import keyword support native module?

module specifiers

module specifiers, strings that are either:
Relative paths ('../model/user'): these paths are interpreted relatively to the location of the importing module. The file extension .js can usually be omitted.
Absolute paths ('/lib/js/helpers'): point directly to the file of the module to be imported.
Names ('util'): What modules names refer to has to be configured.

So, I think we can regard built-in modules as names.

BTW, is it an issue? There are three files left after running 'expect' in '.../njs' director

Yes. This can be improved. Once we have paths, we can make temporary directories, and store all the files there.

@xeioex
Here's the patch with simple tests.
https://gist.github.com/hongzhidao/0be1692c6b10650ae69df74bf5dec3f3

module specifiers

module specifiers, strings that are either:
Relative paths ('../model/user'): these paths are interpreted relatively to the location of the importing module. The file extension .js can usually be omitted.
Absolute paths ('/lib/js/helpers'): point directly to the file of the module to be imported.
Names ('util'): What modules names refer to has to be configured.

So, I think we can regard built-in modules as names.

Good idea. So is what I showed right?

import fs from 'fs'; // thow 'Cannot find module...' if not found.
import foo from './foo.js'; // throw 'strerror()...' if open/fstat failed.

function is_native_module(name) {
    if (name[0] == '/' or name[0] == '.') {
       return false;
    }
    return true;
}

Yes. This can be improved. Once we have paths, we can make temporary directories, and store all the files there.

Can you support testing the format 'njs file' in njs_expect_test.exp? And I can't solve how to match multiline output. Need help!

@hongzhidao

proc njs_write {file data}

I think, we can simply commit test njs files because they are immutable. into njs/test/

@hongzhidao

proc njs_write {file data}

I think, we can simply commit test njs files because they are immutable. into njs/test/

Agree.

@xeioex

With module name.

  1. What about processing it like this?
import fs from 'fs'; // thow 'Cannot find module...' if not found.
import foo from './foo.js'; // throw 'strerror()...' if open/fstat failed.

function is_native_module(name) {
    if (name[0] == '/' or name[0] == '.') {
       return false;
    }
    return true;
}
  1. What about making the file extension .js is mandatory for readability?
    > Relative paths ('../model/user'): these paths are interpreted relatively to the location of the importing module. The file extension .js can usually be omitted.
    Absolute paths ('/lib/js/helpers'): point directly to the file of the module to be imported.

It doesn't show whether the file extension can be omitted in absolute paths.

import fs from 'fs'; // thow 'Cannot find module...' if not found.
import foo from './foo.js'; // throw 'strerror()...' if open/fstat failed.

Agree.

function is_native_module(name) {

I am not sure we need special handling here. At the moment we load modules, built-in modules have to be in place.

  1. What about making the file extension .js is mandatory for readability?

not sure. While searching for modules, we can simply state that we try all combinations of path_prefix and a provided file name.

'/path1/' + 'my.js'
'/path1/modules/' + 'my.js'
...

OK, maybe we can discuss based the next patch.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

drsm picture drsm  路  4Comments

reyou picture reyou  路  5Comments

xbb123 picture xbb123  路  4Comments

drsm picture drsm  路  3Comments

drsm picture drsm  路  5Comments