Njs: arrow function support.

Created on 25 Feb 2019  Â·  178Comments  Â·  Source: nginx/njs

(arg1,arg2..) => {<function body>}

(according to, is the most popular feature of ES6).

https://hacks.mozilla.org/2015/06/es6-in-depth-arrow-functions/
Differences from function:

1.
Identifier => Expression
skipping typing function and return, as well as some parentheses, braces, and a semicolon.
return is implied.

    2.
// ES6
$("#confetti-btn").click(event => {
  playTrumpet();
  fireConfettiCannon();
});

Note that an arrow function with a block body does not automatically return a value. Use a return statement for that.
3.
// create a new empty object for each puppy to play with

var chewToys = puppies.map(puppy => {});   // BUG!
var chewToys = puppies.map(puppy => ({})); // ok

Unfortunately, an empty object {} and an empty block {} look exactly the same. The rule in ES6 is that { immediately following an arrow is always treated as the start of a block, never the start of an object. The code puppy => {} is therefore silently interpreted as an arrow function that does nothing and returns undefined.

  1. There is one subtle difference in behavior between ordinary function functions and arrow functions. Arrow functions do not have their own this value. The value of this inside an arrow function is always inherited from the enclosing scope.

  2. 5.

There’s one more minor difference between arrow and non-arrow functions: arrow functions don’t get their own arguments object, either. Of course, in ES6, you’d probably rather use a rest parameter or default value anyway.

Grammar:

ArrowFunction :
    ArrowParameters [no LineTerminator here] => ConciseBody
ArrowParameters :
    BindingIdentifier
    CoverParenthesizedExpressionAndArrowParameterList
ConciseBody :
    [lookahead ≠ { ] AssignmentExpression
    { FunctionBody }
ES6 feature help wanted

Most helpful comment

Looks good, great work.

All 178 comments

@xeioex I try it.

@xeioex @drsm

Fixed parsing of regexp literals.
https://gist.github.com/hongzhidao/b7c1a0671d230ac0680130f6f2b1b950

@xeioex

My thoughts on implementing arrow function:

  1. We need to introduce LL(k) instead of LL(1) which is using now. This feature is like the following format.
// same function header
void test(...);    // declaration.
void test(...) {}  // define.
  1. Actually, there is simply grammar text in njs. njs_lexer_peek_token can be simplified by LL(k).
label:|name
  1. So LL(k) is the main idea of implementing arrow function.

Here's the draft of LL(k), it's not clear now.
https://gist.github.com/hongzhidao/99abf7045a8b5164f76a96709eb737b5

Note:

  1. Lookahead tokens are always existed, njs_lexer_next_token is applied with last token(njs_lexer_token_t).
  2. All the fields related to njs_lexer_t such as text, number, token_line, etc used in njs_parser.c are associated to the free token(njs_lexer_token_t).
  3. Unit tests passed.
  4. arrow function has not been implemented yet, LL(k) should be introduced first.

Better naming and suggestions are welcome.

@hongzhidao

https://gist.github.com/hongzhidao/b7c1a0671d230ac0680130f6f2b1b950

It seems to me the patch is wrong:

Grammar: https://www.ecma-international.org/ecma-262/6.0/#sec-lexical-and-regexp-grammars

RegularExpressionLiteral ::
/ RegularExpressionBody / RegularExpressionFlags

RegularExpressionBody ::
RegularExpressionFirstChar RegularExpressionChars

RegularExpressionChars ::
[empty]
RegularExpressionChars RegularExpressionChar

RegularExpressionFirstChar ::
RegularExpressionNonTerminator but not one of * or \ or / or [
RegularExpressionBackslashSequence
RegularExpressionClass

RegularExpressionChar ::
RegularExpressionNonTerminator but not one of \ or / or [
RegularExpressionBackslashSequence
RegularExpressionClass

RegularExpressionBackslashSequence ::
\ RegularExpressionNonTerminator

RegularExpressionNonTerminator ::
SourceCharacter but not LineTerminator

where

LineTerminator ::
<LF>
<CR>

So LineTerminator is specifically forbidden after a backslash (unlike string literals).

@hongzhidao

My thoughts on implementing arrow function:
We need to introduce LL(k) instead of LL(1) which is using now

I am not sure it can help us. What k should we take in general?

param1, param2, …, paramN) => { statements } 
(param1, param2, …, paramN) => expression
// equivalent to: => { return expression; }

the number of tokens before => can be arbitrary long.

Instead, I suggest to start from the grammar:

AssignmentExpression :
    ConditionalExpression
    YieldExpression
    **ArrowFunction**
    LeftHandSideExpression = AssignmentExpression
    LeftHandSideExpression AssignmentOperator AssignmentExpression

Take a look at njs_parser_assignment_expression() (it was written for ES5.1):

AssignmentExpression :
    ConditionalExpression
    LeftHandSideExpression = AssignmentExpression
    LeftHandSideExpression AssignmentOperator AssignmentExpression

I would start here ^^.

In case it is not enough, I would think how we can improve parser and lexer to do the following:

1) store current parser and lexer state
2) try grammar X if it fails restore state from 1), otherwise return X
3) try grammar Y if it fails restore state from 1), otherwise return Y
...

Specifically for arrow functions, in pseudocode:

njs_parser_assignment_expression():

    token = njs_parser_conditional_expression(vm, parser, token);                                                                
        if (nxt_slow_path(token <= NJS_TOKEN_ILLEGAL)) {                                                                             
            return token;                                                                                                            
        }

    // try ArrowFunction grammar

    if (token == '(' || token == NAME) {
        store = njs_parser_store(vm, parser);
        token = njs_parser_arrow_expression(vm, parser, token); 
       if (token > NJS_TOKEN_ILLEGAL) {
           return token;
       }

      njs_parser_rollback(vm, parser, store);
    }   

    for ( ;; ) {                                                                                                                 
        switch (token) {
           ...

@xeioex

So LineTerminator is specifically forbidden after a backslash (unlike string literals).

Can you add unit tests first?

I am not sure it can help us. What k should we take in general?

In case it is not enough, I would think how we can improve parser and lexer to do the following:
store current parser and lexer state
try grammar X if it fails restore state from 1), otherwise return X
try grammar Y if it fails restore state from 1), otherwise return Y

I think the both ways can archive it, thanks, I'll try them including for the LL(k) way.
For LL(k), its advantage is without side effect.

njs_parser_assignment_expression():

    token = njs_parser_conditional_expression(vm, parser, token);                                                                
        if (nxt_slow_path(token <= NJS_TOKEN_ILLEGAL)) {                                                                             
            return token;                                                                                                            
        }

    // try ArrowFunction grammar

    if (token == '(' || token == NAME) {
        lookahead(...); // many times
        if (match(arrow_function_gramma)) {
            token = njs_parser_arrow_expression(vm, parser, token); // In njs_parser_arrow_expression, we do the same thing as past, njs_parser_token().
            if (token > NJS_TOKEN_ILLEGAL) {
                return token;
           }
       }
    }   

    for ( ;; ) {                                                                                                                 
        switch (token) {
           ...

Anyway, it's better to decide after the complete patch.

@hongzhidao

Can you add unit tests first?

nodejs:

> eval("/a\n/")
SyntaxError: Invalid regular expression: missing /
> eval("/a\r/")
SyntaxError: Invalid regular expression: missing /
> eval("/a\q/")
/aq/
> eval("/a\\n/")
/a\n/
> eval("/a\\r/")
/a\r/
> eval("/a\\q/")
/a\q/
> eval("/a\\\n/")
SyntaxError: Invalid regular expression: missing /
> eval("/a\\\r/")
SyntaxError: Invalid regular expression: missing /
> eval("/a\\\q/")
/a\q/

@xieoex

cat /tmp/test.js && ./build/njs /tmp/test.js
/abc

/
cat /tmp/test.js && node /tmp/test.js
/abc

/

/tmp/test.js:1
(function (exports, require, module, __filename, __dirname) { /abc
                                                              ^
SyntaxError: Invalid regular expression: missing /
    at Module._compile (module.js:439:25)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Function.Module.runMain (module.js:497:10)
    at startup (node.js:119:16)
    at node.js:945:3

@hongzhidao

Yes, the problem with parsing regexp literals exists. My point is: parsing of regexp literals is not the same as string literals with / delimeters.

Compare:

"blabla\
bla" // valid string "blablabla"

// invalid regexp
/blabla\
bla/   

@hongzhidao

I think the both ways can archive it
For LL(k), its advantage is without side effect.

Agree. Probably you are right, because njs_parser_rollback() is not easy to do. So peeking of arbitrary number of tokens is a better way.

@xeioex Help fix the regexp issue, I'm on LL(k).

@xeioex

Style.

# HG changeset patch
# User hongzhidao <[email protected]>
# Date 1551745974 -28800
# Node ID 1096977570942e418ee93e500166a953421ee19b
# Parent  195158f4a6a791bae680bd26842494a174b8940a
Style.

diff -r 195158f4a6a7 -r 109697757094 njs/njs_lexer.c
--- a/njs/njs_lexer.c   Thu Feb 28 20:34:47 2019 +0300
+++ b/njs/njs_lexer.c   Tue Mar 05 08:32:54 2019 +0800
@@ -244,7 +244,7 @@ static const njs_lexer_multi_t  njs_less
 };


-static const njs_lexer_multi_t  njs_less_equal_token[] = {
+static const njs_lexer_multi_t  njs_strict_equal_token[] = {
     { '=', NJS_TOKEN_STRICT_EQUAL, 0, NULL },
 };

@@ -268,7 +268,7 @@ static const njs_lexer_multi_t  njs_grea


 static const njs_lexer_multi_t  njs_assignment_token[] = {
-    { '=', NJS_TOKEN_EQUAL, 1, njs_less_equal_token },
+    { '=', NJS_TOKEN_EQUAL, 1, njs_strict_equal_token },
 };

@xeioex
It seems clear that using LL(k) implement arrow function, some details are still in process.
The naming I'm not satisfied yet. Need suggestions.

For LL(k), I'll support a few interfaces, including njs_lexer_token().

/* behavior as past.  */
njs_token_t njs_lexer_token();  

/* considering to njs_lexer_token(). */
void njs_lexer_rollback(); 

/*
 * used in name + label statement.
 * I'll use it instead of njs_lexer_peek_token().
 */
nxt_int_t  njs_lexer_lookahead(); 

/* 
 * considering to njs_lexer_lookahead()
 * in name + label, we need to back after njs_lexer_lookahead().
 */
void njs_lexer_xxxback();

/* 
 * peek token 
 * what about renaming as njs_lexer_peek() and introduced njs_parser_peek()?
 */
njs_token_t njs_lexer_peek_token();

for example:
1 + 2 + 3 + 4
njs_lexer_lookahead();  // in buffers: 1
njs_lexer_lookahead();  // in buffers: 1 +
njs_lexer_lookahead();  // in buffers: 1 + 2
njs_lexer_lookahead();  // in buffers: 1 + 2 +
token = njs_lexer_token(); // 1
token = njs_lexer_token(); // +
token = njs_lexer_peek_token(); // 2 (gets from buffers)
token = njs_lexer_token(); // 2
token = njs_lexer_peek_token(); // + (gets from buffers)
token = njs_lexer_peek_token(); // 3 (need lookahead)
token = njs_lexer_token(); // 3

So there would be two fields record the position of njs_lexer_token and njs_lexer_peek_token().
lexer->free: njs_lexer_token().
lexer->k: based on lexer->free.

Another important structure is njs_lexer_token_t.

typedef struct {
    njs_token_t                     token:16;
    uint32_t                        token_line;
    nxt_str_t                       text;
    double                          number;
    uint32_t                        key_hash;
} njs_lexer_token_t;


typedef struct {
    size_t                          free;  // better naming?
    size_t                          k;   // better naming?
    nxt_array_t                     *tokens;
     ...
} njs_lexer_t;

I considered to rename njs_token_t as njs_token_type_t, and rename njs_lexer_token_t as njs_token_t.
But it'll have a big change in njs_parser.c since we use njs_token_t in many places.

@hongzhidao

for example:
1 + 2 + 3 + 4
njs_lexer_lookahead();  // in buffers: 1
njs_lexer_lookahead();  // in buffers: 1 +
njs_lexer_lookahead();  // in buffers: 1 + 2
njs_lexer_lookahead();  // in buffers: 1 + 2 +
token = njs_lexer_token(); // 1
token = njs_lexer_token(); // +
token = njs_lexer_peek_token(); // 2 (gets from buffers)
token = njs_lexer_token(); // 2
token = njs_lexer_peek_token(); // + (gets from buffers)
token = njs_lexer_peek_token(); // 3 (need lookahead)
token = njs_lexer_token(); // 3

In general, I like the idea. But, I think we can improve (simplify) lexer API a bit.

njs_lexer_token()
1) looks at preread buffer first, if it is not empty, remove the first token (preread[0]) from it and returns the token. (It seems, we need a queue here, not an array)
2) otherwise the current behavior.

njs_lexer_peek_token(njs_lexer_t *lexer, size_t offset)
1) looks at preread buffer first, if preread[offset] exists,
return preread[offset];
2) otherwise reads tokens ahead, and stores intermediate tokens in the preread buffer.
return preread[offset];

usage example:

njs_lexer_match(lexer, TARGET)

offset = 0;
while (1) {
    token = njs_lexer_peek_token(lexer, offset);
    if (token <= NJS_TOKEN_ILLEGAL) {
       return -1;
    }

   if (token == TARGET) {
      return offset;
   }
    offset++;
}
// try ArrowFunction grammar

pos = njs_lexer_match(lexer, ')');
if (pos > 0 && njs_lexer_peek_token(lexer, pos + 1) == '=>') {
    token = njs_parser_arrow_expression(vm, parser, token);
   ....
}

The main idea:
1) to avoid tracking positions in lexer (size_t k, size_t free).
2) to avoid rollback functions.

@xeioex

arrow function supports.
https://gist.github.com/hongzhidao/31e9fc344968b879c59882d93d79a721

It's not finished yet, but it's ready to discuss.

For the format:

(((a) => {}))
  1. I think it's better to put njs_parser_arrow_expression in njs_parser_terminate().

    AssignmentExpression :
    ConditionalExpression
    YieldExpression
    ArrowFunction
    LeftHandSideExpression = AssignmentExpression
    LeftHandSideExpression AssignmentOperator AssignmentExpression

  2. This needs to be adjusted, but I'm not clear now.

    // try ArrowFunction grammar
    pos = njs_lexer_match(lexer, ')');
    if (pos > 0 && njs_lexer_peek_token(lexer, pos + 1) == '=>') {
    token = njs_parser_arrow_expression(vm, parser, token);
    ....
    }

  3. I use array now.

    looks at preread buffer first, if it is not empty, remove the first token (preread[0]) from it and returns the token. (It seems, we need a queue here, not an array)

We need to consider /, we need to stop preread if meet / and END. It also happens in njs_lexer_match.

See tomorrow :)

some tests:

var f = () => { return this; }; f() == this;
var o = {}, f = () => { return this; }; f.call(o) == this;
var o = {}, f = () => { return this; }; f.bind(o)() == this;
var o = { f: function() { return () => { return this; } } }; o == o.f()()
var a = {}, o = { f: function() { return () => { return this; }; } }; a == o.f.call(a)()

update:

> var f = () => { return arguments; }
undefined
> f(1,2,3)
ReferenceError: arguments is not defined
    at f (repl:1:17)
> var ff = function() { return f(); }
undefined
> ff(1,2,3)
ReferenceError: arguments is not defined
    at f (repl:1:17)
    at ff (repl:1:30)
> var fff = function() { return () => { return arguments; }; }
undefined
> fff(1,2,3)()
[Arguments] { '0': 1, '1': 2, '2': 3 }

@xeioex

Updated.
https://gist.github.com/hongzhidao/31e9fc344968b879c59882d93d79a721

And it's better to implement these formats first.

(param1, param2, …, paramN) => { statements } 

// Parentheses are optional when there's only one parameter name:
(singleParam) => { statements }
singleParam => { statements }

// The parameter list for a function with no parameters should be written with a pair of parentheses.
() => { statements }

Then I'll consider the format.

(param1, param2, …, paramN) => expression
// equivalent to: => { return expression; }

@drsm thanks.
Now, I focus on the parser correctness, I'll improve this feature by your tests.

BTW

  1. which CLI do you use?
> var f = () => { return arguments; }
undefined
> f(1,2,3)
ReferenceError: arguments is not defined
    at f (repl:1:17)
> var ff = function() { return f(); }
undefined
> ff(1,2,3)
ReferenceError: arguments is not defined
    at f (repl:1:17)
    at ff (repl:1:30)
> var fff = function() { return () => { return arguments; }; }
undefined
> fff(1,2,3)()
[Arguments] { '0': 1, '1': 2, '2': 3 }
  1. Are the variables of this and arguments special in arrow function? @xeioex

@hongzhidao

Are the variables of this and arguments special in arrow function? @xeioex

yes, see 4. and 5.

@hongzhidao
I'm using:

$ node --version
v10.15.2
  1. Are the variables of this and arguments special in arrow function?

Yes, they came from an outer scope, and also this can't be changed in runtime by using call, apply, bind on an arrow function.

@xeioex

Are the variables of this and arguments special in arrow function? @xeioex
yes, see 4. and 5.

The same question in module.
Are the variables of this and arguments special in module?

@hongzhidao

The issue with this and arguments in modules is much less nagging. But for arrow functions correct behavior of this is paramount.

@xeioex

1.njs_lexer_token()

looks at preread buffer first, if it is not empty, remove the first token (preread[0]) from it and returns the token. (It seems, we need a queue here, not an array)
otherwise the current behavior.

In LL(k), I think the preread tokens should be always exists.
njs_lexer_token and njs_lexer_peek_token take token from the preread buffers.

2.The njs_lexer_match() and the newly parameter offset in njs_lexer_peek_token() look good, I like it.

3.What I'm not sure are the related fileds (token, token_line, text, etc), they look handy but kind of redundancy. And now I use the last preread token in njs_lexer_next_token().

@hongzhidao

The same question in module.
Are the variables of this and arguments special in module?

if we define a module using () => { /*code*/ }, there will be no problem with arguments.
and global this should be set to undefined

@hongzhidao

In LL(k), I think the preread tokens should be always exists.

In this case, at what point preread is populated?

@hongzhidao

In LL(k), I think the preread tokens should be always exists.

In this case, at what point preread is populated?

Sorry, I can't get it.

njs_lexer_token()
otherwise the current behavior.

I get it now, it seems ok, but discuss this later.

The key point is whether change njs_lexer_next_token() (now setting the result of lexer in last the preread token) and still use the fields (token, token_line, text, number, etc) of njs_lexer_t in njs_parser.c.

See this:

diff -r 1f13e316a93c njs/njs_parser.h
--- a/njs/njs_parser.h  Wed Mar 06 23:11:45 2019 +0800
+++ b/njs/njs_parser.h  Thu Mar 07 01:13:01 2019 +0800
@@ -221,11 +221,7 @@ typedef struct {
     size_t                          pos;
     nxt_array_t                     *preread;

-    njs_token_t                     token:16;
-    uint32_t                        token_line;
-    uint32_t                        key_hash;
-    nxt_str_t                       text;
-    double                          number;
+    njs_lexer_token_t               *token;  // points to one of preread tokens.

     uint8_t                         property;      /* 1 bit */
     njs_token_t                     property_token:16;

And rename njs_token_t as njs_token_type_t, njs_lexer_token_t as njs_token_t.

@hongzhidao

The key point is whether change njs_lexer_next_token() (now setting the result of lexer in last the preread token) and still use the fields (token, token_line, text, number, etc) of njs_lexer_t in njs_parser.c.

Agree. I do not like these fields (as lexer properties) too.

-    njs_token_t                     token:16;
-    uint32_t                        token_line;
-    uint32_t                        key_hash;
-    nxt_str_t                       text;
-    double                          number;
+    njs_lexer_token_t               *token;  // points to one of preread tokens.

looks good.

And rename njs_token_t as njs_token_type_t, njs_lexer_token_t as njs_token_t.

I am not sure. If njs_lexer_token_t is mostly used internally inside lexer functions, it is OK not to renamed it to njs_token_t.

@xeioex

Take a look.

diff -r bfd57048376e njs/njs_parser.c
--- a/njs/njs_parser.c  Wed Mar 06 19:39:21 2019 +0300
+++ b/njs/njs_parser.c  Thu Mar 07 01:55:11 2019 +0800
@@ -77,6 +77,12 @@ static njs_token_t njs_parser_unexpected
 #define njs_parser_chain_top_set(parser, node)                      \
     (parser)->scope->top = node

+#define njs_parser_text(parser)                                     \
+    &(parser)->lexer->text
+
+#define njs_parser_key_hash(parser)                                     \
+    (parser)->lexer->key_hash
+

 nxt_int_t
 njs_parser(njs_vm_t *vm, njs_parser_t *parser, njs_parser_t *prev)
@@ -515,8 +521,8 @@ njs_parser_labelled_statement(njs_vm_t *
     njs_token_t     token;
     njs_variable_t  *label;

-    name = parser->lexer->text;
-    hash = parser->lexer->key_hash;
+    name = *njs_parser_text(parser);
+    hash = njs_parser_key_hash(parser);

     label = njs_label_find(vm, parser->scope, &name, hash);
     if (nxt_slow_path(label != NULL)) {
@@ -615,7 +621,7 @@ njs_parser_function_declaration(njs_vm_t
         if (token == NJS_TOKEN_ARGUMENTS || token ==  NJS_TOKEN_EVAL) {
             njs_parser_syntax_error(vm, parser, "Identifier \"%V\" "
                                     "is forbidden in function declaration",
-                                    &parser->lexer->text);
+                                    njs_parser_text(parser));
         }

         return NJS_TOKEN_ILLEGAL;
  1. Introduce some macros with these fields (token, ... number).
  2. Remove these fields from njs_lexer_t and add newly field lexer_token(even name it as token) to njs_lexer_t.
  3. Redefine njs_parser_{field}.

What do you think?

@hongzhidao

What do you think?

I like it! Looks good.

Maybe this can be a separate patch.
Will do.

@xeioex

Introduced parser macro.
https://gist.github.com/hongzhidao/daec3bd35b5dbdb264b27b104232673e

  1. It's better to process the patch after arrow function.
  2. It seems need keep njs_token_t for parser and introduce njs_lexer_token_t for LL(k).
  3. The fields (token, ... number) will be reserved in njs_lexer_next_token, then assgin to the last preread token, thus we only do the minumun change to add arrow function.

@xeioex

I think we can achieve arrow function by refactor.

https://gist.github.com/hongzhidao/e556fa5de2957624b48cbf7e8e2ccbec

  1. Introduced njs_lexer_token_t.
    This structure records the information related to the current token, be used in njs_parser.c.

  2. Improved njs_lexer_token in lexer.
    Allocated on heap, it'll be more flexible since we'll introduce LL(k).

So far, it works well in LL(1) mode.

  1. introduced preread tokens.

  2. improved njs_lexer_peek_token().

So far, the LL(k) is introduced.

  1. introduced njs_lexer_match().

  2. introduced arrow function.

@hongzhidao

  1. why not eliminate lexer->token complelety? and renamed lexer->lexer_token -> lexer->token ?

  2. double check for vm->parser
    https://gist.github.com/hongzhidao/e556fa5de2957624b48cbf7e8e2ccbec#file-ll-k-beta-patch-L655

+    parser = vm->parser;
+
+    if (parser != NULL && parser->lexer != NULL) {
+        lexer = parser->lexer;
  1. LL(k): introduced njs_lexer_token. except for 1 and style changes, looks good.
  1. why not eliminate lexer->token complelety? and renamed lexer->lexer_token -> lexer->token ?

I don't know how to process prev_token without lexer->token.

  1. double check for vm->parser

Agree.

  1. LL(k): introduced njs_lexer_token. except for 1 and style changes, looks good.

Style, will do.

@xeioex

Take a look again.

LL(k): introduced njs_lexer_token.
https://gist.github.com/hongzhidao/c0519a123190c56c3fff53d58167a6f5

@hongzhidao

1.

I don't know how to process prev_token without lexer->token.

why not make prev_token and token of type njs_lexer_token_t *?

2.

typedef struct {                                                                                                                 
    uint32_t                        token_line;                                                                                  
    uint32_t                        key_hash;                                                                                    
    nxt_str_t                       text;                                                                                        
    double                          number;                                                                                      
} njs_lexer_token_t;  

why it does not have njs_token_t token property?

  1. https://gist.github.com/hongzhidao/e556fa5de2957624b48cbf7e8e2ccbec#file-ll-k-beta-patch-L798

BTW, sizeof(njs_lexer_t) -> sizeof(njs_lexer_token_t)

+    lexer->lexer_token = nxt_mp_alloc(vm->mem_pool, sizeof(njs_lexer_t));
+    if (nxt_slow_path(lexer->lexer_token == NULL)) {
+        return NXT_ERROR;
+    }
  1. to support njs_lexer_rollback()
    u_char *start can be added to njs_lexer_token_t

so

#define njs_lexer_rollback(lexer)                                       \
    (lexer)->start = (lexer)->prev_token->start

@hongzhidao

https://gist.github.com/xeioex/a061bac7532b22dac829af2b420f1f33

  1. Indentation fixes
  2. error: variable ‘lexer’ set but not used
--- a/njs/njs_parser.c
+++ b/njs/njs_parser.c
@@ -1891,11 +1891,8 @@ njs_parser_terminal(njs_vm_t *vm, njs_pa
 {
     double             num;
     njs_ret_t          ret;
-    njs_lexer_t        *lexer;
     njs_parser_node_t  *node;

-    lexer = parser->lexer;

@hongzhidao

In order not to make double changes to njs_lexer_t struct, I suggest to merge the first patch with the second LL(k): Improved njs_lexer_token in lexer.

@hongzhidao

In order not to make double changes to njs_lexer_t struct, I suggest to merge the first patch with the second LL(k): Improved njs_lexer_token in lexer.

OK.

So the logic is:

  1. Record the infomation (including token) corresponding to the current token to the lexer->token with type njs_lexer_token_t.
  2. Use lexer->token in njs_parser.c.

@hongzhidao

Record the infomation (including token) corresponding to the current token to the lexer->token with type njs_lexer_token_t.

Yes.

Use lexer->token in njs_parser.c.

Yes, but preferably indirectly through njs_parser_token() which should still return njs_token_t.

@xeioex

For njs_lexer_token_t, my main idea is it should be generic, can be used in LL(1) and LL(k).
I agree that put token(njs_token_t) in njs_lexer_token_t, but prev_token seems better to put in njs_lexer_t, since in LL(k) it will be curious.
And I prefer lexer->lexer_token, if renamed lexer->token, we get token with lexer->token->token?
Is there any better naming? lexer->xxx->token.

@hongzhidao

I agree that put token(njs_token_t) in njs_lexer_token_t, but prev_token seems better to put in njs_lexer_t, since in LL(k) it will be curious.

Yes, prev_token should be in njs_lexer_t, but my point is, it should be also type njs_lexer_token_t *, to support rollback operations.

And I prefer lexer->lexer_token, if renamed lexer->token, we get token with lexer->token->token?
Is there any better naming? lexer->xxx->token.

njs_token_t in njs_lexer_token_t can be named as type. lexer->token is not used directly outside njs_lexer.c

njs_token_t in njs_lexer_token_t can be named as type. lexer->token is not used directly outside njs_lexer.c

Very nice.

typedef struct {
    njs_token_t                     type:16;
    uint32_t                        token_line;
    uint32_t                        key_hash;
    nxt_str_t                       text;
    double                          number;
} njs_lexer_token_t;


typedef struct {
    njs_token_t                     prev_token:16;
    njs_lexer_token_t               token;

    uint8_t                         property;      /* 1 bit */
    njs_token_t                     property_token:16;

    uint32_t                        line;
    nxt_str_t                       file;

    nxt_lvlhsh_t                    keywords_hash;

    u_char                          *start;
    u_char                          *prev_start;
    u_char                          *end;
} njs_lexer_t;

Right?

@hongzhidao

Right?

Almost. I see it this way:

typedef struct {
    njs_token_t                     type:16;
    u_char                          *start; // to support start rollback, but maybe text is enough, if text always points to lexer->start buff.
    uint32_t                        line;
    uint32_t                        key_hash;
    nxt_str_t                       text;
    double                          number;
} njs_lexer_token_t;


typedef struct {
    njs_lexer_token_t           *prev_token;
    njs_lexer_token_t           *token;

    uint8_t                             property;      /* 1 bit */
    njs_lexer_token_t            *property_token; // ?? maybe njs_token_t is enough, ideally it should be removed at all

    uint32_t                        line;
    nxt_str_t                       file;

    nxt_lvlhsh_t                    keywords_hash;

    u_char                          *start;
   // u_char                          *prev_start; removed
    u_char                          *end;
} njs_lexer_t;

// to support start rollback, but maybe text is enough, if text always points to lexer->start buff.
// ?? maybe njs_token_t is enough, ideally it should be removed at all

I think these can be processed well, on it.

@xeioex

It's still unclear, since token is full in njs_parser.c and njs_lexer.c, I plan to name it back, and we discuss this later.

njs_token_t                     type:16;
njs_lexer_token_t           *token;
typedef struct {
    njs_token_t                     type:16;
    u_char                          *start; // to support start rollback, but maybe text is enough, if text always points to lexer->start buff.
    uint32_t                        line;
    uint32_t                        key_hash;
    nxt_str_t                       text;
    double                          number;
} njs_lexer_token_t;


typedef struct {
    njs_lexer_token_t           *prev_token;
    njs_lexer_token_t           *token;

    uint8_t                             property;      /* 1 bit */
    njs_lexer_token_t            *property_token; // ?? maybe njs_token_t is enough, ideally it should be removed at all

    uint32_t                        line;
    nxt_str_t                       file;

    nxt_lvlhsh_t                    keywords_hash;

    u_char                          *start;
   // u_char                          *prev_start; removed
    u_char                          *end;
} njs_lexer_t;

We need to keep the word token correspond to njs_token_t.

@xeioex

Introduced njs_lexer_token_t.
https://gist.github.com/hongzhidao/c0519a123190c56c3fff53d58167a6f5

  1. lexer->prev_token
    It uses for reserve the last token involked by njs_lexer_token(). (njs_lexer_peerk_token does not need it.)So I think it belongs to njs_lexer_t. And it's helpful to see what's the previous token in parser.

  2. lexer->prev_start
    It uses for reserver the last positon parsed. And it's enough to rollback.

  3. lexer->property_token
    It uses for record another information about the current token. It's needed also.

  4. njs_lexer_rollback()
    It only do the thing lexer->start = lexer->prev_start, prev_token is unused.

  5. naming convention: lexer_token: njs_lexer_token_t, token: njs_token_t.

So as far, we only refactored out njs_lexer_token_t from njs_lexer_t.

@hongzhidao

https://gist.github.com/hongzhidao/c0519a123190c56c3fff53d58167a6f5#file-ll-k-wip-patch-L275

lt->text.start = lt->text.start;

Otherwise looks good. I plan to commit it today.

@hongzhidao

https://gist.github.com/xeioex/63321f3761821aaa0e6ddd6305ee700d

I plan to commit this patch (also moved lexer parts to njs_lexer.h).

@hongzhidao

typedef struct {                                                                                                                 
    njs_lexer_token_t               *lexer_token;          // the main field, should be first                                                                       

    njs_token_t                     prev_token:16; // moved prev fields closer together                                                                              
    u_char                          *prev_start;                                                                                 

    uint8_t                         property;      /* 1 bit */                                                                   
    njs_token_t                     property_token:16;                                                                           

    uint32_t                        line;                                                                                        
    nxt_str_t                       file;                                                                                        

    nxt_lvlhsh_t                    keywords_hash;                                                                               

    u_char                          *start;                                                                                      
    u_char                          *end;

@xeioex

Ready to introduce LL(k).

njs_lexer_token()
looks at preread buffer first, if it is not empty, remove the first token (preread[0]) from it and returns the token. (It seems, we need a queue here, not an array)
otherwise the current behavior.

I also like queue to manage preread tokens.
But I can't find a better way to access token by offset. (queue_each + count?)

My rough idea.

  1. The following works well, we needn't change their structure.
typedef struct {
    njs_token_t                     token:16;
    uint32_t                        token_line;
    uint32_t                        key_hash;
    nxt_str_t                       text;
    double                          number;
} njs_lexer_token_t;

typedef struct {
    njs_lexer_token_t               *lexer_token;
  1. Introduced a new structure to manage [token/preread]
typedef struct {
    njs_lexer_token_t  *token;
    nxt_array_t        *preread;
} njs_lexer_buffer_t;

njs_lexer_token:
    if (is_not_empty(preread)) {
        lexer->lexer_token = shift(preread);
    } else {
        lexer->lexer_token = token;
    }

njs_lexer_peek_token(offset):
   if (is_enough_for_offset(preread)) {
      multi_array_add(preread);
   }
   lt = preread[offset];

Here's the patch.
https://gist.github.com/hongzhidao/8a86bc967d0daa9879130feedf4b31e5

I think we can separate a patch related to njs_lexer_peek_token(..., offset) before introducing arrow function.

@xeioex

https://gist.github.com/xeioex/63321f3761821aaa0e6ddd6305ee700d
I plan to commit this patch (also moved lexer parts to njs_lexer.h).

It seems you forgot to add njs_lexer.h into Makefile.

@hongzhidao
https://gist.github.com/hongzhidao/8a86bc967d0daa9879130feedf4b31e5

I also like queue to manage preread tokens.
But I can't find a better way to access token by offset. (queue_each + count?)

I think it is OK. Queue is expected to be short and we do compilation only once.

2.

 typedef struct {
+    njs_lexer_token_t               token;
+    nxt_queue_t                     preread;  /* of njs_lexer_token_t */
+} njs_lexer_buffer_t;

why not simply use nxt_queue_t preread directly in njs_lexer_t?

Cannot we eliminate njs_lexer_token_t token also?

  1. njs_parser_try_arrow_function() -> njs_parser_match_arrow_function() ?

  2. more tests is needed for
    a) partial match (), ()=>, ()=>{
    b) this tests
    c) arguments and rest parameters tests
    d) no braces, no return support ()=>1, ()=>({})

njs_parser_try_arrow_function() -> njs_parser_match_arrow_function() ?

How about njs_parser_match_arrow_expression?

why not simply use nxt_queue_t preread directly in njs_lexer_t?

Agree.

Cannot we eliminate njs_lexer_token_t token also?

Not sure. On it.

Have any better naming? (token/preread)

Not sure. On it.

Looking at njs_lexer_token(njs_vm_t *vm, njs_lexer_t *lexer) currently too much special code.

To make it more regular, probably we need to ensure that preread queue is always populated by at least one token.

So checks like if (nxt_queue_is_empty(&buffer->preread)) can be removed.

@xeioex
Do you think this is an issue. It works well in LL(1), but not in LL(k).

njs_parser_object(njs_vm_t *vm, njs_parser_t *parser, njs_parser_node_t *obj)
{
    ...
    for ( ; ; ) {
        token = njs_parser_property_token(vm, parser);

        name.start = NULL;

        switch (token) {

        case NJS_TOKEN_CLOSE_BRACE:
            return njs_parser_token(parser);

        case NJS_TOKEN_NAME:
            name = *njs_parser_text(parser);
            hash = njs_parser_key_hash(parser);
            /* need token_line = njs_parser_token_line(parser) ?*/
            /* i: lexer_token1 */

            token = njs_parser_token(parser);    /* ii: lexer_token2->token_line == 0 */
            break;

        case NJS_TOKEN_NUMBER:
        case NJS_TOKEN_STRING:
        case NJS_TOKEN_ESCAPE_STRING:
            token = njs_parser_terminal(vm, parser, token);
            break;
        }
        ...
        if (name.start != NULL
            && (token == NJS_TOKEN_COMMA || token == NJS_TOKEN_CLOSE_BRACE)
            && lexer->property_token != NJS_TOKEN_THIS
            && lexer->property_token != NJS_TOKEN_GLOBAL_THIS)
        {
            expression = njs_parser_reference(vm, parser, lexer->property_token,
                                              &name, hash);  /* iii: used the lexer_token2->token_line */
            if (nxt_slow_path(expression == NULL)) {
                return NJS_TOKEN_ERROR;
            }

If yes, we need add token_line as a parameter of njs_parser_reference.

  1. It seems you forgot to add njs_lexer.h into Makefile.

  2. token_line.

diff -r 8e2cb4da5e46 njs/njs_parser.c
--- a/njs/njs_parser.c  Fri Mar 08 00:11:23 2019 +0800
+++ b/njs/njs_parser.c  Fri Mar 08 17:21:12 2019 +0800
@@ -53,7 +53,8 @@ static njs_token_t njs_parser_throw_stat
 static njs_token_t njs_parser_grouping_expression(njs_vm_t *vm,
     njs_parser_t *parser);
 static njs_parser_node_t *njs_parser_reference(njs_vm_t *vm,
-    njs_parser_t *parser, njs_token_t token, nxt_str_t *name, uint32_t hash);
+    njs_parser_t *parser, njs_token_t token, nxt_str_t *name, uint32_t hash,
+    uint32_t token_line);
 static nxt_int_t njs_parser_builtin(njs_vm_t *vm, njs_parser_t *parser,
     njs_parser_node_t *node, njs_value_type_t type, nxt_str_t *name,
     uint32_t hash);
@@ -2038,7 +2039,8 @@ njs_parser_terminal(njs_vm_t *vm, njs_pa
     default:
         node = njs_parser_reference(vm, parser, token,
                                     njs_parser_text(parser),
-                                    njs_parser_key_hash(parser));
+                                    njs_parser_key_hash(parser),
+                                    njs_parser_token_line(parser));

         if (nxt_slow_path(node == NULL)) {
             return NJS_TOKEN_ERROR;
@@ -2055,7 +2057,7 @@ njs_parser_terminal(njs_vm_t *vm, njs_pa

 static njs_parser_node_t *
 njs_parser_reference(njs_vm_t *vm, njs_parser_t *parser, njs_token_t token,
-    nxt_str_t *name, uint32_t hash)
+    nxt_str_t *name, uint32_t hash, uint32_t token_line)
 {
     njs_ret_t           ret;
     njs_value_t         *ext;
@@ -2217,7 +2219,7 @@ njs_parser_reference(njs_vm_t *vm, njs_p
     case NJS_TOKEN_NAME:
         nxt_thread_log_debug("JS: %V", name);

-        node->token_line = njs_parser_token_line(parser);
+        node->token_line = token_line;

         ext = njs_external_lookup(vm, name, hash);

@@ -2301,7 +2303,7 @@ njs_parser_builtin(njs_vm_t *vm, njs_par
 static njs_token_t
 njs_parser_object(njs_vm_t *vm, njs_parser_t *parser, njs_parser_node_t *obj)
 {
-    uint32_t           hash;
+    uint32_t           hash, token_line;
     nxt_str_t          name;
     njs_token_t        token;
     njs_lexer_t        *lexer;
@@ -2333,6 +2335,7 @@ njs_parser_object(njs_vm_t *vm, njs_pars
         case NJS_TOKEN_NAME:
             name = *njs_parser_text(parser);
             hash = njs_parser_key_hash(parser);
+            token_line = njs_parser_token_line(parser);

             token = njs_parser_token(parser);
             break;
@@ -2365,7 +2368,7 @@ njs_parser_object(njs_vm_t *vm, njs_pars
             && lexer->property_token != NJS_TOKEN_GLOBAL_THIS)
         {
             expression = njs_parser_reference(vm, parser, lexer->property_token,
-                                              &name, hash);
+                                              &name, hash, token_line);
             if (nxt_slow_path(expression == NULL)) {
                 return NJS_TOKEN_ERROR;
             }

@xeioex

Updated.
https://gist.github.com/hongzhidao/8a86bc967d0daa9879130feedf4b31e5

why not simply use nxt_queue_t preread directly in njs_lexer_t?
Cannot we eliminate njs_lexer_token_t token also?
njs_parser_try_arrow_function() -> njs_parser_match_arrow_function() ?

Done.

more tests is needed for

Will do.

Check the design in njs_lexer.c first.

@hongzhidao
1.

It seems you forgot to add njs_lexer.h into Makefile.

Thanks, will be fixed. BTW, many dependencies are broken in Makefile. Is a separate task.

  1. I would split arrow function changes into a separate patch.

  2. 3.
>> var materials = [ 'adas', 'wqwqwqwqw']
undefined
>> materials.map(material => material.length)
SyntaxError: Unexpected token "material" in shell:1
>> materials.map(material => {material.length})
[
 undefined,
 undefined
]
>> materials.map(material => {return material.length})
[
 4,
 9
]
  1. https://gist.github.com/hongzhidao/8a86bc967d0daa9879130feedf4b31e5#file-llk-03-08-patch-L122

trailing spaces in njs_lexer_token_push()

  1. njs_lexer_token_resolve() , name is too generic, I think.
    How about this:
if (lt->token == NJS_TOKEN_NAME) {
     return njs_lexer_token_name_resolve(lexer, lt);
}

return lt->token;

  1. > If yes, we need add token_line as a parameter of njs_parser_reference.

I am OK with it.


  1. > Check the design in njs_lexer.c first.

In general, I like the current design more. It looks simple.

@xeioex

BTW, many dependencies are broken in Makefile. Is a separate task.

What about referencing to UNIT.

-MMD -MF $NXT_BUILD_DIR/$nxt_dep -MT $NXT_BUILD_DIR/$nxt_obj

BTW, this can be a separate task. We can refactor nxt/auto close to the way of UNIT.

var materials = [ 'adas', 'wqwqwqwqw']

Still more work on njs_parser_arrow_expression, now we focus on njs_lexer.c first.

@hongzhidao
1.

It seems you forgot to add njs_lexer.h into Makefile.

Thanks, will be fixed. BTW, many dependencies are broken in Makefile. Is a separate task.

  1. I would split arrow function changes into a separate patch.
>> var materials = [ 'adas', 'wqwqwqwqw']
undefined
>> materials.map(material => material.length)
SyntaxError: Unexpected token "material" in shell:1
>> materials.map(material => {material.length})
[
 undefined,
 undefined
]
>> materials.map(material => {return material.length})
[
 4,
 9
]
  1. https://gist.github.com/hongzhidao/8a86bc967d0daa9879130feedf4b31e5#file-llk-03-08-patch-L122

trailing spaces in njs_lexer_token_push()

  1. njs_lexer_token_resolve() , name is too generic, I think.
    How about this:
if (lt->token == NJS_TOKEN_NAME) {
     return njs_lexer_token_name_resolve(lexer, lt);
}

return lt->token;

If yes, we need add token_line as a parameter of njs_parser_reference.

I am OK with it.

Check the design in njs_lexer.c first.

In general, I like the current design more. It looks simple.

@xeioex Help do these, I'm on njs_parser_arrow_expression().

njs_lexer_peek_token(njs_vm_t *vm, njs_lexer_t *lexer, size_t offset)
{
    size_t             i;
    nxt_queue_link_t   *link;
    njs_lexer_token_t  *lt;

    /* GCC and Clang complain about uninitialized lt. */
    lt = NULL;

    link = nxt_queue_first(&lexer->preread);

    for (i = 0; i <= offset; i++) {

        if (link != nxt_queue_tail(&lexer->preread)) {

            lt = nxt_queue_link_data(link, njs_lexer_token_t, link);

            if (lt->token == NJS_TOKEN_DIVISION || lt->token == NJS_TOKEN_END) {
                break;
            }

            link = nxt_queue_next(link);

        } else {

            lt = njs_lexer_token_push(vm, lexer);

            if (nxt_slow_path(lt == NULL)) {
                return NJS_TOKEN_ERROR;
            }
        }
    }

    return njs_lexer_token_resolve(lexer, lt);
}

Note:
Stop peeking while meet DIVISION and END, such as /abcd/gi (this's a regexp).

@xeioex

  1. return value
    > Note that an arrow function with a block body does not automatically return a value. Use a return statement for that.

For return value, what's the difference between arrow function and normal function?

  1. this
    > There is one subtle difference in behavior between ordinary function functions and arrow functions. Arrow functions do not have their own this value. The value of this inside an arrow function is always inherited from the enclosing scope.

There are four type functions: a. declare function b. function expression c. arrow function d. module.
Right?

Actually, we can map function to scope. So we need add flags to difference arrow/module from the others, such as. (even use flags (bits of argument_closures, arrow, module)).

struct njs_parser_scope_s {
    ...
    njs_scope_t                     type:8;
    uint8_t                         nesting;     /* 4 bits */
    uint8_t                         argument_closures;
    unit8_t                         arrow;
    unit8_t                         module;
};
case NJS_TOKEN_THIS:
        nxt_thread_log_debug("JS: this");

        scope = parser->scope;

        while (scope->type != NJS_SCOPE_GLOBAL) {
            /* TODO flags & arrow, flags & module */
            if (scope->type == NJS_SCOPE_FUNCTION) {
                node->index = NJS_INDEX_THIS;
                break;
            }

            scope = scope->parent;
        }

        if (node->index == NJS_INDEX_THIS) {
            break;
        }

        node->token = NJS_TOKEN_GLOBAL_THIS;

What do you think of this?

  1. arguments
    > There’s one more minor difference between arrow and non-arrow functions: arrow functions don’t get their own arguments object, either. Of course, in ES6, you’d probably rather use a rest parameter or default value anyway.
    > Yes, they came from an outer scope, and also this can't be changed in runtime by using call, apply, bind on an arrow function.

The same as this?

@xeioex Take a look, please.

Refactored njs_parser_function_lambda().
https://gist.github.com/hongzhidao/75b0ac9e0a7bc0026fb1e1e40679f5a7

Consider this patch before introducing module and arrow-function patches.

@hongzhidao

Note that an arrow function with a block body does not automatically return a value. Use a return statement for that.

For return value, what's the difference between arrow function and normal function?

From this

(param1, param2, …, paramN) => expression
// equivalent to: => { return expression; }

so

// should work
>> materials.map(material => material.length)
SyntaxError: Unexpected token "material" in shell:1
// is correct behaviour
>> materials.map(material => {material.length})
[
 undefined,
 undefined
]

  1. >Actually, we can map function to scope. So we need add flags to difference arrow/module from the others, such as. (even use flags (bits of argument_closures, arrow, module)).

Agree.


  1. > /* TODO flags & arrow, flags & module */
    > What do you think of this?

I am not sure yet, maybe it is better to handle in njs_function_lambda_call()?

4.

The same as this?

Yes.

@xeioex

See the draft of arrow function first for better decide whether refactor njs_parser_function_lambda().

Refactored njs_parser_function_lambda().
https://gist.github.com/hongzhidao/75b0ac9e0a7bc0026fb1e1e40679f5a7

arrow function.
https://gist.github.com/hongzhidao/f4bf47ad8af2d878d3217eba96549a3c

What I'm not sure yet is the index of this, and arguments. I need deep into this, arguments, apply, call. It's not familiar enough for me now.
@drsm welcome to add tests for these cases :)

(param1, param2, …, paramN) => expression
// equivalent to: => { return expression; }

No problem, I get it.

@hongzhidao

See the draft of arrow function first for better decide whether refactor njs_parser_function_lambda().

Agree, looks useful. Will be committed.

@xeioex

@hongzhidao

See the draft of arrow function first for better decide whether refactor njs_parser_function_lambda().

Agree, looks useful. Will be committed.

  1. njs_parser_lambda_arguments
static njs_token_t
njs_parser_lambda_arguments(njs_vm_t *vm, njs_parser_t *parser,
    njs_function_lambda_t *lambda, njs_index_t index, njs_token_t token)

Please look carefully with argument index, maybe it's affected by index, arguments.

  1. njs_parser_lambda_argument
    Is it clear enough, is it better to add while introducing arrow function?

  2. Unused line
    https://gist.github.com/hongzhidao/75b0ac9e0a7bc0026fb1e1e40679f5a7#file-function-lambda-patch-L131

  3. What about njs_parser_lambda_statement? (njs_parser_lambda_statements)

  4. Help rework module patch.

@hongzhidao

please, share the updated hg changeset.

njs_parser_lambda_arguments
njs_parser_lambda_argument

looks good.

What about njs_parser_lambda_statement? (njs_parser_lambda_statements)

BTW, statement is not a good word here, I think. Statement should refer only to a complete statement, not a part of it.

Maybe we should join njs_parser_lambda_statements() + njs_parser_lambda_body() => njs_parser_lambda_body()?
Can they be used independently?

Help rework module patch.

Will do.

Maybe we should join njs_parser_lambda_statements() + njs_parser_lambda_body() => njs_parser_lambda_body()?
Can they be used independently?

In njs_parser_function_lambda.
njs_parser_lambda_statements() + njs_parser_lambda_body()

In njs_parser_module.
njs_parser_lambda_statements()

In njs_parser_arrow_expression.
njs_parser_lambda_statements() + njs_parser_lambda_body()

So, what about njs_parser_lambda_body(return_flag), such as

njs_parser_lambda_body(return) {
     njs_parser_lambda_statements();
     if (return) {
         njs_parser_lambda_return();
     }
}

@hongzhidao

So, what about njs_parser_lambda_body(return_flag)?

Looks good.

@xeioex

Make njs_parser_lambda_body() public, even macro?
(I think macro is not good.)

@hongzhidao

Make njs_parser_lambda_body() public, even macro?

agree, inline function?

@xeioex

Updated function lambda patch.
https://gist.github.com/hongzhidao/75b0ac9e0a7bc0026fb1e1e40679f5a7

But it's better to release the patch after arrow-function and module patches.
We try it first. I'm on arrow function. You try this in module patch.

arrow function patch
https://gist.github.com/hongzhidao/f4bf47ad8af2d878d3217eba96549a3c

Help rework module patch.
Will do.

I make njs_parser_lambda_statements() public.

Make njs_parser_lambda_body() public, even macro?

It's not clear.

@xeioex

I think these patches need to be splited first.

  1. lambda function
  2. njs_parser_reference
  3. add newly parameters.
    njs_lexer_token(vm, parser), njs_parser_token(vm, parser), njs_lexer_peek_token(vm, parser, offset)
  4. njs_parser_return_default_set

@xeioex

Another improvement.
https://gist.github.com/hongzhidao/f4bf47ad8af2d878d3217eba96549a3c#file-arrow-function-03-10-patch-L436
https://gist.github.com/hongzhidao/f4bf47ad8af2d878d3217eba96549a3c#file-arrow-function-03-10-patch-L398

Support single expression in arrow function. Better naming with njs_parser_return_default_set is welcome.

static njs_parser_node_t *
njs_parser_return_default_set(njs_vm_t *vm, njs_parser_t *parser,
   njs_parser_node_t *expr)
{
    njs_parser_node_t  *node, *stmt;

    node = njs_parser_node_new(vm, parser, NJS_TOKEN_RETURN);
    if (nxt_slow_path(node == NULL)) {
        return NULL;
    }

    node->right = expr;

    stmt = njs_parser_node_new(vm, parser, NJS_TOKEN_STATEMENT);
    if (nxt_slow_path(stmt == NULL)) {
        return NULL;
    }

    stmt->left = njs_parser_chain_top(parser);
    stmt->right = node;

    njs_parser_chain_top_set(parser, stmt);

    return stmt;
}

@hongzhidao

Make njs_parser_lambda_body() public, even macro

So what about return flag? Is it necessary?

@hongzhidao

Make njs_parser_lambda_body() public, even macro

So what about return flag? Is it necessary?

Unnecessary.

The order of patches I suggest.

  1. njs_parser_reference
  2. njs_parser_return_default_set
  3. lambda function

Are you on the work now?

@hongzhidao

The order of patches I suggest.
njs_parser_reference
njs_parser_return_default_set
lambda function
Are you on the work now?

OK. Will do. But, I am periodically interrupted, it can take some time. I plan to commit it today.

@hongzhidao

OK. Will do. But, I am periodically interrupted, it can take some time. I plan to commit it today.
https://gist.github.com/xeioex/90326b661bf7a9f576b1a49c08096bed

Take a look.

  1. njs_parser_reference
  2. njs_parser_return_set + lambda_function patch

I plan to commit it, today.

@hongzhidao

OK. Will do. But, I am periodically interrupted, it can take some time. I plan to commit it today.
https://gist.github.com/xeioex/90326b661bf7a9f576b1a49c08096bed

Take a look.

  1. njs_parser_reference
  2. njs_parser_return_set + lambda_function patch

I plan to commit it, today.

OK, no problem.

@xeioex

I think the next work is improving njs_lexer_peek_token.(even wrap into njs_parser_peek).

@xeioex

Improved njs_lexer_peek_token().
https://gist.github.com/hongzhidao/89a5028056d03599cd2cda24c59046b4

What about this?

As far, we've introduced LL(k) :)

@xeioex

// create a new empty object for each puppy to play with
var chewToys = puppies.map(puppy => {}); // BUG!
var chewToys = puppies.map(puppy => ({})); // ok
Unfortunately, an empty object {} and an empty block {} look exactly the same. The rule in ES6 is that { immediately following an arrow is always treated as the start of a block, never the start of an object. The code puppy => {} is therefore silently interpreted as an arrow function that does nothing and returns undefined.

I can't get it.

  1. Are the following tests wrong?
    { nxt_string("var f = ()=>{}; f()"),
      nxt_string("undefined") },

    { nxt_string("var f = ()=>({}); f()"),
      nxt_string("[object Object]") },
  1. Arrow function patch (this, arguments, parameter-rest unsupported.), check parser first.
    https://gist.github.com/hongzhidao/f4bf47ad8af2d878d3217eba96549a3c

  2. Is this right?

./build/njs
interactive njs 0.3.0

v.<Tab> -> the properties and prototype methods of v.
type console.help() for more information

>>
>> var materials = [ 'adas', 'wqwqwqwqw']
undefined
>> materials.map(material => material.length)
[
 4,
 9
]
>> materials.map(material => {material.length})
[
 undefined,
 undefined
]
>> materials.map(material => {return material.length})
[
 4,
 9
]
>>

@hongzhidao

Are the following tests wrong?

No

Is this right?

yes. You can check it in your browser console. Or, for example, here: https://repl.it/languages/javascript

@drsm @xeioex

arrow-function patch. (parser done)
https://gist.github.com/hongzhidao/f4bf47ad8af2d878d3217eba96549a3c
Welcome to add more tests.

The next step is supporting this, arguments, rest parameter.

@hongzhidao

https://gist.github.com/hongzhidao/89a5028056d03599cd2cda24c59046b4

Looks good.

Minor improvements:
https://gist.github.com/xeioex/20771306a7e568953247aa145e51fd3b

  1. Commit log made more specific.
  2. A commentary about NJS_TOKEN_DIVISION in njs_lexer_peek_token().

@hongzhidao

https://gist.github.com/hongzhidao/89a5028056d03599cd2cda24c59046b4

Looks good.

Minor improvements:
https://gist.github.com/xeioex/20771306a7e568953247aa145e51fd3b

  1. Commit log made more specific.
  2. A commentary about NJS_TOKEN_DIVISION in njs_lexer_peek_token().

Looks better, thanks.

@xeioex

  1. njs_parser_token().
    Make this inline?

  2. njs_parser_peek_token()
    It's better to hide lexer in njs_parser.c.

  3. Refactor out a macro. (used in several places)

if (token == NJS_TOKEN_ARGUMENTS || token ==  NJS_TOKEN_EVAL)
diff -r 4e82cb630c69 njs/njs_parser.c
--- a/njs/njs_parser.c  Sun Mar 10 22:25:59 2019 +0800
+++ b/njs/njs_parser.c  Mon Mar 11 00:02:57 2019 +0800
@@ -375,7 +375,7 @@ njs_parser_statement(njs_vm_t *vm, njs_p
     default:

         if (token == NJS_TOKEN_NAME
-            && njs_lexer_peek_token(vm, parser->lexer, 0) == NJS_TOKEN_COLON)
+            && njs_parser_peek_token(vm, parser, 0) == NJS_TOKEN_COLON)
         {
             return njs_parser_labelled_statement(vm, parser);
         }
@@ -1929,24 +1929,6 @@ njs_parser_property_token(njs_vm_t *vm,


 njs_token_t
-njs_parser_token(njs_vm_t *vm, njs_parser_t *parser)
-{
-    njs_token_t  token;
-
-    do {
-        token = njs_lexer_token(vm, parser->lexer);
-
-        if (nxt_slow_path(token <= NJS_TOKEN_ILLEGAL)) {
-            return token;
-        }
-
-    } while (nxt_slow_path(token == NJS_TOKEN_LINE_END));
-
-    return token;
-}
-
-
-njs_token_t
 njs_parser_terminal(njs_vm_t *vm, njs_parser_t *parser, njs_token_t token)
 {
     double             num;
diff -r 4e82cb630c69 njs/njs_parser.h
--- a/njs/njs_parser.h  Sun Mar 10 22:25:59 2019 +0800
+++ b/njs/njs_parser.h  Mon Mar 11 00:02:57 2019 +0800
@@ -89,10 +89,29 @@ njs_token_t njs_parser_assignment_expres
 njs_token_t njs_parser_terminal(njs_vm_t *vm, njs_parser_t *parser,
     njs_token_t token);
 njs_token_t njs_parser_property_token(njs_vm_t *vm, njs_parser_t *parser);
-njs_token_t njs_parser_token(njs_vm_t *vm, njs_parser_t *parser);
 njs_token_t njs_parser_lambda_statements(njs_vm_t *vm, njs_parser_t *parser,
     njs_token_t token);

+nxt_inline njs_token_t
+njs_parser_token(njs_vm_t *vm, njs_parser_t *parser)
+{
+    njs_token_t  token;
+
+    do {
+        token = njs_lexer_token(vm, parser->lexer);
+
+        if (nxt_slow_path(token <= NJS_TOKEN_ILLEGAL)) {
+            return token;
+        }
+
+    } while (nxt_slow_path(token == NJS_TOKEN_LINE_END));
+
+    return token;
+}
+
+#define njs_parser_peek_token(vm, parser, offset)                   \
+    njs_lexer_peek_token(vm, (parser)->lexer, offset)
+
 njs_variable_t *njs_variable_resolve(njs_vm_t *vm, njs_parser_node_t *node);
 njs_index_t njs_variable_typeof(njs_vm_t *vm, njs_parser_node_t *node);
 njs_index_t njs_variable_index(njs_vm_t *vm, njs_parser_node_t *node);

@hongzhidao

BTW, found an issue on the current tip.

./build/njs
interactive njs 0.3.0

v.<Tab> -> the properties and prototype methods of v.
type console.help() for more information

>>  var x = 2 ;
=================================================================
==8190==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6030000018b1 at pc 0x0000004461c9 bp 0x7ffe804ebd30 sp 0x7ffe804eb4e0
READ of size 2030 at 0x6030000018b1 thread T0
    #0 0x4461c8 in __interceptor_memcpy.part.38 (/home/xeioex/workspace/nginx/nginScript/njs/build/njs+0x4461c8)
    #1 0x60b72e in nxt_vsprintf /home/xeioex/workspace/nginx/nginScript/njs/nxt/nxt_sprintf.c:429
    #2 0x5d6afd in njs_parser_scope_error /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2891:9
    #3 0x5d52d7 in njs_parser_lexer_error /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2921:5
    #4 0x5d74fa in njs_parser_unexpected_token /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2826:9
    #5 0x5d611c in njs_parser_reference /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2304:16
    #6 0x5d11a4 in njs_parser_terminal /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2102:16
    #7 0x5e7647 in njs_parser_call_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:794:17
    #8 0x5e6e95 in njs_parser_post_inc_dec_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:737:13
    #9 0x5e68b2 in njs_parser_inc_dec_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:698:16
    #10 0x5e5c63 in njs_parser_unary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:602:16
    #11 0x5e5661 in njs_parser_exponential_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:525:13
    #12 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #13 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #14 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #15 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #16 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #17 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #18 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #19 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #20 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #21 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #22 0x5e48df in njs_parser_conditional_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:401:13
    #23 0x5e3fe5 in njs_parser_assignment_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:277:13
    #24 0x5e5626 in njs_parser_any_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:266:12
    #25 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #26 0x5e312c in njs_parser_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:204:12
    #27 0x5d7308 in njs_parser_statement /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:402:21
    #28 0x5ce949 in njs_parser_statement_chain /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:301:13
    #29 0x5cd730 in njs_parser /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:155:17
    #30 0x53261b in njs_vm_compile /home/xeioex/workspace/nginx/nginScript/njs/njs/njs.c:250:11
    #31 0x52da8f in njs_process_script /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_shell.c:611:11
    #32 0x52b7be in njs_interactive_shell /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_shell.c:417:9
    #33 0x52ac2c in main /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_shell.c:243:15
    #34 0x7f3f65e5882f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/../csu/libc-start.c:291
    #35 0x41b648 in _start (/home/xeioex/workspace/nginx/nginScript/njs/build/njs+0x41b648)

0x6030000018b1 is located 0 bytes to the right of 17-byte region [0x6030000018a0,0x6030000018b1)
allocated by thread T0 here:
    #0 0x4440d8 in __strdup (/home/xeioex/workspace/nginx/nginScript/njs/build/njs+0x4440d8)
    #1 0x7f3f66a5e624 in readline (/usr/lib/x86_64-linux-gnu/libedit.so.2+0x1d624)
    #2 0x52ac2c in main /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_shell.c:243:15
    #3 0x7f3f65e5882f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/../csu/libc-start.c:291

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/xeioex/workspace/nginx/nginScript/njs/build/njs+0x4461c8) in __interceptor_memcpy.part.38
Shadow bytes around the buggy address:
  0x0c067fff82c0: fd fd fd fd fa fa 00 00 00 00 fa fa fd fd fd fd
  0x0c067fff82d0: fa fa 00 00 00 00 fa fa 00 00 00 00 fa fa fd fd
  0x0c067fff82e0: fd fd fa fa 00 00 00 00 fa fa fd fd fd fd fa fa
  0x0c067fff82f0: 00 00 00 00 fa fa 00 00 00 00 fa fa 00 00 00 00
  0x0c067fff8300: fa fa fd fd fd fd fa fa 00 00 00 00 fa fa fd fd
=>0x0c067fff8310: fd fa fa fa 00 00[01]fa fa fa 00 00 01 fa fa fa
  0x0c067fff8320: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8330: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8340: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8350: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8360: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==8190==ABORTING

@hongzhidao

BTW, found an issue on the current tip.

./build/njs
interactive njs 0.3.0

v.<Tab> -> the properties and prototype methods of v.
type console.help() for more information

>>  var x = 2 ;
=================================================================
==8190==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6030000018b1 at pc 0x0000004461c9 bp 0x7ffe804ebd30 sp 0x7ffe804eb4e0
READ of size 2030 at 0x6030000018b1 thread T0
    #0 0x4461c8 in __interceptor_memcpy.part.38 (/home/xeioex/workspace/nginx/nginScript/njs/build/njs+0x4461c8)
    #1 0x60b72e in nxt_vsprintf /home/xeioex/workspace/nginx/nginScript/njs/nxt/nxt_sprintf.c:429
    #2 0x5d6afd in njs_parser_scope_error /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2891:9
    #3 0x5d52d7 in njs_parser_lexer_error /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2921:5
    #4 0x5d74fa in njs_parser_unexpected_token /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2826:9
    #5 0x5d611c in njs_parser_reference /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2304:16
    #6 0x5d11a4 in njs_parser_terminal /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:2102:16
    #7 0x5e7647 in njs_parser_call_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:794:17
    #8 0x5e6e95 in njs_parser_post_inc_dec_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:737:13
    #9 0x5e68b2 in njs_parser_inc_dec_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:698:16
    #10 0x5e5c63 in njs_parser_unary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:602:16
    #11 0x5e5661 in njs_parser_exponential_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:525:13
    #12 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #13 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #14 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #15 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #16 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #17 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #18 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #19 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #20 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #21 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #22 0x5e48df in njs_parser_conditional_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:401:13
    #23 0x5e3fe5 in njs_parser_assignment_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:277:13
    #24 0x5e5626 in njs_parser_any_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:266:12
    #25 0x5e31e5 in njs_parser_binary_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:470:13
    #26 0x5e312c in njs_parser_expression /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser_expression.c:204:12
    #27 0x5d7308 in njs_parser_statement /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:402:21
    #28 0x5ce949 in njs_parser_statement_chain /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:301:13
    #29 0x5cd730 in njs_parser /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_parser.c:155:17
    #30 0x53261b in njs_vm_compile /home/xeioex/workspace/nginx/nginScript/njs/njs/njs.c:250:11
    #31 0x52da8f in njs_process_script /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_shell.c:611:11
    #32 0x52b7be in njs_interactive_shell /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_shell.c:417:9
    #33 0x52ac2c in main /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_shell.c:243:15
    #34 0x7f3f65e5882f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/../csu/libc-start.c:291
    #35 0x41b648 in _start (/home/xeioex/workspace/nginx/nginScript/njs/build/njs+0x41b648)

0x6030000018b1 is located 0 bytes to the right of 17-byte region [0x6030000018a0,0x6030000018b1)
allocated by thread T0 here:
    #0 0x4440d8 in __strdup (/home/xeioex/workspace/nginx/nginScript/njs/build/njs+0x4440d8)
    #1 0x7f3f66a5e624 in readline (/usr/lib/x86_64-linux-gnu/libedit.so.2+0x1d624)
    #2 0x52ac2c in main /home/xeioex/workspace/nginx/nginScript/njs/njs/njs_shell.c:243:15
    #3 0x7f3f65e5882f in __libc_start_main /build/glibc-Cl5G7W/glibc-2.23/csu/../csu/libc-start.c:291

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/xeioex/workspace/nginx/nginScript/njs/build/njs+0x4461c8) in __interceptor_memcpy.part.38
Shadow bytes around the buggy address:
  0x0c067fff82c0: fd fd fd fd fa fa 00 00 00 00 fa fa fd fd fd fd
  0x0c067fff82d0: fa fa 00 00 00 00 fa fa 00 00 00 00 fa fa fd fd
  0x0c067fff82e0: fd fd fa fa 00 00 00 00 fa fa fd fd fd fd fa fa
  0x0c067fff82f0: 00 00 00 00 fa fa 00 00 00 00 fa fa 00 00 00 00
  0x0c067fff8300: fa fa fd fd fd fd fa fa 00 00 00 00 fa fa fd fd
=>0x0c067fff8310: fd fa fa fa 00 00[01]fa fa fa 00 00 01 fa fa fa
  0x0c067fff8320: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8330: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8340: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8350: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8360: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==8190==ABORTING

Get it.

 var x = 2 ;

@xeioex

Can you try it again, I download the latest code, but it works well, be sure run make clean first.

@hongzhidao

Can you try it again, I download the latest code, but it works well, be sure run make clean first.

Yes.

./build/njs
interactive njs 0.3.0

v.<Tab> -> the properties and prototype methods of v.
type console.help() for more information

>> @
=================================================================
==25317==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000003b3 at pc 0x0000004461c9 bp 0x7ffdb7e4e950 sp 0x7ffdb7e4e100
READ of size 2030 at 0x6020000003b3 thread T0
    #0 0x4461c8 in __interceptor_memcpy.part.38 (/home/xeioex/workspace/nginx/nginScript/njs/build/njs+0x4461c8)
    #1 0x60b72e in nxt_vsprintf /home/xeioex/workspace/nginx/nginScript/njs/nxt/nxt_sprintf.c:429
...

Make sure you enable address-sanitiser
I use the following command:
make clean && CFLAGS='-O0 -DNXT_DEBUG_MEMORY=1 -fsanitize=address' ./configure && make test

@xeioex

It works well.

# This file is auto-generated by configure

NXT_CC =        cc
NXT_CFLAGS =     -pipe -fPIC -fvisibility=hidden -O -W -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -fstrict-aliasing -Wstrict-overflow=5 -Wmissing-prototypes -Werror -g -O0
-DNXT_DEBUG_MEMORY=1 -fsanitize=address

How about try the version before today?

@xeioex

I reproduce it. I think the issue is introduced by LL(k) patch. On it.

@hongzhidao

Was introduced by https://hg.nginx.org/njs/rev/8e2cb4da5e46

@hongzhidao

Was introduced by https://hg.nginx.org/njs/rev/8e2cb4da5e46

I found the reason, confirming.

@xeioex

See it first.

diff -r 4e82cb630c69 njs/njs_lexer.c
--- a/njs/njs_lexer.c   Sun Mar 10 22:25:59 2019 +0800
+++ b/njs/njs_lexer.c   Mon Mar 11 01:44:04 2019 +0800
@@ -558,6 +558,7 @@ njs_lexer_next_token(njs_lexer_t *lexer,

         case NJS_TOKEN_ILLEGAL:
         default:
+            lt->text.length = lexer->start - lt->text.start;
             lexer->start--;
             return token;

@xeioex

I think it's an early issue, and we does introducing another issue in LL(k).

diff -r 4e82cb630c69 njs/njs_lexer.c
--- a/njs/njs_lexer.c   Sun Mar 10 22:25:59 2019 +0800
+++ b/njs/njs_lexer.c   Mon Mar 11 01:50:59 2019 +0800
@@ -371,7 +371,7 @@ njs_lexer_token_push(njs_vm_t *vm, njs_l
 {
     njs_lexer_token_t  *lt;

-    lt = nxt_mp_alloc(vm->mem_pool, sizeof(njs_lexer_token_t));
+    lt = nxt_mp_zalloc(vm->mem_pool, sizeof(njs_lexer_token_t));
     if (nxt_slow_path(lt == NULL)) {
         return NULL;
     }
@@ -558,6 +558,7 @@ njs_lexer_next_token(njs_lexer_t *lexer,

         case NJS_TOKEN_ILLEGAL:
         default:
+            lt->text.length = lexer->start - lt->text.start;
             lexer->start--;
             return token;
         }
diff -r 4e82cb630c69 njs/test/njs_unit_test.c
--- a/njs/test/njs_unit_test.c  Sun Mar 10 22:25:59 2019 +0800
+++ b/njs/test/njs_unit_test.c  Mon Mar 11 01:50:59 2019 +0800
@@ -22,6 +22,9 @@ typedef struct {

 static njs_unit_test_t  njs_test[] =
 {
+    { nxt_string("@"),
+      nxt_string("SyntaxError: Unexpected token \"@\" in 1") },
+
     { nxt_string("}"),
       nxt_string("SyntaxError: Unexpected token \"}\" in 1") },

BTW.

+            lt->text.length = lexer->start - lt->text.start;
             lexer->start--;  /* Is this line redundant? */

If yes, we can make it simplier. I prefer to make the single character generic.

diff -r 4e82cb630c69 njs/njs_lexer.c
--- a/njs/njs_lexer.c   Sun Mar 10 22:25:59 2019 +0800
+++ b/njs/njs_lexer.c   Mon Mar 11 01:55:29 2019 +0800
@@ -542,24 +542,9 @@ njs_lexer_next_token(njs_lexer_t *lexer,

             /* Fall through. */

-        case NJS_TOKEN_BITWISE_NOT:
-        case NJS_TOKEN_OPEN_PARENTHESIS:
-        case NJS_TOKEN_CLOSE_PARENTHESIS:
-        case NJS_TOKEN_OPEN_BRACKET:
-        case NJS_TOKEN_CLOSE_BRACKET:
-        case NJS_TOKEN_OPEN_BRACE:
-        case NJS_TOKEN_CLOSE_BRACE:
-        case NJS_TOKEN_COMMA:
-        case NJS_TOKEN_COLON:
-        case NJS_TOKEN_SEMICOLON:
-        case NJS_TOKEN_CONDITIONAL:
+        default:
             lt->text.length = lexer->start - lt->text.start;
             return token;
-
-        case NJS_TOKEN_ILLEGAL:
-        default:
-            lexer->start--;
-            return token;
         }

@xeioex

My suggestion.

  1. Fix '@' issue.

  2. Make njs_parser_token() inline.

  3. Introduce njs_parser_peek_token().

-. Keep this unchanged since arguements can be assigned, it seems an issue.

if (token == NJS_TOKEN_ARGUMENTS || token ==  NJS_TOKEN_EVAL)

@hongzhidao

Welcome to add more tests.
The next step is supporting this, arguments, rest parameter.

Also, an arrow function is not a constructor:

> var f = () => {};
undefined
> new f()
TypeError: f is not a constructor

@xeioex

I'm wondering whether NJS only support strict mode? We don't need to consider non-strict mode in the future.

  1. In non-strict mode, arguments can be reassigned.
    https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/arguments

  2. Now we treat arguments as a special token. Alternative implementation. I deep into this.
    The arguments object is a local variable available within all non-arrow functions.

@xeioex

Take a look. Is this better?

diff -r 0e8ab0dfd08b njs/njs_generator.c
--- a/njs/njs_generator.c   Sun Mar 10 22:26:25 2019 +0300
+++ b/njs/njs_generator.c   Mon Mar 11 14:14:57 2019 +0800
@@ -579,8 +579,6 @@ njs_generate_arguments_object(njs_vm_t *
 {
     njs_vmcode_arguments_t  *gen;

-    generator->arguments_object = 1;
-
     node->index = njs_generate_object_dest_index(vm, generator, node);
     if (nxt_slow_path(node->index == NJS_INDEX_ERROR)) {
         return NXT_ERROR;
@@ -2312,7 +2310,7 @@ njs_generate_function_scope(njs_vm_t *vm
         lambda->closure_size = size;

         lambda->nesting = node->scope->nesting;
-        lambda->arguments_object = generator.arguments_object;
+        lambda->arguments_object = node->scope->arguments_object;

         lambda->start = generator.code_start;
         lambda->local_size = generator.scope_size;
diff -r 0e8ab0dfd08b njs/njs_generator.h
--- a/njs/njs_generator.h   Sun Mar 10 22:26:25 2019 +0300
+++ b/njs/njs_generator.h   Mon Mar 11 14:14:57 2019 +0800
@@ -24,7 +24,6 @@ struct njs_generator_s {

     /* Parsing Function() or eval(). */
     uint8_t                         runtime;           /* 1 bit */
-    uint8_t                         arguments_object;  /* 1 bit */
 };


diff -r 0e8ab0dfd08b njs/njs_parser.c
--- a/njs/njs_parser.c  Sun Mar 10 22:26:25 2019 +0300
+++ b/njs/njs_parser.c  Mon Mar 11 14:14:57 2019 +0800
@@ -2169,6 +2169,8 @@ njs_parser_reference(njs_vm_t *vm, njs_p
             return NULL;
         }

+        parser->scope->arguments_object = 1;
+
         break;

     case NJS_TOKEN_OBJECT_CONSTRUCTOR:
diff -r 0e8ab0dfd08b njs/njs_parser.h
--- a/njs/njs_parser.h  Sun Mar 10 22:26:25 2019 +0300
+++ b/njs/njs_parser.h  Mon Mar 11 14:14:57 2019 +0800
@@ -30,6 +30,7 @@ struct njs_parser_scope_s {
     njs_scope_t                     type:8;
     uint8_t                         nesting;     /* 4 bits */
     uint8_t                         argument_closures;
+    uint8_t                         arguments_object;
 };

@hongzhidao

I'm wondering whether NJS only support strict mode? We don't need to consider non-strict mode in the future.

yes, we target only strict mode, see http://nginx.org/en/docs/njs/index.html

@hongzhidao

Take a look. Is this better?

Looks better.

@xeioex

Now we treat arguments as a special token. Alternative implementation. I deep into this.
The arguments object is a local variable available within all non-arrow functions.

See demo first.

function foo(a, b, c) {
    var args = arguments;
    return () => {
        return arguments;
    }
}

var f = foo(1, 2, 3)  // arguments is bounded in current frame1->native.
f('a', 'b'); // frame2 can't get the frame1 information, right?

It seems we need regard arguments as a local variable like following.

function foo(a, b, c) {
    var args = arguments;
    return () => {
        return args;
    }
}

var f = foo(1, 2, 3)
f('a', 'b');

@xeioex

Make variable register easier.

# HG changeset patch
# User hongzhidao <[email protected]>
# Date 1552498622 -28800
# Node ID 1c02d9bcd9336004434d31477847f97201522a82
# Parent  dcc7965410bd168099dbbc8299ef5094e7e47445
Introduced njs_parser_local_variable.

diff -r dcc7965410bd -r 1c02d9bcd933 njs/njs_parser.c
--- a/njs/njs_parser.c  Tue Mar 12 19:28:11 2019 +0300
+++ b/njs/njs_parser.c  Thu Mar 14 01:37:02 2019 +0800
@@ -519,6 +519,7 @@ njs_parser_variable_add(njs_vm_t *vm, nj
                             njs_parser_key_hash(parser), type);
 }

+
 nxt_inline njs_ret_t
 njs_parser_variable_reference(njs_vm_t *vm, njs_parser_t *parser,
     njs_parser_node_t *node, njs_reference_type_t type)
@@ -529,6 +530,33 @@ njs_parser_variable_reference(njs_vm_t *
 }


+nxt_inline njs_parser_node_t *
+njs_parser_local_variable(njs_vm_t *vm, njs_parser_t *parser,
+    njs_variable_type_t type)
+{
+    nxt_int_t          ret;
+    njs_variable_t     *var;
+    njs_parser_node_t  *node;
+
+    var = njs_parser_variable_add(vm, parser, type);
+    if (nxt_slow_path(var == NULL)) {
+        return NULL;
+    }
+
+    node = njs_parser_node_new(vm, parser, NJS_TOKEN_NAME);
+    if (nxt_slow_path(node == NULL)) {
+        return NULL;
+    }
+
+    ret = njs_parser_variable_reference(vm, parser, node, NJS_DECLARATION);
+    if (nxt_slow_path(ret != NXT_OK)) {
+        return NULL;
+    }
+
+    return node;
+}
+
+
 static njs_token_t
 njs_parser_labelled_statement(njs_vm_t *vm, njs_parser_t *parser)
 {
@@ -1004,9 +1032,7 @@ njs_parser_return_statement(njs_vm_t *vm
 static njs_token_t
 njs_parser_var_statement(njs_vm_t *vm, njs_parser_t *parser)
 {
-    njs_ret_t          ret;
     njs_token_t        token;
-    njs_variable_t     *var;
     njs_parser_node_t  *left, *stmt, *name, *assign, *expr;

     parser->node = NULL;
@@ -1028,21 +1054,11 @@ njs_parser_var_statement(njs_vm_t *vm, n
             return NJS_TOKEN_ILLEGAL;
         }

-        var = njs_parser_variable_add(vm, parser, NJS_VARIABLE_VAR);
-        if (nxt_slow_path(var == NULL)) {
-            return NJS_TOKEN_ERROR;
-        }
-
-        name = njs_parser_node_new(vm, parser, NJS_TOKEN_NAME);
+        name = njs_parser_local_variable(vm, parser, NJS_VARIABLE_VAR);
         if (nxt_slow_path(name == NULL)) {
             return NJS_TOKEN_ERROR;
         }

-        ret = njs_parser_variable_reference(vm, parser, name, NJS_DECLARATION);
-        if (nxt_slow_path(ret != NXT_OK)) {
-            return NJS_TOKEN_ERROR;
-        }
-
         token = njs_parser_token(vm, parser);
         if (nxt_slow_path(token <= NJS_TOKEN_ILLEGAL)) {
             return token;
@@ -1456,9 +1472,7 @@ njs_parser_for_statement(njs_vm_t *vm, n
 static njs_token_t
 njs_parser_for_var_statement(njs_vm_t *vm, njs_parser_t *parser)
 {
-    njs_ret_t          ret;
     njs_token_t        token;
-    njs_variable_t     *var;
     njs_parser_node_t  *left, *stmt, *name, *assign, *expr;

     parser->node = NULL;
@@ -1480,21 +1494,11 @@ njs_parser_for_var_statement(njs_vm_t *v
             return NJS_TOKEN_ILLEGAL;
         }

-        var = njs_parser_variable_add(vm, parser, NJS_VARIABLE_VAR);
-        if (nxt_slow_path(var == NULL)) {
-            return NJS_TOKEN_ERROR;
-        }
-
-        name = njs_parser_node_new(vm, parser, NJS_TOKEN_NAME);
+        name = njs_parser_local_variable(vm, parser, NJS_VARIABLE_VAR);
         if (nxt_slow_path(name == NULL)) {
             return NJS_TOKEN_ERROR;
         }

-        ret = njs_parser_variable_reference(vm, parser, name, NJS_DECLARATION);
-        if (nxt_slow_path(ret != NXT_OK)) {
-            return NJS_TOKEN_ERROR;
-        }
-
         token = njs_parser_token(vm, parser);
         if (nxt_slow_path(token <= NJS_TOKEN_ILLEGAL)) {
             return token;
@@ -1693,7 +1697,6 @@ njs_parser_try_statement(njs_vm_t *vm, n
 {
     njs_ret_t          ret;
     njs_token_t        token;
-    njs_variable_t     *var;
     njs_parser_node_t  *node, *try, *catch;

     token = njs_parser_try_block(vm, parser);
@@ -1739,21 +1742,11 @@ njs_parser_try_statement(njs_vm_t *vm, n
             return NJS_TOKEN_ERROR;
         }

-        var = njs_parser_variable_add(vm, parser, NJS_VARIABLE_CATCH);
-        if (nxt_slow_path(var == NULL)) {
-            return NJS_TOKEN_ERROR;
-        }
-
-        node = njs_parser_node_new(vm, parser, NJS_TOKEN_NAME);
+        node = njs_parser_local_variable(vm, parser, NJS_VARIABLE_CATCH);
         if (nxt_slow_path(node == NULL)) {
             return NJS_TOKEN_ERROR;
         }

-        ret = njs_parser_variable_reference(vm, parser, node, NJS_DECLARATION);
-        if (nxt_slow_path(ret != NXT_OK)) {
-            return NJS_TOKEN_ERROR;
-        }
-
         catch->left = node;

         token = njs_parser_token(vm, parser);

Don't merge it, I'll try to put it with arrow function together.

It seems we need regard arguments as a local variable like following.
njs_parser_local_variable(njs_vm_t *vm, njs_parser_t *parser, njs_variable_type_t type)

If yes, maybe we need add two parameters: name and hash.

njs_parser_local_variable(vm, parser, name, hash, type)

BTW, what about njs_parser_variable_node?

@hongzhidao

It seems we need regard arguments as a local variable like following.

We can still regard arguments as a special token. We just need to find the first proper (non-arrow) function frame here:
https://github.com/nginx/njs/blob/master/njs/njs_vm.c#L419

@xeioex

https://github.com/nginx/njs/blob/master/njs/njs_vm.c#L419

I try to regard arguments as a local variable.

  1. In non-strict mode, arguments can be reassigned.
  2. arguments may become closure. If we use a special token, I think it becomes harder.

See demo. I use newly token args.

function foo(a, b, c) {
    return () => {
        return args;
    }
}

var s = foo(1, 2, 3)('a', 'b');
console.log(s);

First draft. Still rough.

https://gist.github.com/hongzhidao/b29fcf41939a39adc72c638a3bb1fa69

@hongzhidao

I try to regard arguments as a local variable.

Agree. Looks promising.

...
if (!lambda->arrow_function) {
+        node = njs_parser_variable_node(vm, parser, &name, hash,
+                                        NJS_VARIABLE_VAR);

Do we really need to create arguments object unconditionally for all non-arrow functions?
Ideally, we only have to do it, if we have a arguments reference from inclosed arrow-functions.

name = nxt_string_value("args");

why not arguments itself?

@hongzhidao

I try to regard arguments as a local variable.

Agree. Looks promising.

...
if (!lambda->arrow_function) {
+        node = njs_parser_variable_node(vm, parser, &name, hash,
+                                        NJS_VARIABLE_VAR);

Do we really need to create arguments object unconditionally for all non-arrow functions?
Ideally, we only have to do it, if we have a arguments reference from inclosed arrow-functions.

name = nxt_string_value("args");

why not arguments itself?

Just for testing, will use arguments.

@xeioex

Do we really need to create arguments object unconditionally for all non-arrow functions?
Ideally, we only have to do it, if we have an arguments reference from inclosed arrow-functions.

OK

@xeioex

  1. Style.
    https://gist.github.com/hongzhidao/a34396f68e9f5da3e0de3350cfa9ee08

  2. If we regard arguments as a variable not a special token, I faced a problem in njs_parser_assignment_expression.

/* How to check whether the token is arguments? */
if (!njs_parser_is_lvalue(parser->node)) {
            token = parser->node->token;

            if (token == NJS_TOKEN_ARGUMENTS || token == NJS_TOKEN_EVAL) {
                njs_parser_syntax_error(vm, parser, "Identifier \"%s\" "
                                        "is forbidden as left-hand in assignment",
                                        (token == NJS_TOKEN_EVAL) ? "eval"
                                                                 : "arguments");
            } else {
                njs_parser_ref_error(vm, parser,
                                     "Invalid left-hand side in assignment");
            }

            return NJS_TOKEN_ILLEGAL;
        }

What about adding a new field token_name in njs_parser_node_t?

diff -r aa48701230f0 njs/njs_parser.h
--- a/njs/njs_parser.h  Fri Mar 15 01:15:56 2019 +0800
+++ b/njs/njs_parser.h  Fri Mar 15 01:31:46 2019 +0800
@@ -39,6 +39,7 @@ struct njs_parser_node_s {
     uint8_t                         ctor:1;
     uint8_t                         temporary;    /* 1 bit  */
     uint32_t                        token_line;
+    nxt_str_t                       token_name;

@hongzhidao

What about using NJS_TOKEN_ARGUMENTS as internal token (in parser only)? We can still have arguments as local variable.

deleted.

@hongzhidao

What about using NJS_TOKEN_ARGUMENTS as internal token (in parser only)? We can still have arguments as local variable.

  1. Take a look.
    https://gist.github.com/hongzhidao/bdeebc138b51f1cd10c33d8e36c11027

  2. Better naming for njs_generate_argument_closures.

  3. What about setting lambda->arguments_object true always?

if (lambda->arguments_object) {
        ret = njs_function_arguments_object_init(vm, &frame->native);
        if (nxt_slow_path(ret != NXT_OK)) {
            return NXT_ERROR;
        }
    }

@hongzhidao

What about setting lambda->arguments_object true always?

It looks like extra work, most of the time not needed.

BTW, arguments is mostly outdated mechanism for for args passing. Currently it is mostly used for compatibility and for test262 test suite (where it is heavy used). rest parameters syntax is ES6 variant.

In real usecases it is rarely used. So I would prefer to optimize arguments as much as possible.

test262

OK, I think we also be able to optimize arguments including arrow funciton.

{ nxt_string("function f(a, b, c){a = 4; return () => { return arguments[0]}};"
                 "f(1,2,3)('a', 'b')"),
      nxt_string("1") },

@xeioex

I plan to separate some patches from arrow function patch tomorrow, they are ready to commit.

  1. style.
  2. njs_parser_variable_node.
  3. regard arguments as variable.

Continue to do these next week. Too busy today. :)

@xeioex

  1. style.
  2. Improved arguments.
    https://gist.github.com/hongzhidao/c36fde1b0f1f6941d90a371dcc9a0e37

  3. Initial arrow function support.
    https://gist.github.com/hongzhidao/b29fcf41939a39adc72c638a3bb1fa69

The key point is njs_function_arguments_object_init.
Both the following are the same.
https://gist.github.com/hongzhidao/c36fde1b0f1f6941d90a371dcc9a0e37#file-commit-03-16-patch-L218
https://gist.github.com/hongzhidao/b29fcf41939a39adc72c638a3bb1fa69#file-arrow-function-arguments-patch-L15

@drsm welcome to test, now only support arguments.
this, rest-parameter are not supported yet.

@xeioex

There is no difference with rest-parameter in behavior between ordinary functions and arrow functions. Right?

@hongzhidao

There is no difference with rest-parameter in behavior between ordinary functions and arrow functions. Right?

Yes, according to should be the same.

there is a problem:

//node
> var f = (...c) => { return (function() { return arguments; }).bind(null, c); };
undefined
> var x = f(1,'a',false, {});
undefined
> x
[Function: bound ]
> x()
[Arguments] { '0': [ 1, 'a', false, {} ] }
> x(1,2,3)
[Arguments] { '0': [ 1, 'a', false, {} ], '1': 1, '2': 2, '3': 3 }

vs.

//njs
>> var f = (...c) => { return (function() { return arguments; }).bind(null, c); };
undefined
>> var x = f(1,'a',false, {});
undefined
>> x()
{

}
>> x(1,2,3)
{
 0: [
  1,
  'a',
  false,
  {

  }
 ],
 1: 1,
 2: 2
}
>>

also:
```js

var x = function(...c) { return () => { return [c, arguments]; }; };
undefined
x(1,2,3)()
[
[
1,
2,
3
],
{
0: [
1,
2,
3
],
1: 2,
2: 3
}
]
var z = function(...c) { var args = arguments; return function() { return [c, args]; }; };
undefined
z(1,2,3)()
[
[
1,
2,
3
],
{
0: [
1,
2,
3
],
1: 2,
2: 3
}
]

@xeioex

Patches are ready. Take a look, please.
https://gist.github.com/hongzhidao/d58327f9305f586dc2c11c5e48856bf3

  1. Fixed njs_parser_peek_token().
  2. Style.
  3. Improved njs_function_alloc().

Other patches.

  1. Improved lambda arguments. (done)
    https://gist.github.com/hongzhidao/9e0b38c9a55312fc4f30ed9b7240d8c1
  2. A reference to arrow function. (WIP)
    https://gist.github.com/hongzhidao/947f9c4d1a889fabf2a632ed0883f7f7

@drsm

Also, an arrow function is not a constructor:

Will fix based Improved njs_function_alloc(). patch.

@hongzhidao

Fixed njs_parser_peek_token().

This breaks the following test:

var a=1,b=2,c=3;                                                                                                                 
a=b                                                                                                                              
++c                                                                                                                              

if (a!==b) {throw Error('#1: Automatic semicolon insertion not work with ++')}

Grammar rules (Automatic Semicolon Insertion):

The following are the only restricted productions in the grammar:

PostfixExpression :
    LeftHandSideExpression [no LineTerminator here] ++
    LeftHandSideExpression [no LineTerminator here] --
ContinueStatement :
    continue [no LineTerminator here] Identifier ;
BreakStatement :
    break [no LineTerminator here] Identifier ;
ReturnStatement :
    return [no LineTerminator here] Expression ;
ThrowStatement :
    throw [no LineTerminator here] Expression ;

The practical effect of these restricted productions is as follows:

When a ++ or -- token is encountered where the parser would treat it as a postfix operator, and at least one LineTerminator occurred between the preceding token and the ++ or -- token, then a semicolon is automatically inserted before the ++ or -- token.

When a continue, break, return, or throw token is encountered and a LineTerminator is encountered before the next token, a semicolon is automatically inserted after the continue, break, return, or throw token.

[no LineTerminator here] - for njs, this means that in places described above, we should not use njs_parser_token() because it silently consumes newline characters. Instead njs_lexer_token() is used.

@xeioex

  1. For the code.
: 1

It works well both in nodejs and repl.it, but breaks in njs.

  1. For njs_parser_peek_token, compared to the implementation before introduced LL(k).
njs_lexer_peek_token(njs_lexer_t *lexer)
{
    u_char       *start;
    njs_token_t  token;

    start = lexer->start;

    while (start < lexer->end) {
        token = njs_tokens[*start++];

        switch (token) {
        case NJS_TOKEN_SPACE:
        case NJS_TOKEN_LINE_END:
            continue;

        default:
            return token;
        }
    }

    return NJS_TOKEN_END;
}

newline is consumed. But we changed its function after LL(k).
I think njs_parser_peek_token need to be improved.

  1. njs_lexer_rollback is also changed after LL(K).

  2. This breaks the following test:

On it.

@xeioex

Take a look again. Fixed peek token.
https://gist.github.com/hongzhidao/beda1da2c53799b88944f55dc60d1bc7

And I prefer to improve this for making prev_token correct at anytime.

diff -r 16211f089f5e njs/njs_lexer.c
--- a/njs/njs_lexer.c   Sun Mar 17 20:34:18 2019 +0800
+++ b/njs/njs_lexer.c   Sun Mar 17 20:35:59 2019 +0800
@@ -404,7 +404,7 @@ njs_lexer_token_name_resolve(njs_lexer_t

         if (lexer->property) {
             lexer->property_token = lt->token;
-            return NJS_TOKEN_NAME;
+            lt->token = NJS_TOKEN_NAME;
         }
     }

Here are the other patches.
https://gist.github.com/hongzhidao/d58327f9305f586dc2c11c5e48856bf3
Style. Improved njs_function_alloc(). Improved lambda arguments.

A reference to arrow function. (including avoiding constructor)
https://gist.github.com/hongzhidao/947f9c4d1a889fabf2a632ed0883f7f7

@hongzhidao

Style.

Committed with minor changes. Like in https://gist.github.com/hongzhidao/d58327f9305f586dc2c11c5e48856bf3#file-njs-03-17-patch-L71, (maximum text width 80 characters is violated, see)

@xeioex

I'm still not sure where to process this with arrow function.

The first try.
https://gist.github.com/hongzhidao/947f9c4d1a889fabf2a632ed0883f7f7#file-arrow-function-03-17-diff-L18

  1. Does native_frame->arguments[0] always point to this?
  2. Do native_frame->arguments[1..n] always point toarguments`?
  3. Process this of arrow function in njs_function_lambda_frame, right?

Suggestions are welcome.

Demo.

var adder = {
  base: 1,

  add: function(a) {
    var f = v => v + this.base;
    return f(a);
  },
};

console.log(adder.add(1));

Maybe we need to regard this as a local variable like arguments.
I prefer the idea of regarding this as a local variable, the same as arguments.
Will try it next week.

@xeioex take a look.

Improved lambda arguments and this objects.
https://gist.github.com/hongzhidao/e9af73580ef3e780ad26521472f326dd

My thoughts.
native_frame->arguments is the most important value. It reserves this and arguments objects in memory. And the keywords this and arguments are just the accessed way in js code.
For this, its memory position (NJS_INDEX_THIS) is calculated at first, it's handy of course.
But if we regard this as a local variable, it's more flexible in parser phase (included arrow function).

So I think both access way of special token and variable are ok, we didn't change its inner structure which is native_frame->arguments. The same is to arguments.

Other patches.
https://gist.github.com/hongzhidao/d58327f9305f586dc2c11c5e48856bf3

  1. Improved njs_function_alloc(). (ready)
  2. A reference to arrow function.

passed.

var adder = {
  base: 1,

  add: function(a) {
    var f = v => v + this.base;
    return f(a);
  },

  addThruCall: function(a) {
    var f = v => v + this.base;
    var b = {
      base: 2
    };

    return f.call(b, a);
  }
};

console.log(adder.add(1));         // This would log to 2
console.log(adder.addThruCall(1)); // This would log to 2 still

I'm on bind, apply, call.

@drsm @xeioex

Take a look.

function foo(a, b) {
  return arguments;
}

var s = foo.bind(null, 1)('a', 'b', 'c')
console.log(s);
// repl.it
[Arguments] { '0': 1, '1': 'a', '2': 'b', '3': 'c' }
// In njs
{0:1,1:'a',2:'b'}

Is it an issue in njs?

If yes, here's the patch.

diff -r 9a87bf103c47 njs/njs_function.c
--- a/njs/njs_function.c    Sun Mar 17 21:22:30 2019 +0800
+++ b/njs/njs_function.c    Mon Mar 18 15:35:41 2019 +0800
@@ -311,6 +311,8 @@ njs_function_lambda_frame(njs_vm_t *vm,
     } else {
         n = function->args_offset;

+        native_frame->nargs += (n - 1);
+
         do {
             *value++ = *bound++;
             n--;

But I prefer the patch.

diff -r 9a87bf103c47 njs/njs_function.c
--- a/njs/njs_function.c    Sun Mar 17 21:22:30 2019 +0800
+++ b/njs/njs_function.c    Mon Mar 18 15:39:08 2019 +0800
@@ -272,8 +272,8 @@ njs_function_lambda_frame(njs_vm_t *vm,
     nxt_bool_t ctor)
 {
     size_t                 size;
-    nxt_uint_t             n, max_args, closures;
-    njs_value_t            *value, *bound;
+    nxt_uint_t             i, n, max_args, closures;
+    njs_value_t            *value;
     njs_frame_t            *frame;
     njs_native_frame_t     *native_frame;
     njs_function_lambda_t  *lambda;
@@ -303,18 +303,17 @@ njs_function_lambda_frame(njs_vm_t *vm,
                              njs_frame_size(closures));
     native_frame->arguments = value;

-    bound = function->bound;
-
-    if (bound == NULL) {
+    if (function->bound == NULL) {
         *value++ = *this;

     } else {
         n = function->args_offset;

-        do {
-            *value++ = *bound++;
-            n--;
-        } while (n != 0);
+        for (i = 0; i < n; i++) {
+            *value++ = function->bound[i];
+        }
+
+        native_frame->nargs += (n - 1);
     }

     vm->scopes[NJS_SCOPE_CALLEE_ARGUMENTS] = value;

Finally, think of njs_function_native_frame toghther.

@hongzhidao

Is it an issue in njs?

Yes, it looks like this is an issue.

Improved njs_function_alloc(). (ready)

BTW, it breaks some tests in test262. On it.

var obj = {};                                                                                                                    
var props = function() {};                                                                                                       

Object.defineProperty(props, "prop", {                                                                                           
  value: {                                                                                                                       
    value: 7                                                                                                                     
  },                                                                                                                             
  enumerable: true                                                                                                               
});                                                                                                                              

Object.defineProperties(obj, props);                                                                                             

assert(obj.hasOwnProperty("prop"), 'obj.hasOwnProperty("prop") !== true');                                                       
assert.sameValue(obj.prop, 7, 'obj.prop'); 

@xeioex @drsm

Here's the final patch of arrow function.
https://gist.github.com/hongzhidao/947f9c4d1a889fabf2a632ed0883f7f7

Welcome to test.

@xeioex

Fixed native frame arguments number. (including unit tests)
https://gist.github.com/hongzhidao/e3afb2f5be94f5993bb21d7e6c5bcb14

@hongzhidao

https://gist.github.com/hongzhidao/e3afb2f5be94f5993bb21d7e6c5bcb14

Thanks for the patch, it fixes extra 13 test.

while loop -> for loop

Style and minor performance improvement: @igorsysoev prefer comparing a value with zero in a loop condition. See this (chapter: Loop termination) for rationale.

bound = function->bound;

See this (chapter: Pointer chains) for rationale.

I am not sure whether there is any difference for current processors (I guess cache and memory delays eat many times more time) for both cases.

native_frame->nargs += (n - 1)

BTW, cannot we get underflow here?

@xeioex

BTW, cannot we get underflow here?

Cannot.
If bound is not null, the args_offset is greater than 1, and since it includes this object, so we minus 1.

writing-efficient-c-and-c-code-optimization

Thanks so much, it's helpful.

@xeioex

it fixes extra 13 test.

Can you show some of them?

@xeioex

Help confirm these, they are important.

Is frame->arguments[0] this object?
Are frame->arguments[1..n] arguments objects?

@hongzhidao

Can you show some of them?

-  built-ins/Function/prototype/bind/15.3.4.5-3-1 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.1-4-13 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.1-4-14 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.1-4-15 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.1-4-7 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.1-4-9 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.2-4-1 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.2-4-12 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.2-4-13 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.2-4-14 in strict mode
-  built-ins/Function/prototype/bind/15.3.4.5.2-4-6 in strict mode

You can see them yourself, using:
https://gist.github.com/xeioex/3cd6b4a5bd4ed856b753645b063ae282

To use, run before patch (-q is important to get meaningful diff)

LOG=test262_njs_0.3.0_01.log;  rm -fr /tmp/core.15*; python ./src/test262.py --command='<full njs path> -q' --tests=./test262/  --strict_only --summary --logname $LOG; grep Summary -A 4 $LOG

After patch

LOG=test262_njs_0.3.0_02.log;  rm -fr /tmp/core.15*; python ./src/test262.py --command='<full njs path> -q' --tests=./test262/  --strict_only --summary --logname $LOG; grep Summary -A 4 $LOG

See the difference:

colordiff -u test262_njs_0.3.0_01.log test262_njs_0.3.0_02.log

to get a test content

TEST=$(find ./test262/ | grep 'built-ins/Function/prototype/bind/15.3.4.5-21-3'); cat test262/harness/assert.js test262/harness/sta.js   $TEST > test.js

@hongzhidao

Help confirm these, they are important.
Is frame->arguments[0] this object?

Yes, zero argument is always this.

Are frame->arguments[1..n] arguments objects?

Yes.

@hongzhidao

Here's the final patch of arrow function.
https://gist.github.com/hongzhidao/947f9c4d1a889fabf2a632ed0883f7f7

Sorry, I am a bit busy lately with other patches (htttp module reported crashes, htttp module refactoring, the rest of your patches). I will take a look once I am done.

@xeioex

htttp module refactoring

Do you think it's easy to add access phase? I think it's really handy.
It's common to do rewrite url and reassign variable before the content phase.
I don't know whether we can achieve it by the way of auth_request + js_content.

@hongzhidao

I don't know whether we can achieve it by the way of auth_request + js_content.

yes, you can use auth_request_set

@xeioex

Here are the patches of arrow function.
https://gist.github.com/hongzhidao/f9f5334946286709170b42a3b19a493c

@drsm welcome to test.

@hongzhidao

Here are the patches of arrow function.
https://gist.github.com/hongzhidao/f9f5334946286709170b42a3b19a493c

On it.

@hongzhidao

Here are the patches of arrow function.
https://gist.github.com/hongzhidao/f9f5334946286709170b42a3b19a493c

On it.

BTW, I'm busy this week. So I focus on this feature this week and will continue template literal next week :)

Updated patch series, includes recent njs repo changes + small improvements.

https://gist.github.com/xeioex/1231eb0a0166ae5810c610fda7f3afde

Some found issues:

>>  var o = {f1: () => 1, f: function() {}}
SyntaxError: Unexpected token ":" in shell:1

@xeioex

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions#Use_of_prototype_property

Here's the patch.

diff -r f41e47c8ac96 njs/njs_function.c
--- a/njs/njs_function.c    Tue Mar 26 00:42:39 2019 +0800
+++ b/njs/njs_function.c    Tue Apr 02 09:46:30 2019 +0800
@@ -40,12 +40,15 @@ njs_function_alloc(njs_vm_t *vm, njs_fun
     function->args_offset = 1;
     function->u.lambda = lambda;

-    function->object.shared_hash = vm->shared->function_prototype_hash;
     function->object.__proto__ = &vm->prototypes[NJS_PROTOTYPE_FUNCTION].object;
     function->object.type = NJS_FUNCTION;
     function->object.shared = shared;
     function->object.extensible = 1;

+    if (!lambda->arrow) {
+        function->object.shared_hash = vm->shared->function_prototype_hash;
+    }
+
     if (nesting != 0 && closures != NULL) {
         function->closure = 1;

diff -r f41e47c8ac96 njs/test/njs_unit_test.c
--- a/njs/test/njs_unit_test.c  Tue Mar 26 00:42:39 2019 +0800
+++ b/njs/test/njs_unit_test.c  Tue Apr 02 09:46:30 2019 +0800
@@ -6566,6 +6566,9 @@ static njs_unit_test_t  njs_test[] =
     { nxt_string("var f = ()=>{}; f()"),
       nxt_string("undefined") },

+    { nxt_string("var f = () => 1; f.prototype"),
+      nxt_string("undefined") },
+
     { nxt_string("var f = ()=>({}); f()"),
       nxt_string("[object Object]") },

Others look good.

@hongzhidao

Here's the patch.

Yes. I noticed it too. Thanks.

https://gist.github.com/xeioex/1231eb0a0166ae5810c610fda7f3afde

BTW, I am not happy that arguments is generated even it is not used (because most of the times it is not used at all in the real code). Thinking how to do it on demand. The idea is to create arguments_object on demand in njs_vmcode_arguments instruction. njs_vmcode_arguments is generated at the moment arguments is referenced.
To handle arrow functions, njs_vmcode_arguments should scan the stack upward in search of non-arrow functions.

deleted.

@xeioex

Take a look. I did it in another way, the same as this.

diff -r f41e47c8ac96 njs/njs_function.c
--- a/njs/njs_function.c    Tue Mar 26 00:42:39 2019 +0800
+++ b/njs/njs_function.c    Tue Apr 02 23:24:17 2019 +0800
@@ -40,12 +40,15 @@ njs_function_alloc(njs_vm_t *vm, njs_fun
     function->args_offset = 1;
     function->u.lambda = lambda;

-    function->object.shared_hash = vm->shared->function_prototype_hash;
     function->object.__proto__ = &vm->prototypes[NJS_PROTOTYPE_FUNCTION].object;
     function->object.type = NJS_FUNCTION;
     function->object.shared = shared;
     function->object.extensible = 1;

+    if (!lambda->arrow) {
+        function->object.shared_hash = vm->shared->function_prototype_hash;
+    }
+
     if (nesting != 0 && closures != NULL) {
         function->closure = 1;

diff -r f41e47c8ac96 njs/njs_parser.c
--- a/njs/njs_parser.c  Tue Mar 26 00:42:39 2019 +0800
+++ b/njs/njs_parser.c  Tue Apr 02 23:24:17 2019 +0800
@@ -493,8 +493,7 @@ njs_parser_block(njs_vm_t *vm, njs_parse

 nxt_inline njs_parser_node_t *
 njs_parser_variable_node(njs_vm_t *vm, njs_parser_t *parser, nxt_str_t *name,
-    uint32_t hash, njs_variable_type_t type, uint8_t arguments_object,
-    uint8_t this_object)
+    uint32_t hash, njs_variable_type_t type)
 {
     nxt_int_t          ret;
     njs_variable_t     *var;
@@ -505,9 +504,6 @@ njs_parser_variable_node(njs_vm_t *vm, n
         return NULL;
     }

-    var->this_object = this_object;
-    var->arguments_object = arguments_object;
-
     node = njs_parser_node_new(vm, parser, NJS_TOKEN_NAME);
     if (nxt_slow_path(node == NULL)) {
         return NULL;
@@ -751,11 +747,8 @@ static njs_token_t
 njs_parser_function_lambda(njs_vm_t *vm, njs_parser_t *parser,
     njs_function_lambda_t *lambda, njs_token_t token)
 {
-    njs_ret_t          ret;
-    nxt_str_t          name;
-    uint32_t           hash;
-    njs_index_t        index;
-    njs_parser_node_t  *node;
+    njs_ret_t    ret;
+    njs_index_t  index;

     ret = njs_parser_scope_begin(vm, parser, NJS_SCOPE_FUNCTION);
     if (nxt_slow_path(ret != NXT_OK)) {
@@ -772,24 +765,6 @@ njs_parser_function_lambda(njs_vm_t *vm,
         return token;
     }

-    name = nxt_string_value("this");
-    hash = nxt_djb_hash(name.start, name.length);
-
-    node = njs_parser_variable_node(vm, parser, &name, hash,
-                                    NJS_VARIABLE_VAR, 0, 1);
-    if (nxt_slow_path(node == NULL)) {
-        return NJS_TOKEN_ERROR;
-    }
-
-    name = nxt_string_value("arguments");
-    hash = nxt_djb_hash(name.start, name.length);
-
-    node = njs_parser_variable_node(vm, parser, &name, hash,
-                                    NJS_VARIABLE_VAR, 1, 0);
-    if (nxt_slow_path(node == NULL)) {
-        return NJS_TOKEN_ERROR;
-    }
-
     token = njs_parser_lambda_body(vm, parser, token);
     if (nxt_slow_path(token <= NJS_TOKEN_ILLEGAL)) {
         return token;
@@ -1061,7 +1036,7 @@ njs_parser_var_statement(njs_vm_t *vm, n
         name = njs_parser_variable_node(vm, parser,
                                         njs_parser_text(parser),
                                         njs_parser_key_hash(parser),
-                                        NJS_VARIABLE_VAR, 0, 0);
+                                        NJS_VARIABLE_VAR);
         if (nxt_slow_path(name == NULL)) {
             return NJS_TOKEN_ERROR;
         }
@@ -1504,7 +1479,7 @@ njs_parser_for_var_statement(njs_vm_t *v
         name = njs_parser_variable_node(vm, parser,
                                         njs_parser_text(parser),
                                         njs_parser_key_hash(parser),
-                                        NJS_VARIABLE_VAR, 0, 0);
+                                        NJS_VARIABLE_VAR);
         if (nxt_slow_path(name == NULL)) {
             return NJS_TOKEN_ERROR;
         }
@@ -1755,7 +1730,7 @@ njs_parser_try_statement(njs_vm_t *vm, n
         node = njs_parser_variable_node(vm, parser,
                                         njs_parser_text(parser),
                                         njs_parser_key_hash(parser),
-                                        NJS_VARIABLE_CATCH, 0, 0);
+                                        NJS_VARIABLE_CATCH);
         if (nxt_slow_path(node == NULL)) {
             return NJS_TOKEN_ERROR;
         }
@@ -1908,7 +1883,7 @@ njs_parser_import_statement(njs_vm_t *vm
     name = njs_parser_variable_node(vm, parser,
                                     njs_parser_text(parser),
                                     njs_parser_key_hash(parser),
-                                    NJS_VARIABLE_VAR, 0, 0);
+                                    NJS_VARIABLE_VAR);
     if (nxt_slow_path(name == NULL)) {
         return NJS_TOKEN_ERROR;
     }
@@ -2279,9 +2254,9 @@ njs_token_t
 njs_parser_arrow_expression(njs_vm_t *vm, njs_parser_t *parser,
     njs_token_t token)
 {
-    njs_ret_t               ret;
-    njs_index_t             index;
-    njs_parser_node_t       *node, *body, *parent;
+    njs_ret_t              ret;
+    njs_index_t            index;
+    njs_parser_node_t      *node, *body, *parent;
     njs_function_lambda_t  *lambda;

     node = njs_parser_node_new(vm, parser, NJS_TOKEN_FUNCTION_EXPRESSION);
@@ -2306,6 +2281,8 @@ njs_parser_arrow_expression(njs_vm_t *vm
         return NJS_TOKEN_ERROR;
     }

+    parser->scope->arrow_function = 1;
+
     index = NJS_SCOPE_ARGUMENTS;

     /* A "this" reservation. */
diff -r f41e47c8ac96 njs/njs_parser.h
--- a/njs/njs_parser.h  Tue Mar 26 00:42:39 2019 +0800
+++ b/njs/njs_parser.h  Tue Apr 02 23:24:17 2019 +0800
@@ -32,6 +32,7 @@ struct njs_parser_scope_s {
     uint8_t                         nesting;     /* 4 bits */
     uint8_t                         argument_closures;
     uint8_t                         module;
+    uint8_t                         arrow_function;
 };


@@ -214,6 +215,63 @@ njs_parser_variable_reference(njs_vm_t *
 }


+nxt_inline njs_variable_t *
+njs_parser_function_variable(njs_vm_t *vm, njs_parser_t *parser,
+    nxt_str_t *name)
+{
+    uint32_t            hash;
+    njs_parser_scope_t  *scope;
+
+    scope = parser->scope;
+
+    while (scope->type == NJS_SCOPE_FUNCTION && scope->arrow_function) {
+        scope = scope->parent;
+    }
+
+    hash = nxt_djb_hash(name->start, name->length);
+
+    return njs_variable_add(vm, scope, name, hash, NJS_VARIABLE_VAR);
+}
+
+
+nxt_inline nxt_int_t
+njs_parser_variable_this(njs_vm_t *vm, njs_parser_t *parser)
+{
+    nxt_str_t       name;
+    njs_variable_t  *var;
+
+    name = nxt_string_value("this");
+
+    var = njs_parser_function_variable(vm, parser, &name);
+    if (nxt_slow_path(var == NULL)) {
+        return NXT_ERROR;
+    }
+
+    var->this_object = 1;
+
+    return NXT_OK;
+}
+
+
+nxt_inline nxt_int_t
+njs_parser_variable_arguments(njs_vm_t *vm, njs_parser_t *parser)
+{
+    nxt_str_t       name;
+    njs_variable_t  *var;
+
+    name = nxt_string_value("arguments");
+
+    var = njs_parser_function_variable(vm, parser, &name);
+    if (nxt_slow_path(var == NULL)) {
+        return NXT_ERROR;
+    }
+
+    var->arguments_object = 1;
+
+    return NXT_OK;
+}
+
+
 nxt_inline njs_parser_scope_t *
 njs_parser_global_scope(njs_vm_t *vm)
 {
diff -r f41e47c8ac96 njs/njs_parser_terminal.c
--- a/njs/njs_parser_terminal.c Tue Mar 26 00:42:39 2019 +0800
+++ b/njs/njs_parser_terminal.c Tue Apr 02 23:24:17 2019 +0800
@@ -253,6 +253,11 @@ njs_parser_reference(njs_vm_t *vm, njs_p
                 return NULL;
             }

+            ret = njs_parser_variable_this(vm, parser);
+            if (nxt_slow_path(ret != NXT_OK)) {
+                return NULL;
+            }
+
             break;
         }

@@ -375,6 +380,11 @@ njs_parser_reference(njs_vm_t *vm, njs_p
             return NULL;
         }

+        ret = njs_parser_variable_arguments(vm, parser);
+        if (nxt_slow_path(ret != NXT_OK)) {
+            return NULL;
+        }
+
         break;

     case NJS_TOKEN_NAME:
diff -r f41e47c8ac96 njs/test/njs_unit_test.c
--- a/njs/test/njs_unit_test.c  Tue Mar 26 00:42:39 2019 +0800
+++ b/njs/test/njs_unit_test.c  Tue Apr 02 23:24:17 2019 +0800
@@ -6566,6 +6566,9 @@ static njs_unit_test_t  njs_test[] =
     { nxt_string("var f = ()=>{}; f()"),
       nxt_string("undefined") },

+    { nxt_string("var f = () => 1; f.prototype"),
+      nxt_string("undefined") },
+
     { nxt_string("var f = ()=>({}); f()"),
       nxt_string("[object Object]") },

@hongzhidao

Agree. Looks better.

@hongzhidao

Updated patch series: https://gist.github.com/xeioex/c627bd9d14ac2d1c555a9cd3573189d9
Looks much better now.

if (!lambda->arrow) {
+        function->object.shared_hash = vm->shared->function_prototype_hash;
+    }

BTW, does not look like a right solution. Left as is until tomorrow.

2) I am still also not sure that njs_parser_arrow_expression() should be in njs_parser_terminal().
3) It looks like in njs_vmcode_arguments() we still have to scan the stack up until non arrow function.

@xeioex

A bit rework.
https://gist.github.com/hongzhidao/b45bca2b4bf5b082f8ce4871b5053352

I am still also not sure that njs_parser_arrow_expression() should be in njs_parser_terminal().

I think it's ok, we need to differentiate ( and arrow function, and obviously ( should be used in njs_parser_terminal(). Besides, I think arrow expression is a terminal token.

It looks like in njs_vmcode_arguments() we still have to scan the stack up until non arrow function.

It's better to build a unit test.

@xeioex

There are some separate patches about refactoring which are definite.

  1. Refactored njs_parser_object() and njs_parser_array().
  2. Refactored njs_parser_call_expression().
    https://gist.github.com/hongzhidao/f4da760f9bbb0f9f912bf478bcf872db

@hongzhidao

https://gist.github.com/a923318ea09ff0f8ee710f18796727d3

This a draft change. I am thinking how to avoid THIS instruction for normal this usage (in non-arrow functions). Also, I removed global this, as suggested in #115.

Updated patches: https://gist.github.com/9e3a7d9173309cf38f93e0488140380a
Still on arrow + this tests.

What is left:
1) a good set of arrow + this tests
2) parser issue

>>  var o = {f1: () => 1, f: function() {}}
SyntaxError: Unexpected token ":" in shell:1

3) arrow.prototype

@xeioex

parser issue

diff -r a82f8c642c68 njs/njs_parser.c
--- a/njs/njs_parser.c  Tue Mar 26 00:59:26 2019 +0800
+++ b/njs/njs_parser.c  Sat Apr 06 02:17:19 2019 +0800
@@ -2243,7 +2243,7 @@ njs_parser_arrow_expression(njs_vm_t *vm
     } else {
         parent = parser->node;

-        token = njs_parser_expression(vm, parser, token);
+        token = njs_parser_assignment_expression(vm, parser, token);
         if (nxt_slow_path(token <= NJS_TOKEN_ILLEGAL)) {
             return token;
         }

This object as a variable.
1) non-local this is introduced to support arrow functions.
this is non-local when reference scope != first non-arrow
function scope.
2) making global this undefined in accordance with ES6:15.2.
Treating main file a module code.

I can't get it. Can you show corresponding examples?
It seems that non-local this is a closure variable.
Is the main purpose to solve the below case introduced by arrow function?

() => { this }

BTW, what about the rework?

diff -r a82f8c642c68 njs/njs_generator.c
--- a/njs/njs_generator.c   Tue Mar 26 00:59:26 2019 +0800
+++ b/njs/njs_generator.c   Sat Apr 06 13:45:30 2019 +0800
@@ -429,7 +429,7 @@ njs_generator(njs_vm_t *vm, njs_generato

     case NJS_TOKEN_NAME:
     case NJS_TOKEN_ARGUMENTS:
-    case NJS_TOKEN_NON_LOCAL_THIS:
+    case NJS_TOKEN_UPWARD_THIS:
         return njs_generate_name(vm, generator, node);

     case NJS_TOKEN_NJS:
diff -r a82f8c642c68 njs/njs_lexer.h
--- a/njs/njs_lexer.h   Tue Mar 26 00:59:26 2019 +0800
+++ b/njs/njs_lexer.h   Sat Apr 06 13:45:30 2019 +0800
@@ -160,7 +160,7 @@ typedef enum {
     NJS_TOKEN_THROW,

     NJS_TOKEN_THIS,
-    NJS_TOKEN_NON_LOCAL_THIS,
+    NJS_TOKEN_UPWARD_THIS,
     NJS_TOKEN_GLOBAL_THIS,
     NJS_TOKEN_ARGUMENTS,

diff -r a82f8c642c68 njs/njs_parser.c
--- a/njs/njs_parser.c  Tue Mar 26 00:59:26 2019 +0800
+++ b/njs/njs_parser.c  Sat Apr 06 13:45:30 2019 +0800
@@ -2243,7 +2243,7 @@ njs_parser_arrow_expression(njs_vm_t *vm
     } else {
         parent = parser->node;

-        token = njs_parser_expression(vm, parser, token);
+        token = njs_parser_assignment_expression(vm, parser, token);
         if (nxt_slow_path(token <= NJS_TOKEN_ILLEGAL)) {
             return token;
         }
diff -r a82f8c642c68 njs/njs_parser.h
--- a/njs/njs_parser.h  Tue Mar 26 00:59:26 2019 +0800
+++ b/njs/njs_parser.h  Sat Apr 06 13:45:30 2019 +0800
@@ -231,11 +231,28 @@ njs_parser_global_scope(njs_vm_t *vm)


 nxt_inline njs_parser_scope_t *
-njs_function_scope(njs_parser_scope_t *scope, nxt_bool_t any)
+njs_function_any_scope(njs_parser_scope_t *scope)
 {
     while (scope->type != NJS_SCOPE_GLOBAL) {
-        if (scope->type == NJS_SCOPE_FUNCTION
-            && (any || !scope->arrow_function))
+
+        if (scope->type == NJS_SCOPE_FUNCTION) {
+            return scope;
+        }
+
+        scope = scope->parent;
+    }
+
+    return NULL;
+}
+
+
+nxt_inline njs_parser_scope_t *
+njs_function_scope(njs_parser_scope_t *scope)
+{
+    while (scope->type != NJS_SCOPE_GLOBAL) {
+
+        if (!scope->arrow_function
+            && scope->type == NJS_SCOPE_FUNCTION)
         {
             return scope;
         }
diff -r a82f8c642c68 njs/njs_parser_terminal.c
--- a/njs/njs_parser_terminal.c Tue Mar 26 00:59:26 2019 +0800
+++ b/njs/njs_parser_terminal.c Sat Apr 06 13:45:30 2019 +0800
@@ -216,36 +216,35 @@ njs_parser_reference(njs_vm_t *vm, njs_p
     case NJS_TOKEN_THIS:
         nxt_thread_log_debug("JS: this");

-        scope = njs_function_scope(parser->scope, 0);
+        scope = njs_function_scope(parser->scope);

-        if (scope != NULL) {
-            if (scope == njs_function_scope(parser->scope, 1)) {
-                node->index = NJS_INDEX_THIS;
-
-            } else {
-                node->token = NJS_TOKEN_NON_LOCAL_THIS;
-
-                node->token_line = token_line;
-
-                ret = njs_variable_reference(vm, parser->scope, node, name,
-                                             hash, NJS_REFERENCE);
-                if (nxt_slow_path(ret != NXT_OK)) {
-                    return NULL;
-                }
-
-                var = njs_variable_add(vm, scope, name, hash, NJS_VARIABLE_VAR);
-                if (nxt_slow_path(var == NULL)) {
-                    return NULL;
-                }
-
-                var->this_object = 1;
-            }
+        if (scope == NULL) {
+            node->token = NJS_TOKEN_GLOBAL_THIS;
+            node->u.value = njs_value_undefined;

             break;
         }

-        node->token = NJS_TOKEN_GLOBAL_THIS;
-        node->u.value = njs_value_undefined;
+        if (scope == njs_function_any_scope(parser->scope)) {
+            node->index = NJS_INDEX_THIS;
+
+        } else {
+            node->token = NJS_TOKEN_UPWARD_THIS;
+            node->token_line = token_line;
+
+            ret = njs_variable_reference(vm, parser->scope, node, name,
+                                         hash, NJS_REFERENCE);
+            if (nxt_slow_path(ret != NXT_OK)) {
+                return NULL;
+            }
+
+            var = njs_variable_add(vm, scope, name, hash, NJS_VARIABLE_VAR);
+            if (nxt_slow_path(var == NULL)) {
+                return NULL;
+            }
+
+            var->this_object = 1;
+        }

         break;

@@ -351,7 +350,7 @@ njs_parser_reference(njs_vm_t *vm, njs_p
     case NJS_TOKEN_ARGUMENTS:
         nxt_thread_log_debug("JS: arguments");

-        scope = njs_function_scope(parser->scope, 0);
+        scope = njs_function_scope(parser->scope);

         if (scope == NULL) {
             njs_parser_syntax_error(vm, parser, "\"%V\" object "

@xeioex @hongzhidao

>> (function() { return (()=> this).bind(this) })()()
Segmentation fault

also (fixed by the patch above):

>> var f0 = () => this, f1 = function() { return f0; }; f1()();
ReferenceError: "f1" is not defined in shell:1

is parsed as

var f0 = () => (this, f1 = function() { return f0; }); f1()();

deleted

>> (function() { return (()=> this).bind(this) })()()
Segmentation fault

The same as simple case. (It's ok without bind). On it.

function foo() {
    var t;
    return (()=> {
        console.log(t);
    }).bind(1)()
}

foo();

Well, I think it's an issue existed before. (tried in njs-0.2.0)
@drsm good reporting, can you help create a separate issue?

function foo() {
    var t = 2;

    function baz() {
        t = 3;
    }

    baz.bind()()
}

foo();

@hongzhidao

I can't get it. Can you show corresponding examples?

> function Car(){ this.age = 0; (() => { this.age++;})();}; (new Car()).age
1

> function Car(){ this.age = 0; (function(){ this.age++;})();}; (new Car()).age
TypeError: cannot get property "age" of undefined

Is the main purpose to solve the below case introduced by arrow function?

yes. See this.

Until arrow functions, every new function defined its own this value based on how the function was called:

A new object in the case of a constructor.
undefined in strict mode function calls.
The base object if the function is called as an "object method".

An arrow function does not have its own this. The this value of the enclosing lexical scope is used; arrow functions follow the normal variable lookup rules. So while searching for this which is not present in current scope they end up finding this from its enclosing scope.

@hongzhidao

BTW, probably we need to treat arrow functions a bit differently in call() or apply().

Invoked through call or apply Since arrow functions do not have their own this, the methods call() or apply() can only pass in parameters. thisArg is ignored.

I'm not sure, we've found a bug, it's important.

function foo() {
    var a;
    function baz() {
        a = 3;
    }
    baz.bind()() // but OK without bind
}
foo();

@xeioex

BTW, probably we need to treat arrow functions a bit differently in call() or apply().

Invoked through call or apply Since arrow functions do not have their own this, the methods call() or apply() can only pass in parameters. thisArg is ignored.

here some tests to consider:

    { nxt_string("(() => this)()"),
      nxt_string("undefined") },

    { nxt_string("(() => this).call('abc')"),
      nxt_string("undefined") },

    { nxt_string("(() => this).apply('abc')"),
      nxt_string("undefined") },

    { nxt_string("(() => this).bind('abc')()"),
      nxt_string("undefined") },

    { nxt_string("(function() { return (() => this); })()()"),
      nxt_string("undefined") },

    { nxt_string("(function() { return (() => this); }).call('abc')()"),
      nxt_string("abc") },

    { nxt_string("(function() { return (() => this); }).bind('abc')()()"),
      nxt_string("abc") },

    { nxt_string("(function() { return (() => this); })"
                 ".call('abc').call('bca')"),
      nxt_string("abc") },

#if 0
    { nxt_string("(function() { return (() => this); })"
                 ".call('abc').bind('bca')()"),
      nxt_string("abc") },
#endif

    { nxt_string("(function() { return function() { return () => this; }; })"
                 ".call('bca').call('abc')()"),
      nxt_string("abc") },

@xeioex

BTW, probably we need to treat arrow functions a bit differently in call() or apply().

No. Since this is processed well in arrow function.

  1. arguments[0] is passed by call() and apply().
  2. this of arrow function does not refer to argmenets[0] at all.

Updated patches.
https://gist.github.com/hongzhidao/b45bca2b4bf5b082f8ce4871b5053352

What is left:
arrow.prototype (need help @xeioex )

@drsm welcome to test. It's close to ready now.

@hongzhidao

What is left:
arrow.prototype (need help @xeioex )

Done in:
https://gist.github.com/576a6368cfb78e98e393d8175e1005c6

please review.

I plan to commit the patches this week.

Looks good, great work.

Was this page helpful?
0 / 5 - 0 ratings