Njs: get rid of global functions as lexer tokens

Created on 12 Apr 2019  路  18Comments  路  Source: nginx/njs

> function f() {}
undefined
> function toString() {}
SyntaxError: Unexpected token "toString" in shell:1
bug parser

All 18 comments

@xeioex

Are you on this? If not, I plan to do this.

@hongzhidao

I haven't started it yet.

@xeioex @drsm take a look.

The diff is used for discussion, I'll move it to gist.

  1. I tried in nodejs, and arguments is allowed to be redeclaration. Right?
  2. Is njs also allowed to be redeclaration?
  3. Introduced sub_token in njs_token_t, and property_token will be removed.
           token       sub_token
123        NUMBER
"abc"      STRING   
age        NAME        NAME
{age}      NAME        NAME
Number     NAME        NUMBER_CONSTRUCTOR
{Number}   NAME        NUMBER_CONSTRUCTOR
deleted.

@hongzhidao

Hi!

  • I tried in nodejs, and arguments is allowed to be redeclaration. Right?

No.

$ node --use-strict
> arguments
ReferenceError: arguments is not defined
> var arguments
var arguments
    ^^^^^^^^^

SyntaxError: Unexpected eval or arguments in strict mode

> function arguments() {}
function arguments() {}
         ^^^^^^^^^

SyntaxError: Unexpected eval or arguments in strict mode

@xeioex @drsm

Here's the patch.
https://gist.github.com/hongzhidao/e8c39e40cd7890632579d741d5e1ce99

Welcome to test.

@hongzhidao

Thank you for the patch. While it solves the original problem, I think we can do more.

In fact, according to the spec, global functions like setTimeout(), Object() ... should belong to global object. So we can add, delete and modify them later.

I think we can get rid of sub_token too.
1) before or after parser we should add global built-in functions into global var scope so we can find them while resolving variables. It will also simplify parser. Everything what is not keyword should be treated as NAME. // we can do this first
2) the same functions should also be available as global. // later

@xeioex Seems good, on it.

Check it again, what you mean is?

  1. preload global object like setTimeout(), Object() into scope->variables which type is NJS_SCOPE_GLOBAL, and these variables can be modified.
  2. get rid of sub_token, but what about property_token?
  3. later may we introduce global variable global, it is an object.
  4. Is njs a global object?
  1. preload global object like setTimeout(), Object() into scope->variables which type is NJS_SCOPE_GLOBAL

yes

and these variables can be modified.

I guess, this will be harder to do now, we can do it later.

get rid of sub_token, but what about property_token?

property_token is a way to parser object literals, Ideally we should eliminate it also to simplify the parser. It would be great if you find a way to to this, but I think it should be treated as a separate issue.

  1. later may we introduce global variable global, it is an object.

global this is a global object. Now it mostly is empty. Note, this is not the same as this inside functions.

>> this
{

}
>> this.a
undefined
>> this.a = 1
1
>> this.a
1
>> this.NaN 
NaN

Is njs a global object?

currently it is not, but maybe we should merge into global this . @drsm ??

@hongzhidao

  1. preload global object like setTimeout(), Object() into scope->variables which type is NJS_SCOPE_GLOBAL, and these variables can be modified.

Yes.

$ node 
> Object = false
internal/util.js:32
  return Object.prototype.toString.call(o);
                          ^
TypeError: Cannot read property 'toString' of undefined
    at objectToString (internal/util.js:32:27)
...
  1. later may we introduce global variable global, it is an object.

It's an exotic (the most) object that acts as two way view between var declared part of the global scope and the object instance, so the proper implementation will be non-trivial:

$ node --use-strict
> void Object.defineProperty(this, 'ttt', { value: 'asdf' } );
undefined
> ttt
'asdf'
> ttt = 'fdsa'
TypeError: Cannot assign to read only property 'ttt' of object '#<Object>'
> var ttt
undefined
> ttt
'asdf'
> ttt = 'fdsa'
TypeError: Cannot assign to read only property 'ttt' of object '#<Object>'
> let ttt
SyntaxError: Identifier 'ttt' has already been declared
> void Object.defineProperty(this, 'rrr', { value: 'asdf' } );
undefined
> rrr
'asdf'
> let rrr = 'fdsa'
SyntaxError: Identifier 'rrr' has already been declared
> this.ppp = 'asdf'
'asdf'
> ppp
'asdf'
> let ppp = 'fdsa'
undefined
> ppp
'fdsa'
>

@xeioex

Take a look again.
https://gist.github.com/hongzhidao/e8c39e40cd7890632579d741d5e1ce99

  1. Note that eval is special, it can't be redeclared. So I regard it as a special token.
  2. I put hmac and hash prototypes to the first position.
    Is it OK? Since I don't know why they are put in the middle before.

  3. > property_token is a way to parser object literals, Ideally we should eliminate it also to simplify the parser. It would be great if you find a way to to this, but I think it should be treated as a separate issue.
    > global object this.

Do them later, it's better to check the design first.

@hongzhidao

Thanks. Looks much better now. Will look into it later.

I put hmac and hash prototypes to the first position.
Is it OK? Since I don't know why they are put in the middle before.

I would put in after all the standard objects (errors). IMHO, In the beginning it looks strange.

@xeioex

I would put in after all the standard objects (errors).

Think about this.

enum njs_constructor_e {
    ...

    NJS_CONSTRUCTOR_TYPE_ERROR =     NJS_PROTOTYPE_TYPE_ERROR,
    NJS_CONSTRUCTOR_URI_ERROR =      NJS_PROTOTYPE_URI_ERROR,
    /* MemoryError has no its own prototype. */
    NJS_CONSTRUCTOR_MEMORY_ERROR,
    /* here ? */
#define NJS_CONSTRUCTOR_MAX    (NJS_CONSTRUCTOR_MEMORY_ERROR + 1)
};

@hongzhidao

/* here ? */

yes.

@hongzhidao @xeioex

Take a look again.
https://gist.github.com/hongzhidao/e8c39e40cd7890632579d741d5e1ce99

+    { nxt_string("var undefined;"),
+      nxt_string("SyntaxError: Unexpected token \"undefined\" in 1") },

Actually, undefined is a variable defined as this['undefined']:

> Object.getOwnPropertyDescriptor(this, 'undefined')
{ value: undefined,
  writable: false,
  enumerable: false,
  configurable: false }
> undefined = false
TypeError: Cannot assign to read only property 'undefined' of object '#<Object>'
> var undefined
undefined
> (() => { var undefined = false; console.log(undefined); })()
false
undefined

But we need a working global this there.

@xeioex

  1. style.
  2. put hmac and hash after all the standard objects (errors).
    https://gist.github.com/hongzhidao/e8c39e40cd7890632579d741d5e1ce99

@xeioex @drsm

About global object redeclaration.

  1. I think it should not affect the global constructors that have been built in.
    vm->scopes[NJS_SCOPE_GLOBAL] -> NJS_INDEX_GLOBAL_OFFSET
  2. But in module mode, functions cant be redeclared. And we have added global objects into the global scope. So what will happen as below?
// in module mode
function Function() {}  // Is it allowed?

About global object this, I haven't deep into it yet.
In Lua, actually, there are no global variables, all the global variables are treated as _G['var_name'].
_G is a table. Do you have clear thought about this in NJS? @xeioex

@hongzhidao

https://gist.github.com/hongzhidao/e8c39e40cd7890632579d741d5e1ce99

While it is much better the what we have before, I am still not happy with with what we get as the result.

IMHO, the current code is a bit ugly and inflexible with all those NJS_CONSTRUCTOR_*, NJS_PROTOTYPE_* typedefs.

I suggest the following scheme:
1) All global functions should be copied to shared_hash of global_this object in njs_builtin_objects_create() in the same way it is done for prototypes.
2) in njs_generate_name() if node->scope->type == NJS_SCOPE_GLOBAL we should generate

node->u.operation = njs_vmcode_property_get;                                                                             
node->left = // < global this >
node->right =  // <var name>
node->index = njs_variable_index();// ?? not sure 

so in effect, global vars will always be queried from global this

var a =1, b =2
function f(c) {
 return a + b + c; // => < global this>.a + <global this>.b + c
}

@drsm what do you think about ^^^ ?

@hongzhidao do you have other ideas?

P.S.
Later, using this way,

> void Object.defineProperty(this, 'ttt', { value: 'asdf' } );
undefined
> ttt
'asdf'

will be easy to do. Moreover, after https://hg.nginx.org/njs/rev/a84f514e864d, deleting should work out of the box.

P.P.S.

Related, overlapping problem #150. ReferenceError should be thrown in runtime, currently it is thrown in compile time.

If we cannot resolve var name in compile time

  1. generate property get from global this
  2. if property get fails, throw ReferenceError().

@xeioex

so in effect, global vars will always be queried from global this

var a =1, b =2
function f(c) {
 return a + b + c; // => < global this>.a + <global this>.b + c
}

@drsm what do you think about ^^^ ?

looks good!

but regarding #105 there will be more changes, as lexically declared stuff doesn't alter the global this:

root@node:~# node 
> const a = 1; let b = 2;
undefined
> console.log(this.a, this.b)
undefined undefined
undefined

including functions in module mode:

root@node:~# eshost -x 'var gthis = new Function("return this")(); function a() {} print(typeof gthis.a)'
#### ch
function

#### jsc
function

#### sm
function

#### v8
function

#### xs
function
root@node:~# eshost -m -x 'var gthis = new Function("return this")(); function a() {} print(typeof gthis.a)'
#### ch


#### jsc
undefined

#### sm
undefined

#### v8
undefined

#### xs
undefined

and names generated by imports

Was this page helpful?
0 / 5 - 0 ratings