Rescript-compiler: Recursive module field resolving order

Created on 5 Nov 2019  路  18Comments  路  Source: rescript-lang/rescript-compiler

Not so minimal repro (but still small):
https://github.com/wokalski/bucklescript-recursive-modules-bug

Running node ./src/Test.bs.js throws

Undefined_recursive_module,-11,nars_struct.ml,88,10

I have roughly inspected the compiled output and I found multiple issues there. You'll quickly spot the problems, too.

Note; I think there's also a bug with the namespace. It's defined as Ocaml_protoc_plugin in the dependency's (ocaml-protoc-plugin-runtime-bs) bsconfig but it's accessible as Ocamlprotocplugin.

bug non-critical-bug

Most helpful comment

Because the let binding in your example is a structure item and in my example it's a subexpression in a structure item and they seem to be resolved lazily.

All 18 comments

I just tried downgrading to 6.1.x, unfortunately there's a different issue there, but the result is the same; another runtime error.

Which version did you find this issue

6.2.1, 6.1.0, 6.2.2-flambeau-array - I tested all three

I have roughly inspected the compiled output and I found multiple issues there. You'll quickly spot the problems, too.

It has some 3rd party deps, would you shed some light so it would save me some time? thanks

I can remove the dependency tomorrow. the file Nars_struct.ml is compiled incorrectly. The dependency is irrelevant - it鈥檚 not even called by the problematic module by time the exception is thrown.

Nars_struct is generated by ocaml-protoc-plugin. It generates a recursive module. The functions are generated correctly in Nars_struct.bs.js but Caml_module.update_mod doesn鈥檛 resolve/update the module correctly so the resulting modules fail with the exception.

I hoped it was a trivial bug, but it's not. I've found the issue.

The generated code looks like this (link to source):

        let to_proto = 
          let apply = fun ~f (a, b) -> f a b in
          let spec = Ocaml_protoc_plugin.Serialize.C.( basic (1, string, proto3) ^:: basic_opt (2, (message Value.to_proto)) ^:: nil ) in
          let serialize = Ocaml_protoc_plugin.Serialize.serialize (spec) in
          fun t -> apply ~f:(serialize ()) t

The generated code for spec is defined as so (link to source):

var spec = Curry._2(Spec$Ocamlprotocplugin.Serialize.C.$caret$colon$colon, Curry._1(Spec$Ocamlprotocplugin.Serialize.C.basic, /* tuple */[
          1,
          Spec$Ocamlprotocplugin.Serialize.C.string,
          Spec$Ocamlprotocplugin.Serialize.C.proto3
        ]), Curry._2(Spec$Ocamlprotocplugin.Serialize.C.$caret$colon$colon, Curry._1(Spec$Ocamlprotocplugin.Serialize.C.basic_opt, /* tuple */[
              2,
              Curry._1(Spec$Ocamlprotocplugin.Serialize.C.message, Value.to_proto)
            ]), Spec$Ocamlprotocplugin.Serialize.C.nil));

On this line:

              Curry._1(Spec$Ocamlprotocplugin.Serialize.C.message, Value.to_proto)

Value.to_proto references a function that hasn't been initialised yet.

Here's a minimal repro:

module rec PA : sig
  val print: int array -> unit
end = struct
  let print =
    let iter = Array.iter P.print in
    fun a -> iter a
end
and P : sig
  val print: int -> unit
end = struct
  let print i =
    print_endline (string_of_int i)
end

let () =
  PA.print [|1; 2|]
// Generated by BUCKLESCRIPT, PLEASE EDIT WITH CARE
'use strict';

var $$Array = require("bs-platform/lib/js/array.js");
var Curry = require("bs-platform/lib/js/curry.js");
var Caml_module = require("bs-platform/lib/js/caml_module.js");

var PA = Caml_module.init_mod([
      "Test.ml",
      3,
      6
    ], [[[
          0,
          "print"
        ]]]);

var P = Caml_module.init_mod([
      "Test.ml",
      10,
      6
    ], [[[
          0,
          "print"
        ]]]);

var partial_arg = P.print;

function print(a) {
  var param = a;
  return $$Array.iter(partial_arg, param);
}

Caml_module.update_mod([[[
          0,
          "print"
        ]]], PA, {
      print: print
    });

function print$1(i) {
  console.log(String(i));
  return /* () */0;
}

Caml_module.update_mod([[[
          0,
          "print"
        ]]], P, {
      print: print$1
    });

Curry._1(PA.print, /* array */[
      1,
      2
    ]);

exports.PA = PA;
exports.P = P;
/* PA Not a pure module */

The semantics here is arguable, actually this is kind of abusing recursive modules, we will see if we can have a clean solution. Below also raise:

module rec PA : sig
  val print: int array -> unit
end = struct
  let () = P.print 3 
  let print =
    let iter = Array.iter P.print in
    fun a -> iter a
end
and P : sig
  val print: int -> unit
end = struct
  let print i =
    print_endline (string_of_int i)
end

let () =
  PA.print [|1; 2|]

The point of recursive module is that don't do eager evaluation, otherwise there may be or may be not runtime exceptions.

can you confirm it is fixed in #3935 ? Note in master branch we compiled records into objects which may disrupt your code when testing, but such test is greatly appreciated

I suggest the codegen be more conservative when using recursive module, here is another example that it raises in native:

module rec PA : sig
  val u : int -> unit
  val print: int array -> unit
end = struct
  let u = P.print 
  let print =
    let iter = Array.iter P.print in
    fun a -> iter a
end
and P : sig
  val print: int -> unit
end = struct
  let print i =
    print_endline (string_of_int i)
end

let () =
  PA.print [|1; 2|]

let () = PA.u 3

Thanks for fixing this @bobzhang. I will contact the library authors about this 馃憤

This unfortunately doesn't work with bucklescript but does with native:

module rec PA : sig
  val print: int array -> unit
end = struct
  let print =
    let p = P.print in
    let iter = Array.iter p in
    fun a -> iter a
end
and P : sig
  val print: int -> unit
end = struct
  let print i =
    print_endline (string_of_int i)
end

let () =
  PA.print [|1; 2|]

I know you don't like this coding style, recursive modules are generally confusing but it'd be great to have the compiler work the same for bs and native.

And this is more representative of the real-world issue that I'm encountering:

let iter ~f =
  print_endline "foo";
  fun i -> Array.iter f i

module rec PA : sig
  val print: int array -> unit
end = struct
  let print =
    let iter = iter ~f:P.print in
    fun a -> iter a
end
and P : sig
  val print: int -> unit
end = struct
  let print i =
    print_endline (string_of_int i)
end

let () =
  PA.print [|1; 2|]

@wokalski We will try our best to preserve the same semantics, but it is hard to explain why my example raise while your example does not raise? So what's the rule here

Because the let binding in your example is a structure item and in my example it's a subexpression in a structure item and they seem to be resolved lazily.

And thank you @bobzhang. It goes without saying that you're doing an amazing job. Even though the fix yesterday didn't fix the bug I had in the end, I am astounded by the pace of changes. I'd say keep it up, but maybe you're moving too fast 馃槃.

Oh, and I tried objects as records. I really like the change.

hi @wokalski I am glad you find the work around.
For both native/js backend, the recurisve module is slow so that it does not justify the risk of evaluation order.
I created a separate issue that would improve the recursive module performance.
https://github.com/BuckleScript/bucklescript/issues/4434
we will close this issue since the benefit does not justify the complexity

Was this page helpful?
0 / 5 - 0 ratings