Bisect_ppx, the coverage tool, now works with BuckleScript (instructions, starter repo). It needs a bit of help from BuckleScript, however, for the integration to be truly painless.
I opened a new issue to keep the text together, and also because https://github.com/BuckleScript/bucklescript/issues/3482#issuecomment-516399766 gave me the impression that #3715 is being suggested as an alternative, but Bisect_ppx actually needs both supported, so this is a different situation.
(#3482) There should be per-directory ppx-flags in bsconfig.json. Right now, ppx-flags gets applied to the whole project, which causes the test cases to be instrumented, which is very noisy in the coverage report. The current workaround for this is to manually exclude the tests, but this is awkward.
For comparison, this is exactly how Bisect has been working with Dune for years. src/ directories get preprocessed, but test/ directories do not.
(#3715) It should be possible to mark some ppx-flags as dev, so that the PPX won't even be run in release builds. If this is not done, Bisect_ppx becomes a hard dependency of the project, which has to be removed manually each time a release is tagged.
For comparison, this is currently a problem with Dune as well (ocaml/dune#57), but we plan to solve it shortly along the same lines (https://github.com/ocaml/dune/issues/57#issuecomment-518561580). So far, because of the lack of dev-only PPX support, this is the kind of cleanup that has to be done on each release of a project that uses Bisect_ppx: https://github.com/aantron/markup.ml/commit/ea68bebf5c3a19f56350393e359d444f864154e3#diff-d218652a79a651b9be8eee7641ea0893L4
Hi @anntron, it is nice to see bisect works with bucklescript.
The ppx per directory is easy to implement, but it is hard to generate proper merlin file, how does dune solve this issue?
IIRC, there are three things from bisect_ppx, the ppx tool, ppx runtime, and ppx coverage report analzyer, do you think it is possible to subsume into bs-platform directly (mostly ppx runtime, so it can build directly?).
@bobzhang, can you give some detail on what is the difficulty with the .merlin files? I didn't see anything in the linked issues, and I can't think of what the problem would be.
It's possible to subsume Bisect into bs-platform directly, but how would you like to do it? You are correct about the pieces it has.
Ideally, if it's possible to keep Bisect separate, that would be best for administration, maintenance, and release cycle reasons. There is one advantage to integrating Bisect into bs-platform, which is we can make sure that Bisect's PPX tool always matches the version of OCaml in bs-platform more easily that way. But I also opened https://github.com/ocaml-ppx/ocaml-migrate-parsetree/issues/78 about conclusively solving that on the Bisect end in a way that doesn't depend on bs-platform.
On Aug 19, 2019, at 9:07 PM, Anton Bachin notifications@github.com wrote:
@bobzhang https://github.com/bobzhang, can you give some detail on what is the difficulty with the .merlin files? I didn't see anything in the linked issues, and I can't think of what the problem would be.
How do you config .merlin to have various ppx flags for each directory?
It's possible to subsume Bisect into bs-platform directly, but how would you like to do it? You are correct about the pieces it has.Depends on the size of the project, if it is small, maybe worth venturing, otherwise, as you said, keep it separate works better
Ideally, if it's possible to keep Bisect separate, that would be best for administration, maintenance, and release cycle reasons. There is one advantage to integrating Bisect into bs-platform, which is we can make sure that Bisect's PPX tool always matches the version of OCaml in bs-platform more easily that way. But I also opened ocaml-ppx/ocaml-migrate-parsetree#78 https://github.com/ocaml-ppx/ocaml-migrate-parsetree/issues/78 about conclusively solving that on the Bisect end in a way that doesn't depend on bs-platform.—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/BuckleScript/bucklescript/issues/3761?email_source=notifications&email_token=AAFWMK64EHP4LDWENM3UQLTQFKLKRA5CNFSM4IMO7VL2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4S3TAA#issuecomment-522566016, or mute the thread https://github.com/notifications/unsubscribe-auth/AAFWMK6W6SRKIUXGEKHDP4LQFKLKRANCNFSM4IMO7VLQ.
How do you config .merlin to have various ppx flags for each directory?
You can place a .merlin file in each directory. At least, that's what Dune does in each of my projects.
Ah, I did not know that. Thanks for the info
Bisect_ppx is now out for BuckleScript and available as an npm package, so I'd like to ping this issue again. It would be very nice to resolve this for a fully ergonomic integration; then I will add instructions to the Bisect README about where the Bisect PPX flags should go in bsconfig.
an option to have a dev-mode bsconfig.json, or something similar as discussed in #3716, would also help with this
In case we go with the idea of specifying the bsconfig.json, we still need at least #3482 (per-directory PPX flags) to avoid the annoying workaround of having to specify [coverage exclude_file] in the test suite and any other files that are not part of the code under test. There are two reasons this is annoying:
It's much cleaner to be able to opt into instrumentation of the files that need it by directory, rather than try to individually opt out in every file that shouldn't get it.
@aantron I think the best way to do this is having an environment variable so when it is set, bisect will be instrumented, bisect can pick up some convention to not instrument code in test files, what do you think?
We already have an environment variable like this, and have to use it. See here, step 4.
This doesn't work, because the project gains a hard, non-dev dependency on Bisect, which either stays, or has to be removed manually or by fragile script during release.
There is no reason why downstream users of releases should transitively depend on Bisect. In particular, having downstream users depend on Bisect means installing binaries of a PPX, which are large, could be missing, could trigger a build from source code, etc. It's much better if all of that risk can be avoided.
We've had an analogous discussion in Dune in ocaml/dune#57 and elsewhere. In fact, we introduced BISECT_ENABLE as the result of discussions after the release of Dune, since Ocamlbuild hadn't had issues with Bisect. After trying BISECT_ENABLE for a short while, the consensus quickly developed between me, users, and Dune developers, that we need better support for it than that, in particular we need Dune to have a notion of conditional or development PPX.
The convention about test files also is questionable, see ocaml/dune#57 again for some of the discussions on that.
To put it another way, when the environment variable that currently enables Bisect is not set, that still doesn't remove the permanent dependency on Bisect from bsconfig.json. The project's release just ends up with an unnecessary dependency on a PPX, that is a no-op.
There is also an issue with dependency instrumentation (which we don't want). If BISECT_ENABLE is set, it should not trigger instrumentation of dependencies that also were using Bisect in their development.
And note that bs-dev-dependencies is not sufficient for this, because the PPX flags cannot be made dev-only at the moment.
Clone the starter repo to quickly get started with a Bisect setup, and see what usage is like, and experiment.
based on the conversation off-line, the minimal changes to make it work:
--exclude-files which excludes some test filesThis will work for Bisect.
I assume this is included, but ppx-dev-flags should not be triggered transitively when the project is installed as a dependency of another project, which might also have ppx-dev-flags and pass the enabling flag to bsb in its development.
ppx-dev-flags should not be triggered transitively
Yes, it is only for toplevel projects, since ppx-dev-flags are dev dependencies, when it is used as a library, it may not be installed by the user
There is one concern with implementing --exclude-files. IIRC the locations generated by BuckleScript in the AST are absolute paths, which complicates the interpretation of the paths passed with --exclude-files, as the user will want to specify project-relative paths.
Is there some reliable way of computing form these paths that are relative to the project root? For example, perhaps bsb puts the project root path into some environment variable before calling the PPX, or bsb guarantees that current directory at the time the PPX is called is the project root?
uncurry-703$npx bsb -- -t commands src/demo.cmj
/Users/hongbozhang/git/uncurry-703/node_modules/bs-platform/darwin/bsc.exe -warn-error +101 -color always -o src/demo.reast -bs-syntax-only -bs-binary-ast /Users/hongbozhang/git/uncurry-703/src/demo.re
/Users/hongbozhang/git/uncurry-703/node_modules/bs-platform/darwin/bsb_helper.exe -hash 138916fd68ea48b09eb00565dc2b654d -g 0 src/demo.reast
/Users/hongbozhang/git/uncurry-703/node_modules/bs-platform/darwin/bsc.exe -bs-package-name uncurry-703 -bs-package-output commonjs:src -color always -bs-suffix -I src -warn-error +101 -o src/demo.cmj src/demo.reast
we are already doing this? I suggest you accept a regex instead of string
Ah, you mean generation .reast -- will have a look if we can make it relative
Yes, because the PPX has to get the original source filename from the locations found inside the AST, because the command line arguments passed to the PPX are some temporary files.
Sorry that I am a bite late on this.
Thinking about the implementation today.
I am a bit hesitant that adding three things for a niche use case.
Note you can work around this by doing some scripts around bsconfig.json.
For this particular tool, the js ecosystem already has similar tools https://github.com/BuckleScript/bucklescript/blob/master/package.json#L18 it's more precise since your ppx instrumentation would change the compiler behavior significantly, e.g, inlining behavior will be different when code instrumented
Inlining does not affect which source code is "executed," so there is no difference in precision of source code coverage info that would come from instrumentation. As for other differences, such as performance, the user shouldn't and wouldn't be profiling or releasing code that is instrumented for coverage.
See this PR for an example of an important Reason project that switched from JS tools to Bisect and benefited from it: https://github.com/reazen/relude/pull/261#issuecomment-617032143. As you can see from the comments, using Bisect for coverage makes the report cleaner and more relevant. Adding my opinion, considerably so.
JS tools are less precise than ML source-aware tools, since they do not have information about what is interesting in the source language. In addition, Bisect is sensitive to factors like whether function calls are in tail position to insert instrumentation correctly, among many others. Adding yet another issue, ML is an expression-based language and only Bisect is sensitive to this, while many JS tools are statement- or line-based, another cause of their loss of precision.
I don't mind any way of solving this issue at this point, as long as it is basically usable by users, like discussed above.
The tool will be a little bit less niche if you will add an ergonomic way to use it; there are already users waiting for this. Right now people rightly hesitate to use Bisect because it requires an extra step during releasing to remove the false non-dev dependency created by BuckleScript on a dev tool, or else to cause all end users to also install the dev tool. Please make this slightly easier for maintainers in any way, the least intrusive way for BuckleScript.
Here is another example of a well-known project suffering from this: https://github.com/Risto-Stevcev/bastet/issues/26
In summary, Bisect is the desirable way to do coverage in an ML project, which the most thorough developers of high quality projects want to do. It already has many integration features for BuckleScript, Jest, etc., to make usage as painless as possible, to the extent possible from Bisect's end. Bisect is currently mainly suffering from a usability bottleneck in BuckleScript itself, namely the one described in this issue about a hard release dependency on a dev tool.
cc @jihchi @mlms13 @andywhite37 @Risto-Stevcev @amsross @wyze
I created a small utility that modifies the bsconfig.json, that I only run in CI environments, to add required information for bisect_ppx to work. wyze/conditional_bisect
Solving https://github.com/BuckleScript/bucklescript/issues/3716 with e.g. bsconfig.dev.json would go most of the way to solving this one.
Point (1) also looks like it could be solved by @anmonteiro 's 'monorepo' appraoach (see https://github.com/anmonteiro/bucklescript-monorepo ); it would set up the test directory as a separate BuckleScript project with its own bsconfig.json with separate settings.
Most helpful comment
Inlining does not affect which source code is "executed," so there is no difference in precision of source code coverage info that would come from instrumentation. As for other differences, such as performance, the user shouldn't and wouldn't be profiling or releasing code that is instrumented for coverage.
See this PR for an example of an important Reason project that switched from JS tools to Bisect and benefited from it: https://github.com/reazen/relude/pull/261#issuecomment-617032143. As you can see from the comments, using Bisect for coverage makes the report cleaner and more relevant. Adding my opinion, considerably so.
JS tools are less precise than ML source-aware tools, since they do not have information about what is interesting in the source language. In addition, Bisect is sensitive to factors like whether function calls are in tail position to insert instrumentation correctly, among many others. Adding yet another issue, ML is an expression-based language and only Bisect is sensitive to this, while many JS tools are statement- or line-based, another cause of their loss of precision.
I don't mind any way of solving this issue at this point, as long as it is basically usable by users, like discussed above.
The tool will be a little bit less niche if you will add an ergonomic way to use it; there are already users waiting for this. Right now people rightly hesitate to use Bisect because it requires an extra step during releasing to remove the false non-dev dependency created by BuckleScript on a dev tool, or else to cause all end users to also install the dev tool. Please make this slightly easier for maintainers in any way, the least intrusive way for BuckleScript.
Here is another example of a well-known project suffering from this: https://github.com/Risto-Stevcev/bastet/issues/26
In summary, Bisect is the desirable way to do coverage in an ML project, which the most thorough developers of high quality projects want to do. It already has many integration features for BuckleScript, Jest, etc., to make usage as painless as possible, to the extent possible from Bisect's end. Bisect is currently mainly suffering from a usability bottleneck in BuckleScript itself, namely the one described in this issue about a hard release dependency on a dev tool.
cc @jihchi @mlms13 @andywhite37 @Risto-Stevcev @amsross @wyze