Chapel: [Mason] Illegal characters used as identifier result in unbuildable default project.

Created on 6 Dec 2017  ·  9 Comments  ·  Source: chapel-lang/chapel

mason new chpl-project creates a new module named "chpl-project". The dash is an illegal character in a Chapel identifier, so the project will not build. Other illegal characters include the period (.).

Potential solutions

  • Convert all illegal characters to legal characters, such as an underscore. May affect name-based path traversals in Mason.
  • Truncate the name (Cargo takes this approach when it finds a hyphen), but this gets into corner-case situations. For example: mason new chpl-myproject --> module myproject, but what should my-project-name do?
  • Forbid creating the project with an illegal character used as an identifier. NOTE: There will be a difference between modules (module <name>) and applications (proc main()), which have no direct use for the project name.
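
To make the trade-offs between the three options concrete, here is a minimal Python sketch. It is not Mason's implementation. The identifier rule (a letter or underscore followed by letters, digits, underscores, or dollar signs) is an assumption about Chapel's lexical rules, reserved keywords are not checked, and the function names are hypothetical.

```python
import re

# Assumed rule for a Chapel identifier: a letter or underscore, followed by
# letters, digits, underscores, or dollar signs. Keywords are not checked.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def is_legal_identifier(name: str) -> bool:
    """Return True if the whole name matches the assumed identifier rule."""
    return IDENTIFIER.fullmatch(name) is not None


def convert(name: str) -> str:
    """Option 1: replace every illegal character with an underscore."""
    converted = re.sub(r"[^A-Za-z0-9_$]", "_", name)
    # A leading digit or '$' is still illegal, so prefix an underscore.
    if converted and not re.match(r"[A-Za-z_]", converted):
        converted = "_" + converted
    return converted


def truncate(name: str) -> str:
    """Option 2: keep only the text after the last hyphen (one arbitrary choice)."""
    return name.rsplit("-", 1)[-1]


def forbid(name: str) -> str:
    """Option 3: reject the name outright instead of rewriting it."""
    if not is_legal_identifier(name):
        raise ValueError(f"illegal module name: {name!r}")
    return name
```

With these definitions, convert("chpl-project") gives "chpl_project" and truncate("chpl-myproject") gives "myproject", but truncate("my-project-name") gives "name", which is the corner case raised above: any truncation rule has to pick a segment, and no choice is obviously right.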
Labels: Tools, Design, Unimplemented Feature, user issue

All 9 comments

My preference is to avoid modifying the module name implicitly. Therefore, I find this potential solution most appealing:

  • Forbid creating the project with an illegal character used as an identifier. NOTE: There will be a difference between modules (module <name>) and applications (proc main()), which have no direct use for the project name.

However, I think we could still support package names with illegal characters by only forbidding projects with an illegal character if a separate module name is not explicitly provided, e.g.

> mason new chpl-project
Error: Illegal character in module name. Try: --moduleName=<legal module name>

> mason new chpl-project --moduleName=project
Creating package: chpl-project
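
The proposed behavior could be sketched as follows. This is a Python illustration only: --moduleName is the flag proposed in this comment, not an existing Mason option, resolve_module_name is a hypothetical helper, and the identifier pattern is an assumed approximation of Chapel's rule.

```python
import re
from typing import Optional

# Assumed approximation of a legal Chapel identifier (keywords not checked).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def resolve_module_name(package: str, module_name: Optional[str] = None) -> str:
    """Pick the top-level module name for a new package.

    The package name is used unless an explicit module name is given
    (the proposed --moduleName value). Whichever is chosen must be legal.
    """
    candidate = package if module_name is None else module_name
    if IDENTIFIER.fullmatch(candidate) is None:
        raise ValueError(
            "Illegal character in module name. "
            "Try: --moduleName=<legal module name>"
        )
    return candidate
```

Here resolve_module_name("chpl-project") raises the error, while resolve_module_name("chpl-project", "project") returns "project" and leaves the package name untouched.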

I can see the appeal for package names differing from module names for applications / command-line tools, but I am not sure the same is true for library packages (which is all that mason supports today). Is there a motivating use-case for the latter?

Also, I wonder if it would be worth special-casing something like chpl-* to * to handle the potential package naming convention. It may be too soon to tell.
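
As a rough Python illustration of that special case (default_module_name is a hypothetical helper, and the chpl- prefix is only the naming convention speculated about here):

```python
def default_module_name(package: str, prefix: str = "chpl-") -> str:
    """Drop a leading 'chpl-' from the package name, if there is one.

    Names without the prefix, or consisting of the prefix alone,
    are returned unchanged.
    """
    if package.startswith(prefix) and len(package) > len(prefix):
        return package[len(prefix):]
    return package
```

Note that default_module_name("chpl-my-project") still yields "my-project", so the result would need the same legality check as any other name.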

We can take it in steps since these all build on each other from most restrictive to least restrictive:

  1. Forbid illegal characters in module name. I'm okay with this.
  2. Add an option like --moduleName. I'm neutral on a new option since it doesn't have a straightforward value proposition (e.g., applications can already write all of their code in separate modules that are not tied to the Mason project name). Is the project name in Mason.toml used for anything (and does it have any illegal-character issues, esp. w.r.t. a Mason registry)?
  3. Transform the project name to make it a legal module name. I'm against this since you'll have to document a transformation and ensure it works (it's also not straightforward behavior). But if this is important for people, it can be added because it relaxes the steps above.

  1. Add an option like --moduleName. I'm neutral on a new option since it doesn't have a straightforward value proposition (e.g., applications can already write all of their code in separate modules that are not tied to the Mason project name).

The value would be a cleaner way to guide users from doing mason new chpl-project to creating a legal top-level module name that successfully builds. What alternative do you have in mind? I could imagine printing a warning that the user will need to rename their top-level module before building, because chpl-project contains an illegal character for module names.

  1. ....
    Is the project name in Mason.toml used for anything (and does it have any illegal-character issues, esp. w.r.t. a Mason registry)?

At the time of writing, the name field must match the project root directory name and the Chapel source file that will be compiled upon calling mason build, e.g.

> mason new chpl-project

chpl-project
├── Mason.toml
└── src
    └── chpl-project.chpl

Mason.toml:

    [brick]
    name = "chpl-project"
    ...

This is used to determine module paths when building, but could be separated in the future if necessary. Update: #8831 changed this so that it no longer relies on the directory name. In any case, the package name itself is not affected by the illegal-character issues.

  1. ...
    But if this is important for people, it can be added because it relaxes the steps above.

I think this is something we could react to if users request it frequently. An advantage of being more restrictive now is that it is easier to become less restrictive in the future than to go in the other direction.

PR #8941 added some checking on the package name. But this issue discusses enough other ideas (that are perhaps follow-on steps) that I didn't feel comfortable closing it. @ben-albrecht - could you close this issue if you think it's appropriate to do so?

I'm going to opt for leaving this open, until we pursue the other ideas listed here or decide they aren't worth pursuing.

I wonder whether someone would summarize the other ideas that we're keeping this open for. Each time I look at this issue, I wonder what they are, but then don't find the time to figure it out for myself.

I'm going to close this issue and document the desire for a --moduleName flag in https://github.com/chapel-lang/chapel/issues/7106 for now.

The value would be a cleaner way to guide users from doing mason new chpl-project to creating a legal top-level module name that successfully builds. What alternative do you have in mind?

The alternative I had in mind was to have detection logic for illegal characters and just forbid the project name outright. This would be a way to force a coding or packaging convention such as mason init NewProject instead of new-project. But that also depends on what https://github.com/chapel-lang/chapel/issues/7417 says is the de facto style.

The alternative I had in mind was to have detection logic for illegal characters and just forbid the project name outright.

This is the behavior as of master today:

$ mason new foo-bar
Bad package name 'foo-bar' - only Chapel identifiers are legal package names

https://github.com/chapel-lang/chapel/issues/14955 lets a user create a package in a different directory than the package name, but the package name must still be a legal Chapel identifier.
