This is related to #13523 and #13524.
If we have a module and a submodule, like this:
module M {
  module L {
    proc lFunction() { }
  }
  L.lFunction(); // here
}
Should a use/import be required for L.lFunction() to work?
- "yes": L.lFunction() is not allowed without a use L / import L
- "no": L.lFunction() works without a use L / import L
Arguments in favor of "yes":
- it would make use and import actually the only things that interact directly with module names - although some of these would then re-export the module name. This might make it easier/more robust to do things like import M as OtherModule.
- it makes modules more consistent, because they always require import / use, submodule or no.
- [...] import / use when the submodule is defined in another file (a-la #13524) and this would give a clear place to indicate public import vs private import.
Arguments in favor of "no":
- The module L declaration creates a symbol that is visible and in scope within M. Changing this would make the module declarations less consistent with variable, type, and proc declarations.
- It would feel strange for the M using things in L to require a use but not for L using things in M to do so, e.g.:
module M {
  module L {
    proc lFunction() { }
    ... M.mFunction() ... mFunction() ... // can see mFunction because of nesting
  }
  import L;
  L.lFunction(); // not legal without the import above?
}
Regarding this reason for "yes":
> - it makes modules more consistent, because they always require import/use, submodule or no.
Early in my history of learning Chapel, I was personally pretty confused by the fact that putting many modules in a single file actually makes them submodules within an implicit module, i.e.
// in M.chpl
module L {
  proc lFunction() { }
}
module N {
  proc main() {
    L.lFunction(); // why does this work?
  }
}
I remember being very surprised by L.lFunction() working, since if L and N were in separate files it would not work. It works because M.chpl makes an implicit module M, which contains L and N. Therefore N can see L through its parent module. I think new Chapel programmers regularly don't understand the implicit file-level module.
Choosing "yes" here would reduce the potential for this particular case of confusion. (another strategy to handle that case of confusion would be to create more errors for implicit modules containing certain patterns - as with #6813 - e.g. implicit modules that contain only multiple module statements).
I'm following this conversation, but I'm not sure I have a clear opinion just yet
My immediate response was to draw parallels from enums and what use means for them. I understand that it is somewhat different, but I think they should be more or less consistent.
enum TestEnum { Foo, Bar }
writeln(TestEnum.Foo); // you can do this without `use`
use TestEnum; //You need the `use` to use the enum without its name
writeln(Foo);
So, maybe a module could use its submodule's functions as long as the submodule name is there, but with use/import it could refer to them directly, without the qualifier?
@e-kayrakli - I don't think it makes any sense to import an enum; the only useful thing to do is use it to bring the symbols within the enum into scope. I've started thinking that perhaps we want a difference between import and use that would arguably address your consistency desire here with the enum case.
import L; would make the module L visible itself, so you could e.g. L.someFunction. In contrast, I'm thinking use L; would bring the contents of L into scope without affecting whether or not L itself was visible.
In that event, if you want to write a call like L.someFunction(), you would import L in some way so that L was available as a symbol. But import will not exist for enums, since there is nothing to do, and so there is no need to import TestEnum in order to use TestEnum.Foo. However use TestEnum and use L would continue to be corresponding features.
Some thoughts. My main argument for "no" is that it would be annoying for library authors.
public module MyLibrary {
  public module Component {}
  private module Details {}
  proc callApi() {
  }
  proc callApi2() {
  }
}
If I was disciplined, I would put my import declarations in each callApi function depending on what was needed from the submodules. But if I have 100 of these calls, it's more effective to put them at module scope. Then, my submodule definitions effectively become:
module X {
  import Y; module Y {
    // ...
  }
}
and the problem explodes from there for each descendant submodule.
In favor of "yes" is that the usage of modules would be more consistent, but I think it'll just be annoying / a usability issue.
Also minor note: renaming via import M as OtherModule would work whether import becomes a requirement or not. I don't see that as a valid argument for "yes".
> Also minor note: renaming via import M as OtherModule would work whether import becomes a requirement or not. I don't see that as a valid argument for "yes".
If M is always available as in M.someFunction() then how can import M as OtherModule; prevent collisions with another variable/module called M? M will still refer to the submodule, it's just that now OtherModule is another name for M.
By contrast, if the import is required, OtherModule would be the only name for it after that import M as OtherModule; statement.
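To make the collision argument concrete, here is a toy model of the two rules, written in Python only so it can be run. It is not how the Chapel compiler resolves names: the Scope class and the declare_and_import helper are hypothetical, invented for this sketch, and it simply assumes that the import can locate the submodule under either rule.

```python
class Scope:
    """Toy lexical scope: local bindings plus a link to the enclosing scope."""

    def __init__(self, parent=None):
        self.bindings = {}
        self.parent = parent

    def lookup(self, name):
        # Walk outward through enclosing scopes, innermost first.
        scope = self
        while scope is not None:
            if name in scope.bindings:
                return scope.bindings[name]
            scope = scope.parent
        return None


def declare_and_import(import_required):
    """Model a parent scope that declares submodule M and then runs
    `import M as OtherModule;`.

    import_required=True  is the "yes" rule: the declaration binds no name.
    import_required=False is the "no" rule: the declaration itself binds M.
    """
    parent = Scope()
    submodule = "submodule M"  # stand-in for the module object
    if not import_required:
        parent.bindings["M"] = submodule
    # Assumption: the import can find the submodule under either rule.
    parent.bindings["OtherModule"] = submodule
    return parent


yes_rule = declare_and_import(import_required=True)
no_rule = declare_and_import(import_required=False)

print(yes_rule.lookup("M"))            # None: only OtherModule names the submodule
print(yes_rule.lookup("OtherModule"))  # submodule M
print(no_rule.lookup("M"))             # submodule M: the name M is still taken
```

Under the "no" rule the name M stays bound to the submodule, so renaming on import cannot free it up for another M; under the "yes" rule the rename is the only binding.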
Ah thanks. I didn't understand what argument you were making. That's a good point, though I agree with you that it's somewhat a weak argument as the authors of these conflicting modules are usually the same person/team.
As a point of clarification, it wasn't entirely clear to me in the OP what the "yes" and "no" positions referred to since the original question was phrased as "Should we do a, or b?" (and in fact, I think they are backwards from my assumption which is that "yes" would mean "we should do a").
Starting with the "no" arguments:
> because presumably the module author is also the author of submodules
I don't know if I buy this argument. If a sub-module can be defined in another file as we hope, it seems just as likely that I would pull in a non-inline (sub)module that someone else had written as write it myself (?).
> It would feel strange for the M using things in L to require a use but not for L using things in M to do so
Is that strange? Can't L see things from M simply through lexical scoping?
To me, the main argument in favor of "no" (assuming it means "you can't simply refer to the submodule") is that by inspecting the source code, it seems apparent that the symbol in question is visible and in scope, so it seems a little artificial not to be able to refer to it. I think of var x as introducing a new variable symbol 'x' that I can refer to if it's in scope; of proc foo() as introducing a new procedure symbol 'foo' that I can refer to if it's in scope; and (traditionally) of module M as introducing a new module symbol 'M' that I can refer to if it's in scope. So to choose the "yes" path makes module statements seem non-orthogonal to other symbol-declaration statements in this regard. (I think that this observation is similar to Engin's about orthogonality with enums above. The enum color { ... } statement gives me a new symbol named color that I can refer to directly if it's in scope without any additional work).
> It works because M.chpl makes an implicit module M, which contains L and N. Therefore N can see L through its parent module. I think new Chapel programmers regularly don't understand the implicit file-level module.
I hate to say it, but it sounds like you still don't have this rule right. :) The compiler only inserts an implicit module in the case that a file contains module-level statements (e.g., any statements other than module declarations and comments) at file scope. Such statements have to belong to _some_ module, so the compiler treats the file as their module (which also supports conveniences like being able to write an entire program as file-scope code as in scripting languages). If the file only contains module declarations and comments at file scope, no implicit module is inserted. So code like your example above:
M.chpl
module L { }
module N { }
results in a module namespace hierarchy like this:
./        # global scope
  L/
  N/
whereas if you were to write:
M.chpl
writeln("Hi");
module L { }
module N { }
then you'd get the module hierarchy:
./        # global scope
  M/
    L/
    N/
That said, I think your argument here probably still stands in that the reason your program wouldn't work if the modules were in separate files is that there's nothing in your definition of N or chpl command-line arguments suggesting that a module L is required and should be searched for. When they're in the same file, L is known simply by virtue of the fact that you asked the compiler to compile a module named L by handing it its source code (and doing so puts it into the global scope as illustrated above, so N can see it, as with any other top-level module).
[For historical context: The most traditional point of confusion w.r.t. implicit modules was that users coming from C tended to equate them with #include statements, so would put use statements at file scope like this:
use M;
module R {
  writeln("In R");
}
and then be confused that R wasn't a top-level module. We put in a warning for this case based on user feedback similar to (and prior to) PR #6813 that Michael refers to above. I feel like the amount of confusion around implicit modules has dropped significantly since these changes; i.e., I'm not aware of cases where additional warnings like this would prevent confusion, though I could also imagine that any program that contains both top-level code and a module statement might be worthy of an informative note saying "the compiler inserted an implicit module M to represent your file-scope code for file M.chpl". Style-wise, I'd say that code like this really should include an explicit module rather than relying on a file-scope module, similar to the approach I believe we've taken for the modules/ directory.]
On the "yes" side, you say:
> it would make use and import actually the only things that interact directly with module names
Maybe I'm not understanding what you mean by "interact directly with", but if I wrote:
use M except *;
M.foo();
it seems to me as though M.foo() is interacting with the module name (?).
> As a point of clarification, it wasn't entirely clear to me in the OP what the "yes" and "no" positions referred to since the original question was phrased as "Should we do a, or b?" (and in fact, I think they are backwards from my assumption which is that "yes" would mean "we should do a").
Sorry that was confusing, "yes" and "no" were answering the question in the title. I've updated it to hopefully be clearer.
> To me, the main argument in favor of "no" (assuming it means "you can't simply refer to the submodule")
Now I got confused by the parenthetical... But, I updated the "no" section with your argument.
> I hate to say it, but it sounds like you still don't have this rule right. :)
Thanks for correcting me :) I think this part of the issue has more to do with #13523 and so I've moved further discussion of that example to https://github.com/chapel-lang/chapel/issues/13523#issuecomment-521199791 .
> Maybe I'm not understanding what you mean by "interact directly with", but if I wrote:
> use M except *;
> M.foo();
> it seems to me as though M.foo() is interacting with the module name (?).
Right, but I'm saying the M in M.foo refers to a symbol that is created by the use statement. In other words, my viewpoint is that use creates local symbols for the imported things. (I know it is not implemented that way, but I think that's a reasonable conceptual viewpoint).
Note that if we did my proposal in https://github.com/chapel-lang/chapel/issues/13119#issuecomment-520984841 this code would have to be
import M;
M.foo();
since use would not actually create a local symbol called M.
Put another way, the reason that M.foo() works is that import M was present. This gives import M the opportunity to control the visibility of M if this code is in a module used by other code.
Along these lines, one reason it is appealing to require submodules be import/used has to do with module name collisions and was discussed above in https://github.com/chapel-lang/chapel/issues/13536#issuecomment-516934959 . The point is that if the submodules are automatically visible (the "no" case above) then programmers won't be able to resolve naming conflicts with e.g. import L as MyL. Because even if they did, that submodule will always also still be visible as L.
I was curious about what Python does here. As I understand it, in Python, you can (only) declare a submodule as something within a package. And, once you do that, the submodule is only visible within the parent module/package if there is an import:
spam/foo.py
def Foo():
    print("In Foo")
spam/__init__.py
foo.Foo()
$ python
Python 3.7.3 (default, Apr 3 2019, 05:39:12)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import spam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mppf/pythonex/spam/__init__.py", line 1, in <module>
    foo.Foo()
NameError: name 'foo' is not defined
This can be fixed by changing spam/__init__.py to either
import spam.foo
foo.Foo()
(using the fact that package paths are not relative by default) or
from . import foo
foo.Foo()
using an explicit relative import.
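For anyone who wants to reproduce this without creating the files by hand, here is a self-contained sketch. It assumes Python 3. The try_import and forget_spam helpers and the result variable are made up for this illustration, and Foo returns its string instead of printing it so that the outcome of each variant can be inspected.

```python
import importlib
import os
import sys
import tempfile


def forget_spam():
    # Drop cached copies so every variant is imported from scratch.
    for name in list(sys.modules):
        if name == "spam" or name.startswith("spam."):
            del sys.modules[name]


def try_import(init_source):
    """Create a throwaway package `spam` whose __init__.py is init_source,
    import it, and return either its `result` or the NameError text."""
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "spam")
        os.mkdir(pkg)
        with open(os.path.join(pkg, "foo.py"), "w") as f:
            f.write('def Foo():\n    return "In Foo"\n')
        with open(os.path.join(pkg, "__init__.py"), "w") as f:
            f.write(init_source)
        sys.path.insert(0, root)
        forget_spam()
        importlib.invalidate_caches()
        try:
            return importlib.import_module("spam").result
        except NameError as err:
            return "NameError: " + str(err)
        finally:
            sys.path.remove(root)
            forget_spam()


# The three variants of spam/__init__.py discussed above.
no_import = try_import("result = foo.Foo()\n")
absolute = try_import("import spam.foo\nresult = foo.Foo()\n")
relative = try_import("from . import foo\nresult = foo.Foo()\n")

print(no_import)
print(absolute)
print(relative)
```

The first variant should report the NameError, while both the absolute and the explicit relative import make foo usable inside the package's own __init__.py.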
The Python example makes me wonder if there is a connection between the choice in this issue and whether or not imports are relative or absolute by default... but I don't see a tight connection between these two.
> I'm saying the M in M.foo refers to a symbol that is created by the use statement. In other words, my viewpoint is that use creates local symbols for the imported things.
I've been mentally playing with this idea (as suggested by my latest comment on #13523) and am trying to get my head around what its implications would be. As a result, this comment is going to span a few of these related issues a bit (but I'm putting it here since this is where I first grokked Michael's idea). Starting with the case of sibling modules from #13523:
module M {
  proc foo() { }
}
module N {
  M.foo();
}
Traditionally, we've said that N could refer to M by virtue of the fact that the module M declaration is visible in its lexical scope. Here, I think you're saying that only use statements (and import statements if/when they come along) can refer to module X symbols via normal scoping rules, so M is essentially invisible to N until we add a use M; (arguably equivalent to use M as M;) within module N's declaration:
module M {
  proc foo() { }
}
module N {
  use M /* as M */;
  M.foo();
}
at which point the M in M.foo() refers to the M introduced by the use statement which in turn can access the M in module M. This essentially gives us rule 1 in issue #13523. Conversely, leaving the use M; off makes the M.foo(); fail because we don't find any traditional symbols (records, classes, enums, procedures, etc.) named M in the lexical scope and can't refer to the M introduced by module M directly.
By analogy, this suggests that submodules could not refer to their siblings without a use:
module M {
  module S1 {
    proc foo() { }
  }
  module S2 {
    S1.foo(); // requires a `use S1 /* as S1 */;` in order to be resolvable
  }
}
(where today they can by similar rules: "looking up in my scope, I see module S1 so can refer to S1").
Presumably, we'd make a special case for a module's contents being able to refer to itself (?). That is, the following would be legal:
module M {
  proc foo() { ... }
  M.foo();
}
even though I haven't done a use M within module M? Maybe we could think of the declaration module M { ... } as implying an automatic self-use? That is, it's equivalent to:
module M {
  use M /* as M */;
  ...
}
which also arguably rationalizes why a sub-module could refer to its parent module's symbols:
module M {
  module S {
    M.foo(); // OK due to implicit `use M;` within `module M`
  }
  proc foo() { ... }
}
I think this pattern suggests to me that using a sub-module's symbols would also require a use statement since the code could not refer to the sub-module's symbol without one. Thus:
module M {
  // implicit `use M as M;` here
  module S {
    // implicit `use S as S;` here
    proc foo() { ... }
  }
  S.foo(); // I can see `module S` but that isn't good enough to let me refer to `S`, so error
}
and:
module M {
  // implicit `use M as M;` here
  module S {
    // implicit `use S as S;` here
    proc foo() { ... }
  }
  use S;
  S.foo(); // I can see the `S` introduced by `use S` which in turn refers to `module S` so this works
}
So that suggests to me taking the "yes" approach for consistency. And I do think that it helps with the "submodules may be defined in other files and be invisible-ish" problem (more on that in a sec).
Then again, as Bryant notes, it may get really annoying and really old fast.
If we went with this philosophy (nothing can refer to a module's symbols other than use/import) and I had to decide today, I'd choose "yes", knowing that I could relax to "no" later if/when I had more experience with it and was being driven out of my mind (which might be as soon as I had to update all of the existing tests?)
A somewhat middle ground solution (that might be too weird to argue passionately for) might be to say that the implicit use S that appears within any module declaration goes not right after the module S declaration itself, but after the outermost module declaration within the same file.
Thus, given:
M.chpl:
module M {
  module S {
  }
  // auto-inject-files-from-the-magical-directory;
}
and N.chpl (which happens to be in the magical directory):
module N {
}
would turn into:
module M {
  // implicit `use M as M;` -- put here since it's the topmost SW module in M.chpl where M was defined
  // implicit `use S as S except *;` -- put here since it's the topmost SW module in M.chpl where S was defined; the `except *` is necessary to avoid making all of S's contents directly available.
  module S {
  }
  module N {
    // implicit `use N as N;` -- put here since it was the topmost SW module in N.chpl where N was defined
  }
}
Thus, within M's scope, we could refer to symbols within M directly and to M and S, but we would have to use N in order to refer to N's symbols. This would also be a good way to prevent hijacking and surprises by magically using files from a directory whose contents you weren't really all that familiar with (as would the straight "yes" approach, obviously).
> If we went with this philosophy (nothing can refer to a module's symbols other than use/import) and I had to decide today, I'd choose "yes", knowing that I could relax to "no" later if/when I had more experience with it and was being driven out of my mind (which might be as soon as I had to update all of the existing tests?)
I'd be okay with this approach. I don't mind a more restrictive implementation that leaves room to relax later. Once you break submodules into separate files, the consistency makes sense, if only to make certain to the compiler which modules or submodules it should look to use. In fact...
> Thus, within M's scope, we could refer to symbols within M directly and to M and S, but we would have to use N in order to refer to N's symbols. This would also be a good way to prevent hijacking and surprises by magically using files from a directory whose contents you weren't really all that familiar with (as would the straight "yes" approach, obviously).
This observation is incredibly relevant to https://github.com/chapel-lang/chapel/issues/13524#issuecomment-522979924. If we "inject-known-dir", one way to write that statement (instead of include or inject) is to simply state module N; to denote that N exists and the compiler should look for it. But if the user is forced to still use N; to access N's symbols, how is that any different? This is the crux of the argument for one of the proposals in Rust's Revisiting modules, take 3 from my collection of links in https://github.com/chapel-lang/chapel/issues/12923#issuecomment-495453450. I highly recommend reviewing this blog post in its entirety, but for brevity, the relevant section is repeated here:
> Modules
> Mod statements would no longer be necessary to pick up a file [sic] a new file in the crate. Instead, rustc would walk the files it knows to walk (see next section for more info), and mount a module tree from that, possibly before parsing any Rust code.
> Files mounted this way would have a pub(crate) visibility, if you wish to change that publicity, add an export statement to their parent.
> pub export submodule1;
> pub(self) export submodule2; // if you are very concerned about using something
>                              // from this submodule elsewhere in the crate
> Though the names of modules are mounted automatically, they are not imported into their parent, and so they are not visible to relative paths from their parent unless they are imported with use or export. That is, you cannot use the name of a submodule without somehow bringing it into scope through use or export statements.
> Modules of the form mod foo { /* code */ } would still exist, with no change to their semantics. [for backwards compatibility]
@BryantLam - interesting point. Another way to say it is -
> If we went with this philosophy (nothing can refer to a module's symbols other than use/import)
We are basically saying that use and import have a special capability to find other modules, which possibly involves finding them in the filesystem. Given that, why would it be surprising for a module M that has a use L to look for L.chpl in M/L.chpl, and also in other module path locations (e.g. modules/standard/L.chpl etc)?
I view the examples brought up in #11262 in this comment as evidence that requiring a use for submodules is a good idea.
module M {
  module M {
    proc whatev() {
      writeln("whee");
    }
  }
}
use M only M;
M.whatev(); // currently, this M refers to the outermost sub-module M above
I think it's confusing that the use M has no effect on what M.whatev() means.
> I think it's confusing that the use M has no effect on what M.whatev() means.
But this behavior is consistent with that of the following code:
class Foo {
  proc type whatev() {
    writeln("In class Foo's whatev");
  }
}
module Bar {
  module Foo {
    proc whatev() {
      writeln("In module Foo's whatev");
    }
  }
}
use Bar only Foo;
Foo.whatev(); // prints "In class Foo's whatev"
An import Bar.Foo; would have the same impact. You're not going to solve this problem by requiring the submodule is used because the symbols defined at scope are still closer to Foo.whatev than the symbols brought in by the use or import statement.
I think what will really help is allowing us to rename modules when they are in a use or import statement
> > I think it's confusing that the use M has no effect on what M.whatev() means.
> But this behavior is consistent with that of the following code:
I don't think that use or modules have to behave the same as classes. I don't find it surprising that your example would call the class method.
> You're not going to solve this problem by requiring the submodule is used because the symbols defined at scope are still closer to Foo.whatev than the symbols brought in by the use or import statement.
Suppose that we required submodules be used in the original example and we made use no longer expose the module name itself for qualified access (which is proposed in #13978). Then the example would work:
module M {
  module M {
    proc whatev() {
      writeln("whee");
    }
  }
}
// M not in scope after the above because submodules must be `use`d
use M only M; // outer M not in scope after this because use does not bring in module symbol
M.whatev(); // here, M can refer only to the inner module
Even if we decided to go a different way on #13978, we could still get this example to work if the submodules had to be used. We would just need to control whether the module itself from a use or the module's contents brought in by the use were considered nearer in scope. At the very least I would expect an ambiguity error if we did nothing else there.
> I don't think that use or modules have to behave the same as classes. I don't find it surprising that your example would call the class method.
The point I'm trying to make is that there being consistency between modules and classes in this way is a way to explain away what you found confusing. Having them behave differently will cause confusion for people that think of classes as basically modules that you can make instances of (or modules as basically singleton classes, take your pick). I think that viewpoint is useful and something we should avoid breaking which is part of why I object to making module names be treated differently from other symbols defined in the same scope. To convince me otherwise, you're going to have to argue why that paradigm is not valuable.
A long time ago I proposed that modules and classes be more similar and mixable, so that you could say use a class or a module inside another class as a mix-in. Or new a module. But these ideas in practice were rejected pretty quickly as being too crazy / unrelated to Chapel's main goals of parallelism.
So in the current language, I don't think we have any obligation at all to make modules and classes behave the same. They don't behave the same now. You can't (and as far as I know, never will be able to) use a class at all; you can't inherit from a module in a class or a module; you can't new a module.
I do think that your argument is reasonable. I would rephrase it as this:
However, I don't agree with this argument. I think that, in the current situation, the scoping rules for modules specifically are already too confusing, and that making the scoping for modules have fewer cases (e.g. module names are only made available by import) will reduce the level of confusion. I think this is a worthwhile trade, even if modules become less similar to classes in scoping rules.
However this is a judgement call (we are deciding - which is more confusing?) and I suspect we aren't going to convince each other at all.
For my part, I would be more convinced by your argument if you could use a class; or if the typical use-case for modules did not involve use (or in the future, import). But I don't see either of those things changing.
As the language stands now, if we continue to prefer to "keep module scoping like class scoping" rather than "simplify module scoping with use/import statements" then I think we are building the language around what is more of an unusual/corner case for modules - when there are submodules that aren't used/imported - rather than the common case - in which modules are in different files and are used/imported.
I've lost track... If we were to take a quick straw poll on this issue, where are people currently falling? Please give one of the following a thumbs-up.
Upvote this comment if you think that one shouldn't be able to refer to a submodule's name without a use of it first.
Upvote this comment if you think that one should be able to refer to a submodule's name if simply visible via lexical scoping / without first having a use of it.
Upvote this comment if you're still undecided.
I've started on a branch to implement this, just to see what kinds of impacts it has.
I voted for allowing references to a submodule's name when it is visible, because that is consistent with what we do with other things, like variable and function names.
Analogously, it should be legal for a submodule to refer to symbols in the enclosing scopes, to match the behavior of other kinds of nested scopes.
In the following code, I'd draw a parallel between x being a variable vs. a submodule. Another parallel is between the {...} being a nested scope of a conditional vs. that of a module.
var x = 1;
writeln(x); // no need to 'use x'
if ... {
  writeln(x); // x comes from the enclosing scope
}
@vasslitvinov - I am curious if you'd apply the same argument to #13523, e.g. if the below is in one file:
module L {
  proc lFunction() { }
}
module N {
  proc main() {
    L.lFunction(); // should this work?
  }
}
then I would imagine the same argument would argue that "L is in scope". Put another way, the parallels-with-other-declarations argument does not apply for the above case, so why should it apply for submodules?
My view on this matter is that modules are fundamentally different from other symbols in terms of scoping (see https://github.com/chapel-lang/chapel/issues/13536#issuecomment-528850398 for some elaboration on that) and that it's more of a benefit to usability to limit the ways a module's symbol name is in scope than it is to make it similar to the other cases.
@mppf either I am not following all the intricacies, or I am just outright not convinced.
In the example of two modules side by side:
module L { ... }
module N {
  // is L visible here?
}
I propose to apply my parallel-to-var-decl principle -- so yes, L should be visible from within N. (Still, we need to use L to reference L's symbols without qualification.)
To me this offers clean and simple semantics.
Remark: in this setup L and N are always submodules, either of an explicit enclosing module (if present) or of an implicit one (since they are in the same file).
Remark: inability to use a class is not a strong argument. We already allow use of an enum, to everyone's joy. If we see an important scenario where use of a class or a record helps, I think we will allow that as well.
> Remark: in this setup L and N are always submodules, either of an explicit enclosing module (if present) or of an implicit one (since they are in the same file).
I believe this is only correct if there are other symbols at the top level besides module definitions (I played with this recently for other reasons). When only modules are defined at the top level, there is no overarching file module scope inserted.
> I propose to apply my parallel-to-var-decl principle -- so yes, L should be visible from within N. (Still, we need to use L to reference L's symbols without qualification.)
Vass, I think Michael's point is that his example is precisely what we decided to stop supporting in issue #13523 and PR #13930 because it was too fragile and fraught with confusion (note that Michael's example only uses top-level modules, not sub-modules). So then the natural question is "given the decision in #13523, why should sub-modules be different?" (I think there could be an argument there for why they should be different, but it may need to be something different than "you can see the module name looking upwards in its lexical scope" since that rule arguably applies to top-level modules as well).
> I believe this is only correct if there are other symbols at the top level besides module definitions (I played with this recently for other reasons). When only modules are defined at the top level, there is no overarching file module scope inserted.
Lydia's correct in this. I missed that you assumed that there was a parent module to L and N.
I did not realize there was no implicit module above L+N, as Lydia describes. At least I am upholding my view for submodules.
For top-level L and N, the case when they are in separate files is different IMO than when they are in the same file. I suspect much of the motivation in #13523 does not apply when the modules are in the same file. In any case, we are not revisiting #13523 here.
So, I will rephrase the question "why should sub-modules be different?" to "why should top-level modules be different?" My answer being "because top-level modules can be in different files".
Maybe when sub-modules start being in different files, maybe they, too, will need to be use-d or import-ed when they are not in the same file. That remains to be seen.
> Maybe when sub-modules start being in different files, maybe they, too, will need to be use-d or import-ed when they are not in the same file.
In fact that is part of the reason that this issue exists. It was in some ways spun off of #13524 which proposes submodules in different files. I think we should make the decision about this issue knowing that we expect to support submodules in different files.
> That remains to be seen.
I'm not sure what more information we'd need around the submodules-in-different-files proposal to answer this question (the main part, IMO, is that such functionality is likely to be added). What information would you want to know about submodules-in-different-files to inform this issue specifically?
I think you are proposing that top-level modules / sub-modules in different files have different behavior than if they are in a single file. I do not think that is a good idea, since moving a module/submodule out to a different file shouldn't change the way the program behaves (it should be possible to do simply as a code reorganization).
> I think you are proposing that top-level modules / sub-modules in different files have different behavior than if they are in a single file. I do not think that is a good idea, since moving a module/submodule out to a different file shouldn't change the way the program behaves (it should be possible to do simply as a code reorganization).
I agree with this and almost said so yesterday before waiting to see how the conversation went on its own. I think that a pile of code should generally behave the same regardless of its organization into files. That said, I'm also open to exceptions to that rule for convenience, such as the use of a filename as an implicit module name when a file contains file-scope code other than module declarations.
To motivate my approach, think of use MyModule only; as the request "go look in another file if MyModule is not already visible". If MyModule is already visible using normal scoping rules, that's a no-op.
Likewise, if a submodule is already visible, use-ing it is a no-op.
> moving a module/submodule out to a different file shouldn't change the way the program behaves
Sure. However, adding a little use MySubmodule; when cutting and pasting the actual text seems benign to me. Some of the proposals in #13524 include such a requirement already, just using different syntax. Even for a top-level module, requiring the addition of use MyModule; when moving parts of a program to another file is fair game.
When I see something like:
var x ...;
module Y {...}
it does not make sense that I have to use one and not the other. It makes sense to add an adjustment when splitting these into separate files. This is fundamentally where I come from.
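A minimal sketch of the situation described here, with hypothetical names (`M`, `yFunction`) filled in for illustration: a variable and a submodule declared side by side in the same scope, both referenced directly by name under the rules as they stood when this was written.

```chapel
module M {
  var x = 1;

  module Y {
    proc yFunction() { return 2; }  // hypothetical helper
  }

  // Both x and Y are declared in M's scope. Under the current rules,
  // neither needs a use or import before being referenced here.
  writeln(x);
  writeln(Y.yFunction());
}
```

The proposal under discussion would require a use or import of Y before the second reference, while leaving the reference to x unchanged.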
You lose me at "in another file" because I don't think language semantics should be so deeply related to the code's organization into files.
> it does not make sense that I have to use one and not the other.
I think it does make sense and wouldn't seem as surprising if we'd been living with these rules for years. I think it takes some getting used to given our history with the language because it fundamentally says "module symbols are treated differently than other symbols." But I think that that's OK.
The bug report in #11868 is related. Of course we could seek to solve it another way, but requiring a use/import of the submodule would fix the bug. It's not the strongest reason but it does suggest that requiring a use for submodules might help the implementation effort.
Note, we don't expect to give the same treatment to parent modules, so parent modules won't need to be used/imported, so the "import is the only thing interacting with module names" Pro does not apply.
PR #15278 takes a quick stab at this if anyone wants to see the impact, think about it, weigh the tradeoffs, etc. I haven't put in the effort to get all tests working, but most of the tests that aren't specifically wrestling with corner cases of visibility and scoping now work.
I'm currently leaning against requiring submodules to be used/imported in order to access their names. It feels unnecessarily laborious, and there will always be something within the parent module to indicate the submodule's name and that it is there (either its declaration or a statement that brings it in from a file). So adding the requirement that it also be used/imported feels like busywork to me at present.
Today we effectively decided not to pursue this.
Most helpful comment
Upvote this comment if you think that one shouldn't be able to refer to a submodule's name without a use of it first.