Julia: relative using/import should search current directory

Created on 21 Oct 2013 · 63 Comments · Source: JuliaLang/julia

When I do

# Main.jl
module Main
  using .Foo
end

and there's a file called Foo.jl in the same directory as Main.jl, it should be loaded. I suspect that relative using should also _not_ look in the global require places – i.e. Pkg.dir() and then LOAD_PATH. The same applies to import.

Labels: design, help wanted, modules

All 63 comments

I agree it should not look in the global places, but perhaps it should not look at files at all. This doesn't seem to extend to multiple dots, e.g. using ..Foo looking in the parent directory.

Oops.

I would love to see a solution to this problem... But Jeff has a point.

It would be a little weird, but what if using took a second, optional parameter:

using Foo "../../extra-src/"

+1 for this.

I'm not sure the relative path thing is a problem. You could, for example, have something like

# Foo.jl
module Foo
  using ..Bar
  using .Baz
end
# Bar.jl
module Bar
  # barfy stuff
end
# Foo/Baz.jl
module Baz
  # bazish stuff
end

That would allow relative imports of sibling modules to automatically work. Sure, it's strange if you do more dots than you're nested into modules, but just don't do that.

Why not just allow using with a string?
Such as: using "Baz/Baz.jl".

Because loading from a file is only a side effect of using when no such module exists. The normal case is that the module already exists. If you allow a string, you still have to map those strings to a module.

But what if the string, which is a path, identifies a module, as it does by virtue of its structure?

Ok. Let me try to get this straight.

I do not have an opinion about this "import/include/using" oddity.

Don't you think that sooner or later, the Python import strategy will turn out to be best?
You made it possible to have an __init__ function within a module which is called automatically, but not first, which is quite strange, since I expected it to be called first, just like a BEGIN block in Perl.

I ran into exactly this problem: I arranged my code as a directory of files and tried to push! the load path inside the init function. But it is not evaluated first, which was quite confusing to me, since I expected it to act like a constructor or something.

Let's just presume I know what using does semantically. Why not use using to publish several declarations inside the current namespace but give it the ability to assign some prefix for it?

I really liked the import idea of Java. Where dots marked a directory.

For instance:

using Baz.baz

or

using Baz.baz as bz

Does this make sense to you?

Cheers

Stefan

It's a little frustrating to have #12695 merged in 0.4 but this slated for 0.5... I feel like it's going to bite people in 0.4 if there is no way to load modules from the current directory short of modifying the load path.

We seem to have been getting by pretty well without that --- outside of the REPL, loading from the CWD is just a bug, and I doubt any packages depended on it. In the REPL, include is probably sufficient.

I used to structure my code into submodules, with each file representing a module; back when the CWD was included in the load path, this allowed me to use, for instance, using Utils to load types and functions exported from Utils.jl. I can now replace this with include("Utils.jl"); using .Utils; however, this is inconvenient, e.g. if Utils defines types, because creating this type from module A would create an A.Utils.Type instead of a Utils.Type. What is the recommended way of organizing Julia code (with common functions and types) into subfiles? Should I add the current directory to the path anyway to get the convenience of modules? Thanks.

I have hit the same problem as @bermanmaxim FWIW, and I've moved to just includeing everything instead

Thanks @IainNZ. Using includes seems indeed to be the standard way now. I guess it's the job of the main file of a module to include everything in the right order to make the subparts work (defining types before functions...) Using distinct modules had the advantage of making the dependencies of each file somewhat more explicit, e.g. putting utils.helperfunction to make clear that the function comes from utils, and not risking including things twice.

You can use includes in the main file and still structure your code into submodules. That's what I do in Debug.jl.

@bermanmaxim Not sure I understand the problem, everything seems to work how I'd expect:

module Parent
    export ParentT, ChildT

    module Child
        export ChildT
        type ChildT
        end
    end
    using .Child

    type ParentT
    end

end

module Test
    using Parent

    f(::ParentT)="parent"
    f(::ChildT)="child"
end

Test.f(Parent.ParentT()) # "parent"
Test.f(Parent.ChildT()) # "child"

Oh I see, it's this that's problematic:

A.jl:

module A
type Atype
end
end

B.jl:

module B
include("A.jl")
import .A: Atype
end

MyPkg.jl:

module MyPkg
include("A.jl")
include("B.jl")
end

MyPkg.B.Atype()  # MyPkg.B.A.Atype
MyPkg.A.Atype()  # MyPkg.A.Atype

You might hope to get around this by only includeing from the parent module:

module MyPkg
include("A.jl")
include("B.jl")
end

where B.jl is now just

module B
import ..A: Atype  # two dots: A is a sibling of B inside MyPkg
end

so now

MyPkg.B.Atype() # MyPkg.A.Atype

as you want, _but_ you're back to being reliant on the package entrypoint to manually take into account submodule dependencies:

MyPkg.jl:

module MyPkg
include("B.jl")
include("A.jl")
end

won't work.
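The type-identity problem described above can be reproduced without any files. This is only a sketch: the nested module definitions stand in for the include("A.jl") calls, B2 is a hypothetical variant of B added for comparison, and struct is used where the thread's older code says type.

```julia
module MyPkg
    module A                  # stands in for include("A.jl") in MyPkg.jl
        struct Atype end
    end

    module B
        module A              # a second include("A.jl") inside B creates a new module
            struct Atype end
        end
        import .A: Atype      # binds B.Atype to MyPkg.B.A.Atype
    end

    module B2                 # hypothetical variant of B without its own include
        import ..A: Atype     # binds B2.Atype to MyPkg.A.Atype
    end
end

println(MyPkg.B.Atype === MyPkg.A.Atype)    # false: two distinct types
println(MyPkg.B2.Atype === MyPkg.A.Atype)   # true: one shared type
```

The second form only works if A has been defined before B2 is evaluated, which is exactly the ordering burden on the entry point mentioned above.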

Thanks @malmaud, I have since followed @toivoh's advice and develop code in a structure similar to Debug.jl.

The loss of having the current path in the LOAD_PATH list is distressing to me. I find myself having to add a statement of the form,

push!(LOAD_PATH, pwd())

to all my high-level scripts in order to get anything to work. In particular, the solution

include("A.jl")
using A

does not appear to work because it does not nest properly, i.e.
file B.jl

module B
include("A.jl")
using A
func2() = ("function 2", func1())

end

file A.jl

module A
export func1
func1() = "func 1"


end

If you try to include("B.jl") you get the error:
ERROR: LoadError: ArgumentError: A not found in path
in require at ./loading.jl:233
in include at ./boot.jl:261
in include_from_node1 at ./loading.jl:304
while loading /data/Projects/Energous/B.jl, in expression starting on line 3

However in a julia prompt you can type the contents of B.jl line by line without error if you don't include the module definition.

You might reasonably want to use module A without module B. However if B requires A you will be unable to use B unless you first include A at the highest level. So this means that you have to remember to include the text of every dependent module before you can use it, if they are all in the same working directory.

For my use case this FORCES me to explicitly add the current directory to the path for every script I run in my working directory.

@mattcbro, you should do

include("A.jl")
using .A

to tell it you want the locally defined A module, not a global A module. This will work and does not require you to modify the LOAD_PATH.

using A is potentially wrong anyway because it could get confused if there is another module called A defined in the load path. So, your experience is actually an argument in _favor_ of the current behavior, because it caught a bug that you otherwise might not have noticed.

That being said, I still tend to agree with @StefanKarpinski that using .A (_not_ using A) should look for A.jl in the current directory; it's annoying to have to manually include("A.jl"), though it's not a huge deal.

(If A.jl is in some other directory, of course, then you need the manual include.)
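As a self-contained sketch of the suggestion above, the nested module A below stands in for include("A.jl"), so the example runs without any files on disk. The names B, A, func1 and func2 are taken from the earlier comment.

```julia
module B
    module A                  # stands in for include("A.jl")
        export func1
        func1() = "func 1"
    end
    using .A                  # leading dot: the A defined inside B, not one on LOAD_PATH
    func2() = ("function 2", func1())
end

println(B.func2())            # ("function 2", "func 1")
```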

Is there a technical problem with having using .A search the current directory, or is it just a design decision at this point? I would definitely favor having that behavior.

At this point I think it's just a design issue. The fact that you can load code from a parent directory with multiple leading dots is kind of strange. To me there's also the question of whether using .B occurring in module A should load B.jl in the current directory or load A/B.jl. The former would tend to keep directory structures pretty flat, which may be a good thing, while the latter would tend to make them more nested. While I generally favor flatter directory structures (consider how ridiculous Java project file trees are), this would seem to tend to put everything in the top-level directory:

# A.jl
module A
    using .B
    using .C
end
# B.jl
module B
    using .D
    using .E
end

And so on – all of A.jl, B.jl, C.jl, D.jl and E.jl would be in the top-level directory, even though it seems like maybe B and C belong in an A directory and maybe D and E belong in a B directory. Moreover, if you _want_ to have a nested directory structure, how would you even express that?

It looks like it would be pretty easy to implement: modify the eval_import_path_ function in src/toplevel.c to add an else if (m == jl_current_module) clause after the if (m == jl_main_module), which looks for a var.jl file in the current directory.

@StefanKarpinski, I thought that the proposal was that using .B would look in the directory of the file that the using statement occurs in (or pwd in the REPL). That's what most people would think of as the "current" directory, and is the same as the directory used for include("B.jl").

If you _want_ a nested directory, or any other directory structure, you would just do include("B/B.jl"); using .B manually as you do now. Doing using .B would only look for a B.jl file _if_ B were not already defined.

That was my original proposal, but I'm wondering how one would introduce a nested folder structure using this mechanism? It seems to me that there wouldn't be any way to do it. One option would be to have module A; using .A.B; end be special syntax that loads "A/B.jl". That would allow having parts of A defined in a directory. Maybe I'm overthinking this.

@StefanKarpinski, you would introduce a nested folder structure by doing include("B/B.jl"); using .B manually as now; see above. (I edited my post after replying, so maybe you didn't see the 2nd paragraph.)

@stevengj OK, that works, thanks. However, please help me understand: what is the preferred paradigm for creating and using local modules? Do we really have to have both an include() statement and a using or import statement?

Perhaps the idea is to have a master script that has all of your includes in it? How do you folks do this? I notice that one person simply uses includes instead of using or import for their local work.

@mattcbro, yes, you currently need both include and using. You don't need import (doing include effectively also does import).

I mostly just use include and don't bother with submodules. The only reason to use submodules is if you want to segregate your namespace, but in that case I normally don't want to do using (I just want import and qualified names). For example, in the PETSc.jl module we are using a PETSc.C module for the raw wrappers around the low-level C interface to keep these zillions of functions from polluting the PETSc namespace, but then we use the fully qualified names, e.g. we do C.foo(...) to call the foo function. Hence the C module has no exports and we don't need using C.
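A minimal sketch of that pattern follows. Wrapper, C, foo and bar are hypothetical names chosen for illustration; they are not the real PETSc.jl API.

```julia
module Wrapper                # hypothetical stand-in for a package like PETSc
    module C                  # low-level functions live here; nothing is exported
        foo(x) = 2x
    end
    bar(x) = C.foo(x) + 1     # qualified call, so no `using .C` is required
end

println(Wrapper.bar(3))               # 7
println(isdefined(Wrapper, :foo))     # false: foo stays out of Wrapper's namespace
```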

Still having to use include seems to defeat a lot of the purpose of this change.

Would having to have the include also mean you have separate copies of the sub module, instead of a single (possibly pre-compiled) one? If so, that seems like the biggest drawback to me, not the extra typing required.

No, you would only have one copy. The purpose would be to save typing the redundant include if you do using .Foo in the common case where Foo.jl is in the same directory. Saving on typing is the only question in this whole thread — there has been and will be no change in functionality.

Saving typing isn't the only issue to me – I'd like to get to a point where you don't need to use include in normal code. To that end, I'd like for relative using to be _the_ way to decompose a module into files and directories. But maybe we as a project don't want that. We should have a conversation about it that doesn't include lots of ill-informed handwringing by people who've barely used Julia about "modularity".

@StefanKarpinski, I like being able to split a long file into pieces without needing to create a submodule (which forces me to either export things or use qualified names).

Ok, maybe we need some other modularity mechanism then. E.g. something where each file gets its own scope and it's less likely for unexported globals to collide across files.

Well, I don't think anyone's talking about _removing_ include from the language. But it seems bad for it to be a _requirement_ for creating hierarchical packages where each file defines a module, which is a style a lot of people seem to like.

Ok, I agree with that. But @stevengj's point is valid that having a separate module for each file is annoying because of exporting and importing, etc. One idea that was raised in a conversation I had at JuliaCon was that submodules would behave more like nested global scopes instead of independent scopes – i.e. this:

module A
    x = 1
    module B
        # x is visible here
        y = x + 1
    end
    # y is not visible here though
end

This would remove a lot of the annoyance of splitting things into submodules.
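For contrast, here is a sketch of what is needed without such nested scoping: the submodule has to bring the parent's binding in explicitly. This assumes the usual relative-import form using ..A: x, where the two dots refer to the parent of B.

```julia
module A
    x = 1
    module B
        using ..A: x          # without this line, `x` is undefined inside B
        y = x + 1
    end
    # y is not visible here; it has to be written as B.y
end

println(A.B.y)                # 2
```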

I like that, but what if a subfile sets what it thinks is a global variable with a line like X=1, but actually X was previously defined in the including file and has now been clobbered in that outer scope? You would need to get in the habit of defining submodule globals with 'local' or whatever the equivalent keyword would end up being.

Presumably it would work like other scopes and doing X = 1 in the submodule creates a new X binding local to that file.

Maybe I'm confused, but wouldn't it work the same way this works now:

function f()
  x=1
  let 
    x=2
  end
  x
end

julia> f()
2

I was thinking about the "scope gap" between global and function scope:

julia> x = 1
1

julia> function f()
           x = 2
       end
f (generic function with 1 method)

julia> f()
2

julia> x
1

Ah, right. +1 from me on having submodules have that kind of scoping semantics.

Yes, that is more like what I had expected to happen when I first started using Julia.
:+1: to @stefankarpinski's idea

A related question is how to handle defining methods on generic functions defined in the parent module. This won't work right now, but maybe it should?

module A
function f end

module B
f(::Int)=1
end

module C
f(::Float64)=2
end

f(1) 

end
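A sketch of how the same structure can be made to work with explicit imports, assuming the relative form import ..f to make the parent's generic function extensible from each submodule. The result binding is added here only so the outcome can be inspected.

```julia
module A
    function f end            # generic function with no methods yet

    module B
        import ..f            # extend the parent's f rather than defining B.f
        f(::Int) = 1
    end

    module C
        import ..f
        f(::Float64) = 2
    end

    result = f(1)             # dispatches to the method added in B
end

println(A.result)             # 1
println(A.f(2.0))             # 2
```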

@stevengj If I just use includes, does precompiling work? I've just started messing around with precompiling and I'm trying to figure out the best workflow. (Nice addition, by the way.)

Most of the examples I've looked at had the precompile() statement associated with a module.

If it helps to understand, include does nothing more than instruct Julia to copy/paste the code in the file into where the include statement is at runtime. Everything behaves exactly as if you had just copied the code from the file yourself to where the include statement is.
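A small sketch of that behaviour, using a temporary directory so it is self-contained. The file names helpers.jl and Demo.jl are hypothetical.

```julia
dir = mktempdir()

# helpers.jl contains a bare definition with no module wrapper.
write(joinpath(dir, "helpers.jl"), "helper() = 42\n")

# Demo.jl includes helpers.jl from inside a module; the path is resolved
# relative to the including file.
write(joinpath(dir, "Demo.jl"), "module Demo\ninclude(\"helpers.jl\")\nend\n")

include(joinpath(dir, "Demo.jl"))

println(Demo.helper())                # 42: helper was defined inside Demo
println(isdefined(Main, :helper))     # false when run as a fresh top-level script
```

Here helper ends up in Demo, the module that contains the include call, just as if its definition had been typed there.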

Yes, precompiling works with include.

I am just starting to learn Julia, and I quickly hit this issue due to the misleading instructions in Workflow Tips.

I gather from the discussion in this thread that the instructions should read include("./Tmp.jl") rather than import Tmp. Is that correct?

@meowklaski, no need for the ./. But yes, if you have a module Tmp in Tmp.jl in the current directory, then include("Tmp.jl") will also import it. You can also do using Tmp after running include("Tmp.jl") if you want to import the exported names from Tmp.

Based on some in-person brainstorming yesterday, we (mostly with @malmaud, @JeffBezanson) came up with the following scheme. It doesn't only apply to relative using/import, but also doesn't apply to all relative using/import, so it cuts somewhat across what this issue was originally about. However, the end-point is quite similar to when we introduced the top-level code via the LOAD_PATH in effect: it allows one to have code following a certain convention to omit include calls and continue to work the same way, with the includes being implied by using/import statements.

If the module in an import resolves to a name that does not already exist and we are currently in the process of loading a prefix of that module, then if files with appropriate names exist, they will be included, and if that provides the desired modules, they will be used. If we are loading A for example, and encounter using .B or equivalently using A.B within the definition of the A module, then we will look for B.jl relative to the location of the source path of A in two places:

  1. If joinpath(dirname(A_path), "B.jl") exists, load it; otherwise
  2. If joinpath(dirname(A_path), "B", "B.jl") exists, load it; otherwise
  3. Raise an error that B could not be found.

The premise is that a module is either provided by a single file of that name or a directory of that name with the file by that name as an entry-point. This implies a stack of paths to what one is currently loading, and you have to look through the whole stack for the innermost file you are currently loading which is a prefix of what you want to load. Some examples should help clarify:

  • While loading A from src/A.jl and B from src/B.jl:

    • find A.B.C as src/C.jl or src/C/C.jl

  • While loading A from src/A.jl and B from src/B/B.jl:

    • find A.B.C as src/B/C.jl or src/B/C/C.jl

    • find A.D as src/D.jl or src/D/D.jl.

One will note that an absolute import like using A.B inside of module A can trigger this behavior; meanwhile a relative import like using ..B inside of top-level module A will not trigger this behavior, so relative names are not really the significant feature here.

Another thing to observe is that there are many potential file hierarchies for a given module hierarchy. On one hand, that could be a bit confusing, but on the other hand, forcing a deep hierarchy when everything can easily be contained in a few top-level files is quite annoying.
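The two-place lookup in this proposal can be sketched as a plain function. find_submodule_file is a hypothetical helper written for illustration, not an existing Julia function, and it covers only the single step of resolving a name relative to the innermost file being loaded, not the search through the whole stack of paths.

```julia
# Hypothetical helper (not part of Julia): resolve submodule `name` relative to
# the source file of the module that is currently being loaded.
function find_submodule_file(parent_path::AbstractString, name::AbstractString)
    dir = dirname(parent_path)
    candidates = (joinpath(dir, name * ".jl"),          # 1. B.jl next to the parent
                  joinpath(dir, name, name * ".jl"))    # 2. B/B.jl next to the parent
    for candidate in candidates
        isfile(candidate) && return candidate
    end
    error("module $name could not be found relative to $parent_path")   # 3.
end

# Rebuild the second example layout in a temporary directory.
root = mktempdir()
src = joinpath(root, "src")
mkpath(joinpath(src, "B"))
touch(joinpath(src, "A.jl"))
touch(joinpath(src, "B", "B.jl"))
touch(joinpath(src, "B", "C.jl"))

b_path = find_submodule_file(joinpath(src, "A.jl"), "B")   # resolves to src/B/B.jl
c_path = find_submodule_file(b_path, "C")                  # resolves to src/B/C.jl
println(b_path)
println(c_path)
```

Because the single-file candidate is checked first, a flat layout wins whenever both B.jl and B/B.jl exist next to the parent.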

As far as I can tell, this is a non-breaking change so it could be moved off the 1.0 milestone in a pinch. It would be very nice to have for 1.0 however, so I'll leave it here for now.

Resolved: we don't have time for this now and it's a non-breaking feature.

IMO the documentation on modules needs more clarity. It should explain how one normally splits a project into multiple files. Currently there's only a very brief section on "modules and files" and it doesn't explain the issue well at all. I didn't know the correct way to proceed in my project (first include, then using) until I found this issue via Google. The documentation has been excellent but it seems possible for this section to do better. Any real project needs to be split into multiple files and I believe many people would be wondering about this.

@x-ji Agreed, and I think that when this issue is solved, people will be able to use using/import only, since local modules can then be loaded directly through them.

@x-ji, the manual does explain how one normally splits a project.
Most modules need to be split into multiple files, but not multiple submodules. You can just have a single module that includes multiple files. That's why there are no using statements or submodules in that section of the manual.

@stevengj Could you point out where in the documentation https://docs.julialang.org/en/v1/manual/modules/ is it stated how to normally split a project? I don't think it's clearly explained at all, certainly nothing about your suggestion of using only includes. The documentation begins with an example about import, using and export, and only much later does it mention the concept of include. In no way does it make it clear that one is expected to use include as the default way to organize a project. Or are you talking about a completely different section of the manual?

The way you're suggesting, that for most projects one would just define one module and include all the other files in that one module (while paying attention to the include order and avoiding circular dependencies), and completely ignore the using and import mechanisms, is simply unintuitive for people who are used to most other languages. This also makes people feel a bit uneasy about code maintenance in large projects. OK, I can accept it if this is "the Julia way". But at least please state it clearly in the documentation.

Also, after carefully reading through this thread, I finally understand that there are currently different ways to approach project organization: one is to have no submodules at all and use includes only; another is to include and then use using on submodules. However, the documentation is unclear on both of them, which IMO is unfriendly to newcomers. It could point out the different approaches and give some examples.

@x-ji I share your concerns about how Julia projects need to be organised differently from some other popular languages.

For me, having to manually "glue" together files in a package using include() in the package entry script was the most confusing part. In some other languages such as Python and Java, there exists a default convention that maps the names of the modules to the files that contain their definitions on the file system, so when a module is imported, the language runtime automatically knows where to find them. For Julia, my understanding is that this mapping exists for packages but not for modules, so the mapping has to be managed manually using all the include() statements.

In terms of the best practices for organising multi-file, multi-module Julia projects, I find the structures and strategies used by the Yao.jl package to be very sensible.

Thanks @zhangxiubo for mentioning our package. I think when this issue is solved (as @StefanKarpinski proposed), we will be able to load files/modules locally without using include to organize them manually.

@stevengj I think what @x-ji wants is to let the compiler itself find the modules and load them, which is just what is proposed in this issue. I was concerned about this problem once: since I had been using Python and C/C++, I prefer to write all the dependencies of the current script before I start implementing things. This helps those who are trying to read your code understand what you are doing.

There was a debate on Discourse about whether we should use include for organizing files, or whether this should be solved by the compiler itself.

https://discourse.julialang.org/t/what-is-the-preferred-way-to-manage-multiple-files/8969

I think the reason we have to write

#ifndef MAIN_H
#define MAIN_H

// code

#endif // MAIN_H

for C/C++ each time is just because the compiler cannot handle file dependencies itself. I don't want Julia to inherit this feature as well... The proposed way of loading modules looks quite similar to Rust's to me.


But unfortunately, at the moment (v1.0.0), if you want the file dependencies to be resolved file by file (without a single top-level list of includes), writing the following in each file

# A.jl
include("B.jl")
include("C.jl")
# B.jl
include("D/D.jl")
# C.jl
include("D/D.jl")

will cause an error... This syntax may have the following disadvantages:

  • readability: it can be hard for others to work out a file's dependencies from that file alone (you have to find them in the uppermost file, which includes everything)
  • stability: it may cause unexpected errors when developing in a team, if the source code is not carefully organized by hand.
  • hard-to-solve order: when a file has multiple dependencies, it is hard to work out the include order

@x-ji In my own experience, organizing files with include in Julia at the moment should usually follow a tree structure, which makes the dependencies more linear. Also, try not to use a deep hierarchy of modules. Most Julia projects just use one module and include files linearly (but the includes may need to be in a particular order).

Bump.

When will the original issue be fixed? It should be very easy. It causes a lot of trouble for me when structuring my code. The workaround of using include and using does not always work (see: https://github.com/julia-vscode/julia-vscode/issues/807 ).

When someone gets around to it. I've thought about taking a crack at it several times in the past few months but haven't quite found the time. If someone else wants to give it a try, I agree that it shouldn't be that difficult.
