Julia: relative using/import should search current directory

Created on 21 Oct 2013 · 63 Comments · Source: JuliaLang/julia

When I do

# Main.jl
module Main
  using .Foo
end

and there's a file called Foo.jl in the same directory as Main.jl, it should be loaded. I suspect that relative using should also _not_ look in the global require places – i.e. Pkg.dir() and then LOAD_PATH. The same applies to import.

Labels: design, help wanted, modules

StefanKarpinski

👍10

Most helpful comment

Based on some in-person brainstorming yesterday, we (mostly with @malmaud, @JeffBezanson) came up with the following scheme. It doesn't only apply to relative using/import, but also doesn't apply to all relative using/import, so it's a bit cross-cutting to what this issue was originally about. However, the end-point is quite similar to when we introduced the top-level code via the LOAD_PATH in effect: it allows one to have code following a certain convention to omit include calls and continue to work the same way, with the includes being implied by using/import statements.

If the module in an import resolves to a name that does not already exist and we are currently in the process of loading a prefix of that module, then if files with appropriate names exist, they will be included, and if that provides the desired modules, they will be used. If we are loading A, for example, and encounter using .B or equivalently using A.B within the definition of the A module, then we will look for B.jl relative to the location of the source path of A in two places:

1. If joinpath(dirname(A_path), "B.jl") exists, load it; otherwise
2. If joinpath(dirname(A_path), "B", "B.jl") exists, load it; otherwise
3. Raise an error that B could not be found.

The premise is that a module is either provided by a single file of that name or a directory of that name with the file by that name as an entry-point. This implies a stack of paths to what one is currently loading, and you have to look through the whole stack for the innermost file you are currently loading which is a prefix of what you want to load. Some examples should help clarify:

While loading A from src/A.jl and B from src/B.jl:
- find A.B.C as src/C.jl or src/C/C.jl
While loading A from src/A.jl and B from src/B/B.jl:
- find A.B.C as src/B/C.jl or src/B/C/C.jl
- find A.D as src/D.jl or src/D/D.jl.
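
The lookup rule and the examples above can be sketched in Julia roughly as follows. This is only an illustration under stated assumptions: LoadEntry, isprefix and candidate_paths are hypothetical names, not part of Julia's loader, and the sketch only computes the two candidate paths without touching the file system or loading anything.

```julia
# Hypothetical representation of the "stack of paths": each entry pairs a
# fully qualified module name with the file that is currently defining it.
const LoadEntry = Tuple{Vector{Symbol},String}

# True if module path `p` is a strict prefix of module path `t`.
isprefix(p, t) = length(p) < length(t) && t[1:length(p)] == p

# Return the two files that would be tried for `target`, based on the
# innermost module on the stack whose name is a prefix of `target`.
function candidate_paths(stack::Vector{LoadEntry}, target::Vector{Symbol})
    for (modpath, file) in reverse(stack)   # innermost entry first
        if isprefix(modpath, target)
            dir = dirname(file)
            name = string(target[end])
            return [joinpath(dir, name * ".jl"),
                    joinpath(dir, name, name * ".jl")]
        end
    end
    return String[]                         # no prefix is being loaded
end

# Second example above: A from src/A.jl and A.B from src/B/B.jl.
stack = LoadEntry[([:A], joinpath("src", "A.jl")),
                  ([:A, :B], joinpath("src", "B", "B.jl"))]
candidate_paths(stack, [:A, :B, :C])   # candidates under src/B
candidate_paths(stack, [:A, :D])       # candidates under src
```

A real implementation would then test each candidate with isfile and include the first one that exists, raising an error if neither does.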

One will note that an absolute import like using A.B inside of module A can trigger this behavior; meanwhile a relative import like using ..B inside of top-level module A will not trigger this behavior, so relative names are not really the significant feature here.

Another thing to observe is that there are many potential file hierarchies for a given module hierarchy. On one hand, that could be a bit confusing, but on the other hand, forcing a deep hierarchy when everything can easily be contained in a few top-level files is quite annoying.

StefanKarpinski on 6 Sep 2017

👍12

All 63 comments

I agree it should not look in the global places, but perhaps it should not look at files at all. This doesn't seem to extend to multiple dots, e.g. using ..Foo looking in the parent directory.

JeffBezanson on 21 Oct 2013

Oops.

StefanKarpinski on 21 Oct 2013

I would love to see a solution to this problem... But Jeff has a point.

It would be a little weird, but what if using took a second, optional parameter:

using Foo "../../extra-src/"

WestleyArgentum on 21 Oct 2013

+1 for this.

tknopp on 15 Aug 2014

I'm not sure the relative path thing is a problem. You could, for example, have something like

# Foo.jl
module Foo
  using ..Bar
  using .Baz
end

# Bar.jl
module Bar
  # barfy stuff
end

# Foo/Baz.jl
module Baz
  # bazish stuff
end

That would allow relative imports of sibling modules to automatically work. Sure, it's strange if you do more dots than you're nested into modules, but just don't do that.

StefanKarpinski on 15 Aug 2014

👍1

Why not just allow using with a string?
Such as: using "Baz/Baz.jl".

fxbrain on 15 Aug 2014

Because loading from a file is only a side effect of using when no such module exists. The normal case is that the module already exists. If you allow a string, you still have to map those strings to a module.

StefanKarpinski on 15 Aug 2014

But what if the string, which is a path, identifies a module, as it does by its structure?

fxbrain on 15 Aug 2014

Ok. Let me try to get this straight.

I do not have an opinion about this "import/include/using" oddity.

Don't you think that sooner or later, the Python import strategy will turn out to be best?
You made it possible to have an __init__ function within a module that is called automatically, but not first, which is quite strange, since I expected it to be called first, just like a BEGIN block in Perl.

I ran into exactly this problem: I arranged my code in a file-directory manner and tried to push! the load path inside the init function. But it is not evaluated first, which was quite confusing to me, since I expected it to act like a constructor or something.

Let's just presume I know what using does semantically. Why not use using to publish several declarations inside the current namespace but give it the ability to assign some prefix for it?

I really liked the import idea of Java, where dots mark a directory.

For instance:

using Baz.baz

using Baz.baz as bz

Does this make sense to you?

Cheers

Stefan

fxbrain on 16 Aug 2014

Bump. Milestoning. https://github.com/JuliaLang/julia/issues/9079#issuecomment-127274743

tkelman on 11 Aug 2015

It's a little frustrating to have #12695 merged in 0.4 but this slated for 0.5... I feel like it's going to bite people in 0.4 if there is no way to load modules from the current directory short of modifying the load path.

stevengj on 21 Aug 2015

We seem to have been getting by pretty well without that --- outside of the REPL, loading from the CWD is just a bug, and I doubt any packages depended on it. In the REPL, include is probably sufficient.

JeffBezanson on 21 Aug 2015

I used to structure my code into submodules, with each file representing a module; back when the CWD was included in the load path, this allowed me to write, for instance, using Utils to load types and functions exported from Utils.jl. I can now replace this with include("Utils.jl"); using .Utils; however, this is inconvenient, e.g. if Utils defines types, because creating this type from module A would create an A.Utils.Type instead of a Utils.Type. What is the recommended way of organizing Julia code (with common functions and types) into subfiles? Should I add the current directory to the path anyway to use the convenience of modules? Thanks.

bermanmaxim on 3 Sep 2015

👍2

I have hit the same problem as @bermanmaxim FWIW, and I've moved to just using include for everything instead.

IainNZ on 3 Sep 2015

Thanks @IainNZ. Using includes does indeed seem to be the standard way now. I guess it's the job of the main file of a module to include everything in the right order to make the subparts work (defining types before functions, etc.). Using distinct modules had the advantage of making the dependencies of each file somewhat more explicit, e.g. writing utils.helperfunction to make clear that the function comes from utils, and of not risking including things twice.

bermanmaxim on 4 Sep 2015

You can use includes in the main file and still structure your code into submodules. That's what I do in Debug.jl.

toivoh on 6 Sep 2015

@bermanmaxim Not sure I understand the problem, everything seems to work how I'd expect:

module Parent
    export ParentT, ChildT

    module Child
        export ChildT
        type ChildT
        end
    end
    using .Child

    type ParentT
    end

end

module Test
    using Parent

    f(::ParentT)="parent"
    f(::ChildT)="child"
end

Test.f(Parent.ParentT()) # "parent"
Test.f(Parent.ChildT()) # "child"

malmaud on 25 Oct 2015

Oh I see, it's this that's problematic:

A.jl:

module A
type Atype
end
end

B.jl:

module B
include("A.jl")
import .A: Atype
end

MyPkg.jl:

module MyPkg
include("A.jl")
include("B.jl")
end

MyPkg.B.Atype()  # MyPkg.B.A.Atype
MyPkg.A.Atype()  # MyPkg.A.Atype
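
The duplication can be reproduced in a single file by writing the modules inline, which is roughly what the two include("A.jl") calls amount to. This is a minimal sketch in current Julia syntax (struct rather than the type keyword used in this thread), not the exact files above.

```julia
module MyPkg
    module A                 # what include("A.jl") in MyPkg.jl produces
        struct Atype end
    end
    module B
        module A             # what include("A.jl") inside B.jl produces
            struct Atype end
        end
        import .A: Atype     # binds B.Atype to the inner copy, B.A.Atype
    end
end

# The two types share a name but are unrelated, so a method written for
# one does not accept values of the other.
MyPkg.B.Atype === MyPkg.B.A.Atype   # true
MyPkg.B.Atype === MyPkg.A.Atype     # false
```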

You might hope to get around this by only calling include from the parent module:

module MyPkg
include("A.jl")
include("B.jl")
end

where
B.jl is now just

module B
import ..A: Atype  # two dots: A is a sibling of B inside MyPkg
end

so now

MyPkg.B.Atype() # MyPkg.A.Atype

as you want, _but_ you're back to being reliant on the package entrypoint to manually take into account submodule dependencies:

MyPkg.jl:

module MyPkg
include("B.jl")
include("A.jl")
end

won't work, because A does not exist yet when B.jl is evaluated.
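
For comparison, here is a single-file sketch of the layout where only the parent includes the submodules. It assumes current Julia syntax and uses the hypothetical name MyPkg2 so that it does not clash with the listings above. Note the two dots in the import: from inside B, the sibling module A is reached through the parent.

```julia
module MyPkg2
    module A                 # stands in for include("A.jl")
        struct Atype end
    end
    module B                 # stands in for include("B.jl")
        import ..A: Atype    # refers to MyPkg2.A, which must already exist
    end
end

# There is now a single Atype shared by both submodules.
MyPkg2.B.Atype === MyPkg2.A.Atype   # true
```

Swapping the order of the two submodules makes the import fail, since A is not yet defined when B is evaluated; this is the ordering dependence described above.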

malmaud on 25 Oct 2015

Thanks @malmaud, I have since followed @toivoh's advice and develop code in a structure similar to Debug.jl.

bermanmaxim on 26 Oct 2015

The loss of having the current path in the LOAD_PATH list is distressing to me. I find myself having to add a statement of the form

push!(LOAD_PATH, pwd())

to all my high-level scripts in order to get anything to work. In particular, the solution

include("A.jl")
using A

does not appear to work because it does not nest properly, i.e.:
File B.jl:

module B
include("A.jl")
using A
func2() = ("function 2", func1())

end

File A.jl:

module A
export func1
func1() = "func 1"


end

If you try to include("B.jl") you get the error:
ERROR: LoadError: ArgumentError: A not found in path
in require at ./loading.jl:233
in include at ./boot.jl:261
in include_from_node1 at ./loading.jl:304
while loading /data/Projects/Energous/B.jl, in expression starting on line 3

However in a julia prompt you can type the contents of B.jl line by line without error if you don't include the module definition.

You might reasonably want to use module A without module B. However if B requires A you will be unable to use B unless you first include A at the highest level. So this means that you have to remember to include the text of every dependent module before you can use it, if they are all in the same working directory.

For my use case this FORCES me to explicitly add the current directory to the path for every script I run in my working directory.

mattcbro on 29 Oct 2015

@mattcbro, you should do

include("A.jl")
using .A

to tell it you want the locally defined A module, not a global A module. This will work and does not require you to modify the LOAD_PATH.

using A is potentially wrong anyway because it could get confused if there is another module called A defined in the load path. So, your experience is actually an argument in _favor_ of the current behavior, because it caught a bug that you otherwise might not have noticed.
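
As a self-contained check of this advice, the following sketch writes the two files from the comment above into a temporary directory, with the single change of using .A in place of using A, and then includes B.jl. The temporary directory is only a convenience for the example.

```julia
dir = mktempdir()

write(joinpath(dir, "A.jl"), """
module A
export func1
func1() = "func 1"
end
""")

write(joinpath(dir, "B.jl"), """
module B
include("A.jl")   # resolved relative to the directory of B.jl
using .A          # the locally defined B.A, not a global package named A
func2() = ("function 2", func1())
end
""")

include(joinpath(dir, "B.jl"))
B.func2()   # ("function 2", "func 1")
```

No change to LOAD_PATH is needed, because include resolves "A.jl" relative to the file that contains the include call.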

stevengj on 29 Oct 2015

👍2

That being said, I still tend to agree with @StefanKarpinski that using .A (_not_ using A) should look for A.jl in the current directory; it's annoying to have to manually include("A.jl"), though it's not a huge deal.

(If A.jl is in some other directory, of course, then you need the manual include.)

stevengj on 29 Oct 2015

Is there a technical problem with having using .A search the current directory, or is it just a design decision at this point? I would definitely favor having that behavior.

malmaud on 29 Oct 2015

At this point I think it's just a design issue. The fact that you can load code from a parent directory with multiple leading dots is kind of strange. To me there's also the question of whether using .B occurring in module A should load B.jl in the current directory or load A/B.jl. The former would tend to keep directory structures pretty flat, which may be a good thing, while the latter would tend to make them more nested. While I generally favor flatter directory structures (consider how ridiculous Java project file trees are), this would seem to tend to put everything in the top-level directory:

# A.jl
module A
    using .B
    using .C
end

# B.jl
module B
    using .D
    using .E
end

And so on – all of A.jl, B.jl, C.jl, D.jl and E.jl would be in the top-level directory, even though it seems like maybe B and C belong in an A directory and maybe D and E belong in a B directory. Moreover, if you _want_ to have a nested directory structure, how would you even express that?

StefanKarpinski on 29 Oct 2015

It looks like it would be pretty easy to implement: modify the eval_import_path_ function in src/toplevel.c to add an else if (m == jl_current_module) clause after the if (m == jl_main_module), which looks for a var.jl file in the current directory.

stevengj on 29 Oct 2015

@StefanKarpinski, I thought that the proposal was that using .B would look in the directory of the file that the using statement occurs in (or pwd in the REPL). That's what most people would think of as the "current" directory, and is the same as the directory used for include("B.jl").

If you _want_ a nested directory, or any other directory structure, you would just do include("B/B.jl"); using .B manually as you do now. Doing using .B would only look for a B.jl file _if_ B were not already defined.
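
The proposed behavior can be approximated today with an ordinary function. This is a rough sketch under stated assumptions: load_local is a hypothetical helper, not an existing Julia API, and the caller passes the directory explicitly instead of having it inferred from the file containing the using statement.

```julia
# Return the submodule `name` of `parent`, including `dir/name.jl` first,
# but only if `parent` does not already define `name`.
function load_local(parent::Module, name::Symbol, dir::AbstractString)
    if !isdefined(parent, name)
        path = joinpath(dir, string(name, ".jl"))
        isfile(path) || error("$name is not defined and $path does not exist")
        Base.include(parent, path)   # evaluate the file inside `parent`
    end
    # invokelatest so that a module defined by the include just above is visible.
    return Base.invokelatest(getfield, parent, name)
end
```

With a file Foo.jl next to the calling script, load_local(@__MODULE__, :Foo, @__DIR__) would play the role of the include("Foo.jl") that currently has to precede using .Foo. A second call finds Foo already defined and does not include the file again.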

stevengj on 29 Oct 2015

That was my original proposal, but I'm wondering how one would introduce a nested folder structure using this mechanism. It seems to me that there wouldn't be any way to do it. One option would be to have module A; using .A.B; end be special syntax that loads "A/B.jl". That would allow having parts of A defined in a directory. Maybe I'm overthinking this.

StefanKarpinski on 29 Oct 2015

@StefanKarpinski, you would introduce a nested folder structure by doing include("B/B.jl"); using .B manually as now; see above. (I edited my post after replying, so maybe you didn't see the 2nd paragraph.)

stevengj on 29 Oct 2015

@stevengj OK, that works, thanks. However, please help me understand: what is the preferred paradigm for creating and using local modules? Do we really have to have both an include() statement and a using or import statement?

Perhaps the idea is to have a master script that has all of your includes in it? How do you folks do this? I notice that one person simply uses includes instead of using or import for their local work.

mattcbro on 29 Oct 2015

@mattcbro, yes, you currently need both include and using. You don't need import (doing include effectively also does import).

I mostly just use include and don't bother with submodules. The only reason to use submodules is if you want to segregate your namespace, but in that case I normally don't want to do using (I just want import and qualified names). For example, in the PETSc.jl module we are using a PETSc.C module for the raw wrappers around the low-level C interface to keep these zillions of functions from polluting the PETSc namespace, but then we use the fully qualified names, e.g. we do C.foo(...) to call the foo function. Hence the C module has no exports and we don't need using C.
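
A minimal sketch of that pattern follows. Wrapper, raw_add and add are hypothetical names chosen for illustration and are not taken from PETSc.jl.

```julia
module Wrapper
    module C                 # low-level layer; deliberately exports nothing
        raw_add(a, b) = a + b
    end

    # Reach the low-level function through its qualified name, so that
    # nothing from C leaks into the Wrapper namespace.
    add(a, b) = C.raw_add(a, b)
end

Wrapper.add(1, 2)   # 3
```

Because C has no exports and Wrapper never does using .C, the only way to reach raw_add is through the qualified name C.raw_add.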

stevengj on 29 Oct 2015

Still having to use include seems to defeat a lot of the purpose of this change.

StefanKarpinski on 30 Oct 2015

Would having to use include also mean that you have separate copies of the submodule, instead of a single (possibly precompiled) one? If so, that seems like the biggest drawback to me, not the extra typing required.

ScottPJones on 30 Oct 2015

No, you would only have one copy. The purpose would be to save typing the redundant include if you do using .Foo in the common case where Foo.jl is in the same directory. Saving on typing is the only question in this whole thread — there has been and will be no change in functionality.

stevengj on 30 Oct 2015

Saving typing isn't the only issue to me – I'd like to get to a point where you don't need to use include in normal code. To that end, I'd like for relative using to be _the_ way to decompose a module into files and directories. But maybe we as a project don't want that. We should have a conversation about it that doesn't include lots of ill-informed handwringing by people who've barely used Julia about "modularity".