Julia: Wishlist: caching source code

Created on 25 Aug 2017  路  16Comments  路  Source: JuliaLang/julia

Several of us would like it if Julia maintained a source-text cache for all code, in addition to the current cache for lowered/inferred/compiled code. (See #22721, with "votes" from developers of Revise, Gallium, and a tracing package.) I'm putting a 1.0 milestone on the decision, but I'm afraid I lack the time to work on the implementation of this feature. Feel free to remove this from the milestones as appropriate.

decision

Most helpful comment

The underlying file might have been modified since parsing.

All 16 comments

See also #2625.

I also want this, but it's definitely not 1.0-blocking since we can always add this functionality in 1.x.

Maybe i'm looking into the wrong direction, but isn't a reference of filename/path/linenumber from-to enough?

The underlying file might have been modified since parsing.

but it's definitely not 1.0-blocking since we can always add this functionality in 1.x.

Good point, I moved the milestone to 1.x.

Just do add another +1 for this, it would allow me to greatly simplify Query.jl in one key part.

Implemented as part of #23898.

Closed by #23898. You can get the raw source code (as text) from read_dependency_src(cachfilename, srcfilename) (where cachefilename is the *.ji file).

@timholy From your comment in #22721

However, I wonder if this might be solvable via methodswith plus maintaining a cache of the source-code expressions so that they can be re-evaluated. (Revise would like that anyway, since it re-parses and caches every source file so that it can detect diffs. This adds considerably to the package load time, but caching the Exprs to the .ji file seems both cheap and effective.)

I expected that there would be a new function code_expr(which(sin, (Float64,))) that would return the :(function sin(...) ...) Expr object from which that method was defined. Is there any way to write code_expr from read_dependency_src?

You don't need #23898 for that, you can do that now:

julia> foo(x::Int) = 1
foo (generic function with 1 method)

julia> foo(x::Float64) = 2
foo (generic function with 2 methods)

julia> m = first(methods(foo, Tuple{Float64}))
foo(x::Float64) in Main at REPL[2]:1

julia> Base.uncompressed_ast(m)
CodeInfo(:(begin 
        nothing
        return 2
    end))

The reason I want(ed) that for #22721 is because with renaming, the types need to be interpreted syntactically not semantically.

23898 is something simpler: some folks seemed to want the source pre-parsing (e.g., to preserve whitespace), so I decided just to store the raw text to make sure everyone could use it. For Revise, the main advantage is that in https://github.com/timholy/Revise.jl/pull/49 it's no longer necessary to parse all the source files when you first load a package: you can afford to wait until you detect that some file has changed, and just focus on the changed files. That reduces the overhead from loading. (You couldn't do that before, because once the file changes you can't "go back" and figure out what the original looked like, but now that there's a cache this is straightforward.)

Doesn't uncompressed_ast return lowered code? I could probably use Sugar.jl to retrieve something like the original expression, but it seems silly (and error-prone?) to go through that when Base could store the Expr for each method. Perhaps I'll open an issue...

I have exactly the same need as @cstjean, i.e. I would love to have easy access to the original AST of an arbitrary function, not the lowered version.

You can already get it now for Base or any precompiled package if you're willing to parse the cache. Check two open PRs in Revise for guidance about how to do that (how to get the source text, and how to localize a specific method signature).

It would be nice to have an extra field in a Method for the unlowered code, but I'm not sure whether that has image size/performance implications that are unattractive. I decided on the least-invasive approach I could come up with that still let Revise do what it needed to do---by having the source text cached to a file, there is literally no overhead on Base except a very small one during build phases.

For my use-case in Query.jl I would need this to work for arbitrary functions, not just things in Base or packages. Completely understood why you took the approach you did for your use-case!

How about opening a new issue with your specific requirements?

2625 also requires a general solution that works for e.g. anonymous functions.

Was this page helpful?
0 / 5 - 0 ratings