Julia: add a cache to reuse identical anonymous functions in the REPL

Created on 20 Mar 2017  ·  11 Comments  ·  Source: JuliaLang/julia

The ApproxFun and SingularIntegralEquations packages were built in-part to give users the feeling they are "numerically computing with functions," to quote Nick Trefethen.

When they started on Julia v0.3, passing a new anonymous function did not trigger re-compilation of the Fun constructors. Since Julia v0.5, every function has its own type, so the constructors are re-compiled for each new function.
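For concreteness, a small sketch of the v0.5 behavior just described: every anonymous-function literal gets its own closure type, so two syntactically identical functions are compiled separately.

```julia
# Since v0.5, each anonymous function literal is its own singleton type,
# so syntactically identical lambdas still have distinct types:
f = x -> exp(x)
g = x -> exp(x)
typeof(f) == typeof(g)  # false: two distinct closure types
typeof(f) == typeof(f)  # true: reusing the same binding reuses the type
```

Any code specialized on the function's type, such as a Fun constructor, must therefore be compiled again for g even though it is textually identical to f.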

In certain cases, the re-compilation is not so bad:

julia> versioninfo()
Julia Version 0.6.0-pre.alpha.173
Commit 93cddfc* (2017-03-18 17:19 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin16.4.0)
  CPU: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)

julia> using ApproxFun # branch: infinity-is-a-free-man

julia> @time Fun(x->exp(x))
  4.890473 seconds (7.01 M allocations: 299.036 MiB, 3.57% gc time)
Fun(Chebyshev(【-1.0,1.0】),[1.26607, 1.13032, 0.271495, 0.0443368, 0.00547424, 0.000542926, 4.49773e-5, 3.19844e-6, 1.99212e-7, 1.10368e-8, 5.5059e-10, 2.49797e-11, 1.03911e-12, 3.99195e-14])

julia> @time Fun(x->exp(x))
  0.388620 seconds (397.03 k allocations: 18.762 MiB, 2.22% gc time)
Fun(Chebyshev(【-1.0,1.0】),[1.26607, 1.13032, 0.271495, 0.0443368, 0.00547424, 0.000542926, 4.49773e-5, 3.19844e-6, 1.99212e-7, 1.10368e-8, 5.5059e-10, 2.49797e-11, 1.03911e-12, 3.99195e-14])

Of course, if we bind the function to a name so it can be reused, the second call shows that the run-time calculation is actually very fast:

julia> g = x->exp(x)
(::#5) (generic function with 1 method)

julia> @time Fun(g)
  0.412938 seconds (396.95 k allocations: 18.764 MiB, 2.98% gc time)
Fun(Chebyshev(【-1.0,1.0】),[1.26607, 1.13032, 0.271495, 0.0443368, 0.00547424, 0.000542926, 4.49773e-5, 3.19844e-6, 1.99212e-7, 1.10368e-8, 5.5059e-10, 2.49797e-11, 1.03911e-12, 3.99195e-14])

julia> @time Fun(g)
  0.000659 seconds (437 allocations: 17.625 KiB)
Fun(Chebyshev(【-1.0,1.0】),[1.26607, 1.13032, 0.271495, 0.0443368, 0.00547424, 0.000542926, 4.49773e-5, 3.19844e-6, 1.99212e-7, 1.10368e-8, 5.5059e-10, 2.49797e-11, 1.03911e-12, 3.99195e-14])

On the other hand, for more complicated constructors, such as the bivariate constructors in ApproxFun and SingularIntegralEquations, the compile time is now prohibitive:

julia> using SingularIntegralEquations # branch: development

julia> g = (x,y)->1/2
(::#7) (generic function with 1 method)

julia> @time G = GreensFun(g,CauchyWeight(Chebyshev()⊗Chebyshev(),0))
 15.104788 seconds (17.73 M allocations: 815.889 MiB, 3.36% gc time)
GreensFun with kernels:
{
 LowRankFun on π⁻¹log|y-x|[Chebyshev(【-1.0,1.0】)⊗Chebyshev(【-1.0,1.0】)] of rank 1.
}

julia> @time G = GreensFun(g,CauchyWeight(Chebyshev()⊗Chebyshev(),0))
  0.002161 seconds (6.37 k allocations: 214.891 KiB)
GreensFun with kernels:
{
 LowRankFun on π⁻¹log|y-x|[Chebyshev(【-1.0,1.0】)⊗Chebyshev(【-1.0,1.0】)] of rank 1.
}

but again, the run-time is actually quite fast.

I'd like to know what the plan is for this scenario -- certainly the brilliant minds behind Julia have thought of this. With compilation times this high (substantially higher than when the ApproxFun and SingularIntegralEquations packages started), it's hard to call this aspect of the current rendition of functions in Julia a dynamic experience.


All 11 comments

CC @dlfivefifty.

Can you compare the timings with Julia 0.5.1? There might be a regression in compilation time too (meaning that 0.5.1 could be faster).

I know it's only an example, but it will help to use exp instead of x->exp(x). Other than that there are a few answers:

  1. We can work on compiler performance. For example #21081 --- these things can be fixed sometimes.
  2. The REPL could implement a cache to avoid creating new functions when an identical anonymous function is entered again.
  3. If you want 0.3's latency tradeoff again, you can use something like
struct F <: Function
    f
end
(f::F)(args...) = f.f(args...)

Wrapping functions in that will avoid specializing code for each function.
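To make that concrete, here is a self-contained sketch (repeating the wrapper definition from above): because F is a non-parametric struct, every wrapped function shares one concrete type, so code specialized on F is compiled only once, at the cost of a dynamic dispatch on each call.

```julia
struct F <: Function
    f
end
(f::F)(args...) = f.f(args...)

# All wrapped callables share the single concrete type F,
# so a constructor taking an F compiles exactly once:
a = F(x -> exp(x))
b = F(x -> sin(x))
typeof(a) == typeof(b)  # true: one type for every wrapped function
a(0.0)                  # 1.0, dispatched dynamically through f.f
```

This is the v0.3-style latency tradeoff: cheap to create new functions, slightly slower to call them.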

I just tried the struct trick, and it works very well. Thanks @JeffBezanson.

Thank you very much @JeffBezanson. Of course future improvements will always be welcome, but I've implemented the struct wrapping for constructors and the compilation times for individual generic functions are back down. This is the same code as above.

julia> versioninfo()
Julia Version 0.6.0-pre.alpha.173
Commit 93cddfc* (2017-03-18 17:19 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin16.4.0)
  CPU: Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)

julia> using ApproxFun # branch: infinity-is-a-free-man

julia> @time Fun(x->exp(x))
  4.434731 seconds (5.99 M allocations: 262.131 MiB, 2.76% gc time)
Fun(Chebyshev(【-1.0,1.0】),[1.26607, 1.13032, 0.271495, 0.0443368, 0.00547424, 0.000542926, 4.49773e-5, 3.19844e-6, 1.99212e-7, 1.10368e-8, 5.5059e-10, 2.49797e-11, 1.03911e-12, 3.99195e-14])

julia> @time Fun(x->exp(x))
  0.004825 seconds (816 allocations: 38.563 KiB)
Fun(Chebyshev(【-1.0,1.0】),[1.26607, 1.13032, 0.271495, 0.0443368, 0.00547424, 0.000542926, 4.49773e-5, 3.19844e-6, 1.99212e-7, 1.10368e-8, 5.5059e-10, 2.49797e-11, 1.03911e-12, 3.99195e-14])

With a generic function:

julia> g = x->exp(x)
(::#5) (generic function with 1 method)

julia> @time Fun(g)
  0.005058 seconds (799 allocations: 37.367 KiB)
Fun(Chebyshev(【-1.0,1.0】),[1.26607, 1.13032, 0.271495, 0.0443368, 0.00547424, 0.000542926, 4.49773e-5, 3.19844e-6, 1.99212e-7, 1.10368e-8, 5.5059e-10, 2.49797e-11, 1.03911e-12, 3.99195e-14])

julia> @time Fun(g)
  0.000681 seconds (485 allocations: 18.094 KiB)
Fun(Chebyshev(【-1.0,1.0】),[1.26607, 1.13032, 0.271495, 0.0443368, 0.00547424, 0.000542926, 4.49773e-5, 3.19844e-6, 1.99212e-7, 1.10368e-8, 5.5059e-10, 2.49797e-11, 1.03911e-12, 3.99195e-14])

The change in run-time is negligible (in the perceived experience).

The more complicated bivariate constructors are substantially faster again and can work with generic functions. (I forgot to show @time G = GreensFun((x,y)->1/2,CauchyWeight(Chebyshev()⊗Chebyshev(),0)) above, which took about 6 seconds. Now it's down to 0.0058 seconds after compiling the constructors once!)

julia> using ApproxFun # branch: infinity-is-a-free-man

julia> using SingularIntegralEquations # branch: development

julia> @time G = GreensFun((x,y)->1/2,CauchyWeight(Chebyshev()⊗Chebyshev(),0))
 12.237095 seconds (16.23 M allocations: 722.966 MiB, 2.69% gc time)
GreensFun with kernels:
{
 LowRankFun on π⁻¹log|y-x|[Chebyshev(【-1.0,1.0】)⊗Chebyshev(【-1.0,1.0】)] of rank 1.
}

julia> @time G = GreensFun((x,y)->1/2,CauchyWeight(Chebyshev()⊗Chebyshev(),0))
  0.005827 seconds (31.43 k allocations: 615.939 KiB)
GreensFun with kernels:
{
 LowRankFun on π⁻¹log|y-x|[Chebyshev(【-1.0,1.0】)⊗Chebyshev(【-1.0,1.0】)] of rank 1.
}

Caching identical anonymous functions would be nice to have in many contexts; e.g., dot-call broadcasting generates lots of anonymous functions.

Although it's undecidable whether two functions compute the same thing, deciding that two functions are syntactically equivalent and refer to the same global objects should be possible.
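A small sketch of the syntactic-equivalence half of that check: once line-number nodes are stripped, Expr comparison in Julia is structural, so two identical anonymous-function expressions compare equal and could serve as a cache key.

```julia
# Expr equality is structural once line-number nodes are removed,
# which is exactly the key a REPL-level cache could use:
e1 = Base.remove_linenums!(:(x -> x + 1))
e2 = Base.remove_linenums!(:(x -> x + 1))
e3 = Base.remove_linenums!(:(x -> x + 2))
e1 == e2  # true: same syntax tree
e1 == e3  # false: different body
```

The "refer to the same global objects" half is harder, since identical syntax can mean different things in different scopes.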

Renaming this to reflect that caching functions is the main to-do here.

I think the issue is slightly broader than caching, as interactive use involves a lot of different anonymous functions. While your struct F solution does work, it's a bit annoying in the REPL.

It would be nice if you could flip a switch, something like @dynamic on and @dynamic off, so that all anonymous functions are dynamically dispatched instead of compiled, à la struct F.

How about this @cached macro (based on MikeInnes's fn macro)?

"""Caches anonymous functions"""
expr_dict = Dict{Expr, Symbol}()
macro cached(expr::Expr)
    @assert expr.head in (:function, :->)
    expr = Base.remove_linenums!(expr)
    if haskey(expr_dict, expr)
        return expr_dict[expr]
    end
    name = gensym()
    args = expr.args[1]
    args = typeof(args) == Symbol ? [args] : args.args
    body = expr.args[2]
    @eval $name($(args...)) = $body
    expr_dict[expr] = name
    name
end

f1 = @cached x->x+1
f2 = @cached x->x+1
f1 == f2

We should use some code like that to automate this in the REPL. If you're willing to type @cached (I'm not) you could also just give your functions names, f1 = x->x+1, and reuse them that way.
