Julia: globals - make their performance less awful

Created on 1 Nov 2014 · 53 comments · Source: JuliaLang/julia

This keeps coming up, and rather than direct people to the performance tips, I feel like we should really just try to fix the issue. One relatively non-disruptive approach might be to emit code that accesses globals with a fast path that assumes they have the same type they currently have, and a slow path that handles arbitrary types. A harder fix would be the compile-time version of that: generate code that assumes that globals retain their current type and invalidate it if the type changes. One might want to let compilation happen a couple of times before giving up and generating pessimistic general code.
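The fast-path idea can be mimicked by hand today with a typeassert at the point of use, which makes everything downstream type-stable. A minimal sketch; `counter` and `bump!` are illustrative names, not part of any proposal:

```julia
counter = 0   # untyped global

function bump!(n)
    c = counter::Int        # "fast path": assume Int, throw if the type changed
    for _ in 1:n
        c += 1              # type-stable loop on a local
    end
    global counter = c      # single write back to the global
    return c
end

bump!(1000)                 # returns 1000
```

If some other code rebinds `counter` to a non-`Int`, the typeassert throws instead of silently falling back to boxed arithmetic, which is roughly the "error on the slow path" variant of the scheme.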

Related: #265, #524, #964

performance

Most helpful comment

Potential proposal (needs some work, but it's the most promising thing we could come up with on today's triage call):

  1. Implement type annotations for globals (#964).
  2. Globals without explicit type annotations are implicitly type-const.
  3. Error if an explicit global type annotation is violated anywhere.
  4. Error if an implicit global type annotation is violated in non-top-level code.
  5. Warn if an implicit global type annotation is violated in top-level code.

For 1.0 only the behavioral part of this has to be implemented, but this scheme allows making globals fast with some relatively straightforward optimizations. Most globals don't change type, and this allows efficient code to be generated under that assumption. If those type assumptions are violated at the top-level, methods relying on those assumptions can simply be invalidated. This allows relatively convenient use at the REPL, while still fixing the global performance problem.

All 53 comments

+ a lot

It's somewhat fun to examine a disappointing benchmark with non-const globals and get a 10-100x improvement, but it would be much nicer if we didn't have to do that.

If this isn't easy to fix quickly, I really like @ihnorton's heavy-handed solution: include a message about not writing code in the global scope in the Julia banner.

This is a feature --- it makes performance problems more obvious.

I'm not sure if you're serious or not.

What about having a way to assert that the type of a variable does not change? Assignments to such variables will only allow you to assign values of that type or values coercible to that type, otherwise the program terminates in error. With this knowledge, the compiler can produce fast paths for those variables.

This might also be helpful for other optimizations, although I don't know exactly what.

We have const which in practice is only an assertion that the global will not change type.
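For example, with `const` the compiler can bake the global's type (and possibly its value) into compiled code; `threshold` and `over` are illustrative names:

```julia
const threshold = 0.5      # compiler may assume threshold::Float64 and inline it

over(x) = x > threshold    # compiles to a plain Float64 comparison, no boxed lookup

over(0.7)                  # returns true
```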

Is that a bug or a feature?

Not quite sure. It might also be a signal to the compiler that the value will be constant, and that it can inline the value, if it wants, so it is not necessarily good advice to make global immutables const (if you depend on code picking up the change).

Fixing this "feature" would cut down on julia-users traffic by about 5%, e.g.:

https://groups.google.com/forum/#!topic/julia-users/7FUx2flIxac

That's a possibility, but you would then have to save values back to global scope at the end. During execution, changes to globals would not be reflected elsewhere, which could lead to significant confusion.

Suppose I write:

param = 1
myfunc(param)

Since I now have a global variable, will I get relatively poor performance?

Or should I write instead:

function main()
  param = 1
  myfunc(param)
end

main()

The first version is fine. It is the lookup from inside compiled code that is slow, since the compiler cannot know the type of the result a priori.
If you call myfunc(param), the function will be compiled for an Int argument, and no global lookup will happen inside.

Here's my experiment, and I find no significant performance difference in version 0.3.7

param = 0.001
@time sleep(param)
function main()
  param = 0.001
  @time sleep(param)
end
main()

Both take 2.1 ms.

Think of it this way: _accessing_ a global variable is slow. So the existence of a global variable in your code might not matter if you only read it once. You don't want to read or write a global variable repeatedly in a loop. But if you read its value once and pass it to a function, there's no problem.
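A sketch of that pattern, with illustrative names (`iterations`, `sum_to`): the global is read exactly once and passed through a function barrier, so the hot loop runs on a typed argument.

```julia
iterations = 100_000       # untyped global, read only once below

function sum_to(n)         # specialized for n::Int when called
    s = 0
    for i in 1:n
        s += i             # no global lookups inside the loop
    end
    return s
end

sum_to(iterations)         # fast, despite `iterations` being a global
```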

In this case the overhead of sleep will be much bigger than anything else; you won't get meaningful times.

The remark of @JeffBezanson seems reasonable. I did this experiment:

@time begin
  p = 1
  for i in 1:100000
    p += 1
  end
end

elapsed time: 0.002861372 seconds (1591840 bytes allocated)

function main()
  @time begin
    p = 1
    for i in 1:100000
      p += 1
    end
  end
end

main()

elapsed time: 1.676e-6 seconds (0 bytes allocated)

But what remains mysterious to me is why the compiler can't treat global variables the same way as local variables. And how does one explain the huge difference in memory allocation in the above example?

Globals are different because they can be observed anywhere when they change. To convert a global to a local, the compiler would have to prove that no possible code anywhere in the system can see the changing value of p during the loop.

The memory allocation is from boxing. Even the type of the global might change; if it did, and you didn't box things, you'd get a segfault.

Actually, I don't know how the compiler treats local or global variables; I'm not from the computer science community, but I'll try my best to guess what you mean. It seems to me that you are worried about the _multi-threading_ situation.

No, just this:

julia> globvar = 0
0

julia> function foo()
           s = 0
           for i = 1:5
               bar()
               s += globvar
           end
       end
foo (generic function with 1 method)

julia> function bar()
           global globvar
           globvar = "oops"
           nothing
       end
bar (generic function with 1 method)

julia> foo()
ERROR: MethodError: `+` has no method matching +(::Int64, ::ASCIIString)
Closest candidates are:
  +(::Any, ::Any, ::Any)
  +(::Any, ::Any, ::Any, ::Any...)
  +(::Int64, ::Int64)
  ...
 in foo at none:5

@timholy, your example mainly illustrates how dangerous global variables can be, and that using them is not a good programming habit. But let me try to understand what you implied: if the compiler had treated globvar the same way as a local variable, this run-time error wouldn't have been detected. Isn't that so?

If this is the case, I further deduce that the compiler treats a local variable as a fixed address in memory, and a global variable as a name in a "variable list".

Exactly. Julia creates optimized code that depends upon how the machine represents information: the code for adding two Float64s is different from the code for adding two Float32s or two Ints. If you're wrong about the representation, you either get complete junk or a segfault, depending on the nature of the code.

"Boxing" just means wrapping the machine representation with some additional information that allows you to ask, "what type of object is this, anyway?" It takes memory, however.

Memory for the box, time to handle the indirection (and to ask the "what are you" question), and possibly time to perform a dynamic dispatch, depending on context. It's turtles all the way down when types are unpredictable.

Thanks for your reply!

I have just found an interesting phenomenon:

@time begin
  p = 1
  for i in 1:510
    p += 1
  end
end

elapsed time: 2.9961e-5 seconds (0 bytes allocated)

@time begin
  p = 1
  for i in 1:511
    p += 1
  end
end

elapsed time: 2.8285e-5 seconds (16 bytes allocated)

Within 510 iterations, there is no extra memory cost. From the 511th iteration on, every extra iteration takes 16 more bytes. Does this mean there is no boxing within the first 510 iterations?

The first 512 integers (0-511) are preallocated and boxed versions of them are always kept around. After that you need to actually allocate a new object.

Now I understand. Let's go back to our original discussion about why to avoid global variables.

I think what you are really against is not the global variable itself, but loops involving global variables; in other words, using global variables in any loop.

If you use a global variable in some function and then that function gets called in a loop, you have the same problem, and if you're a library author, there's no way to predict how your function will get called. So I don't see how the "in a loop" qualification really helps. The only way you can be sure that your code won't be called in a loop is if it happens once at global scope, in which case, sure, it's fine to use globals.

@StefanKarpinski, it is generally considered a bad habit to use global variables in any function. That is true for _all_ languages, not specifically for Julia. Therefore, if I write the functions myself, or the library is written by some skilled developer, I can safely use global variables by following my previous remark. Moreover, my remark helps _quantify_ the overhead of the global variable, which was previously treated as a _demon_.

In order to better _quantify_ the overhead, I permit myself to ask this question:
If I write all my Julia code in the global scope, is the performance going to be as poor as in other languages (MATLAB, R, Python...)? Without calling C/Fortran functions, of course.

If you use a variable whose type depends on a global variable in a loop, you will still have a performance problem, even if you don't use the global variable itself in a loop. You can see this quite clearly with @code_typed/@code_warntype. There is no mystery, just type inference.
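To see this with `@code_warntype`: in the sketch below (with an illustrative global `scale`), the read of the untyped global is inferred as `Any`, and the arithmetic downstream degrades to boxed operations.

```julia
scale = 2.0                # untyped global

function scaled_sum(xs)
    s = 0.0
    for x in xs
        s += x * scale     # `scale` inferred as Any => boxed arithmetic
    end
    return s
end

# @code_warntype scaled_sum([1.0, 2.0])  # highlights the Any-typed variables in red
scaled_sum([1.0, 2.0, 3.0])              # returns 12.0
```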

If you write your code in the global scope, performance will most likely be worse than Python, which has a highly optimized interpreter, and less overhead for calling functions than Julia when types are not known (because Python has no generic functions). I don't know how it would compare to MATLAB and R, which have substantially worse interpreters.

Thanks @simonster, your example is persuasive. It makes me change my remark to: avoid loops involving variables of unknown type.

About your second remark, hmm... writing Julia code has its risks!

(Another somewhat irrelevant remark: when I was learning MATLAB and R, the biggest pitfall was writing a loop. Now that I am learning Julia, the biggest pitfall is writing a loop involving an unknown type. If we instead tell people that the biggest pitfall in Julia is using global variables, we are simply replacing one pitfall with another. This won't do Julia much honor, and it makes it even less persuasive for users to learn Julia, even though we wouldn't use the word "pitfall" in our advertising or make this comparison explicitly.)

@Sisyphuss, the thing you should realize: this is basically a trap only for newcomers. If you decide to stick with julia, I predict that you'll look back on this and wonder what you were so worried about.

A good step is to get to know the tools we have for diagnosing performance problems. See http://docs.julialang.org/en/release-0.3/manual/performance-tips/. Of course, we're always interested in people who make improvements to those tools. I know @tonyhffong is really hoping to find some collaborators to work on his excellent Lint.jl, which can warn you about this very problem.

Thanks @timholy, I have read the manual once, and I'm reading it for the second time. I didn't just _decide_ to stick with Julia; I _already_ have. I'm not worried about it; I'm just trying to understand it.

The package you mentioned is fine for learning, but I am already using Julia in my current research work.

[pao: fix at-reference]

@Sisyphuss you can't abbreviate GitHub usernames; this may notify other GitHub users who are not part of this discussion and possibly annoy them. Thanks!

I just ran into a rather curious version of this issue while implementing a character streaming file parser.

Here are 3 naive ways of iterating through a file character by character:

f = open(filename)

@time while !eof(f)
  read(f, UInt8)
end

seek(f, 0)

@time while true
   read(f, UInt8)
   eof(f) && break
end

seek(f, 0)

@time while true
    try
        read(f, UInt8)
    catch e
        isa(e, EOFError) ? break : rethrow(e)
    end
end

close(f)

For a particular 200MB file, I get these timings

| version | timing |
| --- | --- |
| while !eof | 36.055 seconds |
| eof && break | 18.663 seconds |
| catch EOFError | 16.732 seconds |

If I wrap the exact same code in a function and call it, I get

| version | timing |
| --- | --- |
| while !eof | 3.740 seconds |
| eof && break | 3.526 seconds |
| catch EOFError | 6.927 seconds |

This example demonstrates that timing comparisons in global scope can have absolutely no correlation with the same comparisons done in a function, which arguably is a much worse gotcha than just having uniformly slow globals.
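For reference, the "wrapped in a function" version of the first variant looks like this; the loop body is unchanged, only the scope differs (the function name is illustrative):

```julia
function count_bytes(filename)
    f = open(filename)
    n = 0
    while !eof(f)          # same loop as the global-scope version
        read(f, UInt8)
        n += 1
    end
    close(f)
    return n
end
```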

Is this surprising? Globals are in part slow because they inhibit optimization within the compiler. Predicting which optimizations are applied by the compiler is always perilous, so this is not really a gotcha.

Is this surprising?

Perhaps not to those of us jaded by trying to second-guess the compiler, but I'd think that most people would find this surprising.

This is not second guessing, it is just a fact of life with a JIT compiler.

this is not really a gotcha.

New users unaccustomed to the subtleties of the Julia compiler are the ones most likely to write such code in global scope. And this is not a new observation.

This makes the "first encounter experience" with the language bad for many people, and there are things we can do to fix it.

This is probably a dumb idea, but would it be possible to instrument access to globals and then print a warning if Julia sees a global accessed many times in a short period? That way we'd still guide people in the right direction if they're likely hitting a performance bottleneck, but not annoy folks playing around in the REPL in global scope.

It's possible, but I still feel it would be better to simply fix the problem. People do want to do computations in the REPL and have it not be horribly slow. It's also not that uncommon to write scripts that do work at global scope, and while it's not a big deal, having to wrap everything in a function is annoying. I think that after ambiguity warnings, this is the lowest-hanging fruit for making the "Julia experience" nicer.

Would it be crazy to just not allow the type of globals to change?

Or maybe have them type-const by default, with a keyword (nonconst?, volatile) to indicate that the type can change.

Not at all: https://github.com/JuliaLang/julia/issues/964#issuecomment-7468492. But I suspect that it would be nicer to allow type annotations for now (partial mitigation of the issue) and then implement the optimistic type-stability assumption with invalidation of code that relies on the assumed type. It's more work, but it's a better experience for users.

:+1: :100: to @StefanKarpinski 's cunning plan with the invalidation of code. Could the type annotations be done in time for 0.4? (maybe wishful thinking)

Is there any way to be able to use optimizations with literals in the global scope?

function f(x)
  x^2
end

@code_warntype f(2)
Variables:
  #self#::#f
  x::Int64

Body:
  begin 
      return (Base.mul_int)(x::Int64, x::Int64)::Int64
  end::Int64

@code_warntype 2^2
Variables:
  #self#::Base.#^
  x::Int64
  p::Int64

Body:
  begin 
      return $(Expr(:invoke, MethodInstance for power_by_squaring(::Int64, ::Int64), :(Base.power_by_squaring), :(x), :(p)))
  end::Int64

I know that ^ can special-case integer literals, for example inlining x^2 as x*x, but it only does that in a function. In the global scope, wouldn't you still have enough information to do these optimizations, and wouldn't that at least help global-scope loops a little?

Type assertions are also a really good idea, or at least a typeconst. It might seem like over-typing, but people are essentially using const to do the same thing already, so it might as well work correctly. const has other side effects, making it essentially a trap for users who don't realize const is not just a performance enhancer for a constant type.

Potential proposal (needs some work, but it's the most promising thing we could come up with on today's triage call):

  1. Implement type annotations for globals (#964).
  2. Globals without explicit type annotations are implicitly type-const.
  3. Error if an explicit global type annotation is violated anywhere.
  4. Error if an implicit global type annotation is violated in non-top-level code.
  5. Warn if an implicit global type annotation is violated in top-level code.

For 1.0 only the behavioral part of this has to be implemented, but this scheme allows making globals fast with some relatively straightforward optimizations. Most globals don't change type, and this allows efficient code to be generated under that assumption. If those type assumptions are violated at the top-level, methods relying on those assumptions can simply be invalidated. This allows relatively convenient use at the REPL, while still fixing the global performance problem.

Error if an implicit global type annotation is violated in non-top-level code.

So this is saying that if a function tries to modify a global in a non-type-stable way, it will error. That sounds good. But

Warn if an implicit global type annotation is violated in top-level code.

So then

x = 2
x = 2.0

that will warn every time?

x = 2; x = 2.0 would warn, yes. I'm not sure what you mean by "every time"; it would warn every time you evaluate it at the top level, yes.

After discussion on triage, I propose we only implement item 1 (type annotations for globals, #964), and close this issue immediately. There doesn't seem to be any consensus that the other proposals here are necessary, or even uniformly useful.

The triage winds have turned here, and there are serious usability and feasibility concerns about this. Everyone agrees that #964 should be implemented, and with that you can conveniently express both that a global is constant and that it is type-constant (e.g. x::Int = 0). Performance could be fixed without changing the current semantics with clever on-stack replacement, since you could just invalidate and replace any method whose code depends on the type that is changed.
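The x::Int = 0 syntax from #964 was eventually implemented (typed globals landed in Julia 1.8). A sketch of its behavior as shipped:

```julia
x::Int = 0     # typed global: the binding's type is fixed to Int

x = 2          # fine
x = 3.0        # converted via convert(Int, 3.0); x === 3
# x = 2.5      # would throw InexactError: no exact conversion to Int
# x = "oops"   # would throw: no conversion from String to Int
```

Assignments go through `convert` followed by a type check, so compiled code can rely on the binding always holding an `Int`.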

There's still a performance issue here, we just decided not to make any semantic changes.
