Julia: [proposal] !-notation for variables

Created on 16 Mar 2018 · 22 Comments · Source: JuliaLang/julia

Note: This topic has been discussed extensively at https://discourse.julialang.org/t/notation-for-individual-modified-variables/9653. The proposal received some positive feedback there, so I'm reposting it here as a "more official" proposal.

Context

The ! style recommendation for functions that modify their parameters proves extremely useful to me.

Issue

For certain functions with many arguments, functionname!(foo, bar, baz) can be ambiguous: How do I know if foo, bar or baz gets modified, or if potentially multiple parameters get modified?

Proposal

I propose a style convention that clarifies which parameters will be modified. I imagine appending a ! to each modified parameter. If functionname! modifies foo and baz, its declaration would read functionname!(foo!, bar, baz!).
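
As a minimal sketch of what the proposed convention could look like in practice (the function scale_and_log! and its behavior are hypothetical, chosen only for illustration), the mutated parameters carry a trailing ! while the read-only one does not:

```julia
# Hypothetical example of the proposed convention: `foo!` and `baz!` are
# mutated, `bar` is only read.
function scale_and_log!(foo!, bar, baz!)
    foo! .*= bar             # scale `foo!` in place
    push!(baz!, sum(foo!))   # record the new sum in `baz!`
    return foo!
end

data = [1.0, 2.0, 3.0]
sums = Float64[]
scale_and_log!(data, 2.0, sums)   # both `data` and `sums` are modified
```

Note that the call site looks the same as it would without the convention; only the declaration changes.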

Perks

This convention could improve readability in two ways:

  • Function docs could be more readable, since one could immediately infer the modified argument(s), and would not have to rely on guesswork.
  • Function code could be more readable, since one can keep track of the mutated parameter, which is often the "most important" one.

In addition, someone proposed that if this became widely adopted, the compiler could use the information to speed up the code.

All 22 comments

I think this is an excellent proposal for a technique that could be profitably added as a good code practice to the Julia manual. Using avar! in the function signature, and, importantly, also in the code body provides the code reader at first sight with a useful hint that the data of the input (mutable struct or array) is being modified inside a function.

Note that function argument names have zero significance, so this won't help at the call site at all.

It will help the reader of the code.

I mean it won't help the reader of the function call at all.

I see what you mean (I think). But IDEs can help with that by providing the reader with a peek at the docs, which will say something like function foo!(a!, b!, c), no?

In other words, unless you expect all users to read the source code of the library they are using, which I don't think is reasonable, doing this in the code won't have any effect. If this is just about documentation, which is something that the user IS expected to read, I don't think this is the best way to do that either. The documentation should just be more explicit about what the function does if there's any such ambiguity.

I would disagree with that. Especially given that the bang says what you need to know in one line of code, no need to read the rest of the docs.

And for someone who NEEDS to read the code of a package, this could be real help.

The whole point of the bang in a function name is to give a hint directly in the source code (at the call site), and possibly to differentiate mutating and non-mutating versions. An argument name does neither of these.

And no, if you need to read the whole document to figure out what is mutated, and if that's important information for the function, then the documentation is not very well written (edit: though due to what I believe to be the case mentioned below, I don't think this should be a concern either).

Also, FWIW, I don't believe this information (which argument is mutated) is ever the most important thing you want to know about a function. It's much more useful to know what mutation the function is expected to do. I cannot think of a single case where a function mutates multiple arguments and you'd be able to use it correctly just by knowing which arguments are mutated. It's basically useless without more detailed information.

And for someone who NEEDS to read the code of a package, this could be real help.

No. Similar to above, which of several arguments is mutated is one of the least important pieces of information when it's not supplemented by docs or comments about what those mutations actually are.

I'm sorry, but I can't agree with this. Basically you're saying that without this hint you are better off than with it. That is in my opinion nonsense.

No, I'm saying the hint is useless.
The information it presents is indeed useful in the doc/comment when combined with the detail there, but not in the code directly, so it may as well be a normal part of the doc/comment.
As an extension to what I said, if you just want a more formal notation/wording for it in the doc, sure, that's fine, as long as it's not a style guide for the code. I still don't believe that having it as a short notation in the function signature in the doc is useful on its own without the actual doc, but this way it'll always be presented along with the doc (so it'll be more of a guideline for the doc) and it won't make the actual code messy.

The fundamental issue here is that unlike a function name, which you generally must use when calling the function, argument names are, by design, irrelevant when calling a function. So having a convention where the mutated arguments have any kind of special naming does not have the same signalling effect as a convention for function names does. In order to make it effective, there would need to be some kind of enforcement, which is not feasible in a dynamic language because it's the value you want to prevent mutation of, not the variable. For example, it's fine (and not uncommon) to do this:

function f(a!)
    a! = copy(a!)
    return f!(a!)
end

We would have to avoid raising a false alarm for this safe situation. One way to accomplish that would be to require mutable arguments to be locally const so that you can be sure that the name refers to the same value throughout, but that seems like a strange restriction to couple this kind of feature with. Moreover, it still wouldn't prevent this kind of thing: b = a!; f!(b). We could also enforce that assignment from a "mutable" variable like a! be to another mutable variable b!; however, that's very easy to "work around", e.g. b = maybecopy(a!). The maybecopy function might make a copy or might just return its argument. There's nothing illegal about that behavior – since it doesn't mutate its argument ever – but it completely stymies this rule. What it comes down to is that in order to enforce such a rule in any way, one needs a fully static type system for tracking aliasing and mutation. That would very rapidly cause Julia to have a type checker just as hard to learn and use as Rust's, at which point the language becomes a very different beast.
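
The aliasing problem described above can be sketched as follows. This is only an illustration: double! is a hypothetical mutating function, and maybecopy is written here with a hypothetical keyword argument copyit so that both of its possible behaviors can be shown deterministically.

```julia
# Hypothetical mutating function: doubles every element in place.
function double!(v)
    v .*= 2
    return v
end

# Hypothetical helper: never mutates its argument, but may or may not copy it.
maybecopy(v; copyit::Bool) = copyit ? copy(v) : v

a! = [1, 2, 3]

b = maybecopy(a!; copyit=false)   # `b` is the same array as `a!`
double!(b)                        # mutates `a!` through the plain name `b`

c = maybecopy(a!; copyit=true)    # `c` is an independent copy
double!(c)                        # leaves `a!` untouched
```

Nothing in the names b or c tells a reader, or a checker that only looks at names, which of the two calls to double! ends up modifying the array bound to a!.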

So the only thing that is really possible without turning Julia into a static language is some way of explicitly documenting which arguments are mutated and which are not. There is some question as to whether it makes sense to have syntax for that, but we have no other "just for documentation" syntax in the language so adding such a feature would be quite a departure and does not seem appealing to me at least. It seems better to be careful about documenting this behavior in the documentation itself.

Oh, one more thing about this _specific_ syntax proposal. Currently the ! naming convention is quite simple: if you see a name ending in a ! you can immediately guess that it is a function which mutates some of its arguments. Under this proposal you don't know that anymore. Instead, you know that it's either a mutating function _or_ that it's an argument which is mutated. One could argue that it's easy to tell the difference since one is a global and one is an argument, but that's not always true: what about the case of a higher order function that takes a mutating function and uses it for its side effect? That is a local binding to a mutating function but not an argument which will be mutated. E.g. mapslices!(f!, A, dims) – a (hypothetical) function which applies the mutating function f! to each slice of A.
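
A minimal sketch of that ambiguity, using a hypothetical helper eachcol_apply! (a simplified stand-in for the hypothetical mapslices! above, restricted to matrix columns): the parameter f! ends in ! because it is bound to a mutating function, not because it is itself mutated.

```julia
# Hypothetical higher-order function: applies the mutating function `f!`
# to each column of the matrix `A`. Here `A` is what gets mutated; `f!` is not.
function eachcol_apply!(f!, A::AbstractMatrix)
    for j in axes(A, 2)
        f!(view(A, :, j))   # `f!` mutates the column view, and hence `A`
    end
    return A
end

A = [3 1; 1 2; 2 3]
eachcol_apply!(sort!, A)    # sorts each column of `A` in place
```

Under the proposed convention a reader could take f! to be a mutated argument, while the argument that is actually mutated, A, would have to be renamed A! to stay consistent.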

Stefan,

These are all good points. Thanks for your perspective.

P

The names of function parameters (in the callee, not at the call-site) are more accessible than the source code. Consider the current output:

foovec = [];
@which push!(foovec, 1)
#push!(a::Array{Any,1}, item::ANY) in Base at array.jl:658

methods(push!)
# 22 methods for generic function "push!":
#push!(a::Array{Any,1}, item::ANY) in Base at array.jl:658
#push!(B::BitArray{1}, item) in Base at bitarray.jl:764
...

For functions that possibly mutate multiple inputs, this would become much more informative if the declaration had exclamation marks. Yes, this is a coding-style / documentation issue only; but at least I often misuse the method table as documentation (because quite often there are undocumented variants which might turn out to do the right thing!). And I really like the fact that parameter names, even though computationally meaningless, are preserved in such lookups; it would be even better if they had exclamation marks. This could be adopted in the official style guide, and eventually in most of Base (since this is a source-style-only question, the big renaming could go pretty slowly).
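
To illustrate how such names would surface in a method lookup, here is a small sketch with a hypothetical function copy_scaled!. The exact text printed for a method differs between Julia versions, so no output is shown; the point is only that the parameter names from the declaration appear in it.

```julia
# Hypothetical function following the proposed naming convention:
# `dest!` is overwritten, `src` and `factor` are only read.
function copy_scaled!(dest!, src, factor)
    dest! .= src .* factor
    return dest!
end

# Parameter names are kept in the method table, so the `!` marker is
# visible without opening the source file.
m = first(methods(copy_scaled!))
println(m)
```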

I find the problem itself pretty undermotivated – I'm not convinced that uncertainty about which argument is modified by a mutating function is such a big issue that we need something like this. There are also other ways that we could attach such documentation metadata without causing the straightforward mutable function naming convention to become ambiguous.

I think there's a better way to do this that requires no special syntax:

struct Change{T}
    x::T
end

# A plain argument is left untouched; a Change-wrapped argument is mutated.
push(x, args...) = _push(x, args...)
push(x::Change, args...) = _push!(x.x, args...)

Here _push and _push! are unexported. This is mostly useful for refactoring, and could potentially halve the number of exports.
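
A self-contained sketch of this wrapper idea might look like the following. The bodies of _push and _push! are assumptions made for illustration (the comment above does not define them): _push is taken to work on a copy and _push! to forward to Base's push!.

```julia
# Wrapper type marking an argument as "allowed to be changed".
struct Change{T}
    x::T
end

# Hypothetical unexported helpers (their definitions are assumed here).
_push(x, args...) = push!(copy(x), args...)   # non-mutating: works on a copy
_push!(x, args...) = push!(x, args...)        # mutating: modifies `x`

# One exported name; dispatch on `Change` selects the mutating variant.
push(x, args...) = _push(x, args...)
push(x::Change, args...) = _push!(x.x, args...)

v = [1, 2]
w = push(v, 3)        # returns a new vector, `v` is unchanged
push(Change(v), 4)    # modifies `v` in place
```

With this design the mutation is visible at the call site as Change(v), which is where the earlier comments argue the signal matters most.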

Sorry, I don't see how this is beneficial. Could you elaborate please?

Maybe it isn't then. I seem to remember that for a while we had functions like A_mul_B! and A!_mul_B. Instead we could have Change(A) * B and A * Change(B).

One issue here is that the vast majority of mutating functions just mutate their first argument. It's pretty rare for a function to mutate multiple arguments, or for there to be ambiguity about what is mutated. So a convention for this is not necessarily warranted.

@bramtayl : thanks, I just could not see where you were going with it. This seems actually quite useful.

Thanks y'all for the insights. I'm convinced now that there are better ways to go about the things this proposal tries to address. Closing this proposal.
