Julia: Notation for lazy map

Created on 2 Nov 2016 · 58 comments · Source: JuliaLang/julia

The notation f.(v) effectively maps f over an array v. This operation is not lazy; it materializes the mapped array.

I think it could be useful to have an analogous notation for lazy maps. Maybe f..(v)? Instead of materializing the mapped array, this should return something equivalent to the generator:

(f(x) for x in v)

For example, this could be useful to do something like:

sum(f..(v))

which is more efficient than materializing the intermediate array in:

sum(f.(v))

Of course right now sum takes an optional function argument, so one can write sum(f,v). But see the discussion here: https://github.com/JuliaLang/julia/issues/19146. If one decides to remove the method sum(f,v), I think the notation sum(f..(v)) could be a nice alternative.
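For concreteness, a minimal sketch of how the three spellings discussed above compare today (f and v below are placeholders):

```julia
# The three spellings discussed above; `f` and `v` are placeholders.
f(x) = x^2
v = 1:5

sum(f.(v))            # materializes [1, 4, 9, 16, 25], then sums → 55
sum(f(x) for x in v)  # lazy: the generator is consumed directly → 55
sum(f, v)             # the existing two-argument method → 55
```

The generator form already gives the lazy behavior; the proposal is about giving it a dot-call-like spelling.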

Labels: broadcast

All 58 comments

Ref. discussion around https://github.com/JuliaLang/julia/issues/16285#issuecomment-237320513 (generalizing compact broadcast syntax to compact higher-order function syntax, relevant given that the behavior mentioned above looks like mapreduce). Best!

That's the problem with special syntax; there's never enough of it. I think the dots are pretty obscure in this case. What might be nice, and has been proposed before, is f(2x + y) over x, which could expand to f(2x_ + y) for x_ in x.

f(2x + y) for x in x works and is not much more verbose than f(2x + y) over x, so I don't really see a big need for over.

(One annoyance is the precedence: shouldn't for have lower precedence than =? Currently, g = f(2x + y) for x in x gives a syntax error.)

In #16285 we discussed making sum..(f.(x)) (or maybe sum:(f.(x))) equivalent to sum(f, x). The advantage of this is that it would be fusing, e.g. sum..(f.(g.(x))) would be equivalent to sum(x -> f(g(x)), x).

I did not know about https://github.com/JuliaLang/julia/issues/16285.

Anyway, I prefer a lazy map to a materializing map. IMHO, f.(v) should return the generator:

(f(x) for x in v)

If I wanted to materialize it, I could write collect(f.(v)).

On the other hand, I cannot "unmaterialize" f.(v) if it returns an array. Hence I think lazy f.(v) is more general and could be more useful. Additionally, "fusing" is automatic with lazy f.(v).

f.(v) is a broadcast operation, which is more general in other ways than map or generators. (The functionality of map and broadcast intersect, but neither subsumes the other.)

It is an interesting question whether one could make a lazy broadcast that is as efficient as a fusing materializing broadcast. That is, suppose f.(x,y) returned an AbstractArray subtype that lazily computed its elements. I'm guessing that the answer in practice is "no", unfortunately.

For example, suppose that you cascade a sequence of these operations: z .= abs.(x.^2 .+ sin.(sqrt.(y .* 2))). Our fusing materializing broadcast can completely erase the overhead of the composition, and gives performance essentially identical to a handwritten loop overwriting z in-place. Instead, imagine that each of these operations created a lazy AbstractArray, so that indexing the right-hand side of the assignment cascaded through a sequence of lazy getindex calls. Although no large temporary arrays would be allocated, I think we could have a hard time convincing the compiler to inline and elide all of those getindex calls to eliminate the overhead.
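As a sketch of the point above, here is the fused expression next to the hand-written loop it compiles down to (x and y are made up; the in-place z .= form behaves the same, minus the allocation):

```julia
x = rand(100); y = rand(100)

# Fused broadcast: a single loop, no temporaries for x.^2, sqrt.(...), etc.
z = abs.(x.^2 .+ sin.(sqrt.(y .* 2)))

# The hand-written loop it is equivalent to:
w = similar(x)
for i in eachindex(w)
    w[i] = abs(x[i]^2 + sin(sqrt(y[i] * 2)))
end
```

Both compute identical results element for element; the question raised above is whether a cascade of lazy getindex calls could reliably be inlined back into exactly this loop.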

Yep --- with laziness, performance variance tends to go up.

Pardon my ignorance, but is it possible to "fuse" the generators (as is done in the materializing broadcast)? That is, is it possible to fuse

abs.(sin.(v))

into

(abs(sin(x)) for x in v)

instead of

(abs(x) for x in (sin(y) for y in v))

Can this be done automatically by the compiler? Then the efficiency issues of the lazy f.(v) would go away, right?

Of course; if broadcast were lazy the exact same fusing would still happen. I believe the efficiency concerns are, for example, recomputing when the same element is accessed twice. With the lazy version some code would get faster but some code could get slower.

My point is that with lazy f.(v) you can materialize if you want (collect(f.(v)), which is still pretty concise), but if f.(v) materializes, you are forced to use generators to get the lazy version.

It's a matter of taste then:

Either you prefer short syntax for materializing broadcast, and use long syntax for lazy map (generators) (This is the way it is now).

Or you prefer slightly larger syntax for materializing broadcast (collect(f.(v))) and short syntax for lazy broadcast (f.(v)).

I had been playing with the idea of explicit (higher-order?) function composition (#17184, https://github.com/FugroRoames/CoordinateTransformations.jl) and I find that creating compositions and then (possibly later) applying them to data is an incredibly powerful way of doing "lazy" evaluation. The idea is that f ∘ g makes a new function (or function-like object) such that (f∘g)(x) = f(g(x)). It would be straightforward to create a similar operator, let's call it .∘ (probably bad syntax), which chains together broadcasted functions in a similar way ((f.∘g)(x) = f.(g.(x)))
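A minimal sketch of that operator (the name bcomp below is made up, since .∘ is not currently parseable):

```julia
# Hypothetical broadcasted composition: bcomp(f, g)(x) == f.(g.(x)).
bcomp(f, g) = x -> f.(g.(x))

h = bcomp(abs, sin)
h([-1.0, 2.0]) == abs.(sin.([-1.0, 2.0]))  # true
```

Note this sketch is eager when finally called; the lazy variant described next would instead build up a deferred object.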

This leads to a few interesting ways of doing things - you can (a) construct the entire chain of functions until the point you want the "full" output relatively easily and then call that, or (b) wrap these up in some kind of "lazy" broadcasting composition (maybe with another operator), or (c) perform static (or dynamic) analysis of the function chain to optimize it, in the same way e.g. SQL will optimize a query. I had started playing with this last idea but haven't had any time to commit to it - but in principle it would be a bit like Query.jl but without any macros.

I suspect that (b) is very close to what you want - and the great thing is that you don't need _any_ further support from the parser, the compiler or Base to achieve it. Just choose one of the many available operators and have fun!

Now that we internally have a lazy broadcast object via #26891, it might be nice to have a syntax so that fused dot calls like x .+ f.(x) lower to the lazy wrapper without the materialize(...) call. cc @mbauman.

Available (non-breaking) syntaxes seem to include .(x .+ f.(x)), (for x .+ f.(x)), @lazy x .+ f.(x).

One could also make a macro @lazy or @lazydot that is analogous to @views in that it would make all dot calls in a block of code lazy.

In #16285, the syntaxes sum:(x .+ f.(x)) and sum..(x .+ f.(x)) were also suggested. Since : and .. are already infix operators, however, this could in principle be breaking.

I would also love something like run(bc::Broadcastable) (perhaps with special syntax), which acts like a 'broadcasted foreach' for functions with side effects. In contrast to materialize(bc::Broadcasted), which allocates and populates an output array, it would simply call the broadcasted function with the appropriate arguments while discarding the return values.

Perhaps semicolon syntax could be used as a sign that the expression is to be run for side effects alone: x .+ f.(x);

[edit: on second thought, I don't think this would be as simple as it seemed, since you could imagine saying a = x .+ f.(x); which would want to materialize]

A macro could be a simple solution, e.g. @discard f.(x, y, z).

@tkoolen, that seems to be subsumed by a non-materializing broadcast syntax, since then you could just use foreach

For example, if we adopt the (for ...) syntax for non-materializing broadcast, you could do foreach(for g.(f.(x))).

foreach(for g.(f.(x)))

That'd be a bit of a pun since Broadcasteds aren't callable… and if they were, I'd expect foreach(::Broadcasted) to be executed exactly once (like foreach(println)). We could probably look past that, but the technically correct version would be something like:

bc = (for g.(f.(x))) # or whatever
foreach(i->bc[i], CartesianIndices(axes(bc)))

@mbauman, my thinking is that the unmaterialized Broadcasted objects should be iterable so that you can pass them to anything expecting the iteration protocol.

(Right now, we don't have a single-argument foreach function, but it makes sense to me to have foreach(itr) be equivalent to foreach(identity, itr).)
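That single-argument method would be a one-liner; a sketch (the name foreach1 is made up here, since Base has no such method):

```julia
# Hypothetical single-argument foreach: iterate purely for side effects.
foreach1(itr) = foreach(identity, itr)

out = Int[]
foreach1(push!(out, i) for i in 1:3)  # iterating the generator runs push!
out == [1, 2, 3]  # true
```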

Ah, yes of course. I'm blinded by sitting too close to the code. Broadcasteds aren't (yet) iterables, but let's make them so.

That said, we do have a zero-argument foreach:

julia> foreach(()->println("hello"))
hello

It's just the degenerate 0-arg case in a vararg, calling f once with no arguments. With an iterable bc::Broadcasted, I'd spell this foreach(identity, bc).

I think having Broadcasted objects be iterable makes a lot of sense regardless of this particular use case. However, having identity as the first argument of foreach is a little strange in my opinion. In the case of f.(x, y, z) I'd expect the f to be the first argument. I guess you just have to use something like the identity version if you need full support for recursive broadcasting, but maybe there's a way to make the non-recursive case more visually appealing? In this case, you could also have

foreach(args -> f(args...), @lazy tuple.(x, y, z))

which, although not that pretty, at least puts the f in the 'right' position. If there were a way to just have f as the first argument, that would be preferable in my view, as it would work nicely with do syntax.

I guess maybe with the new tuple destructuring, do syntax wouldn't be so bad:

foreach(@lazy tuple.(x, y, z)) do (a, b, c)
    println(a + b * c)
end

There is nothing preventing us from implementing a single argument version of foreach for Broadcasted objects. Or a different function. That issue is far down on my priority list because it requires no language changes or breakage.

The first question is how to get a non-materializing broadcast; I’m partial to the for syntax because it is terse and reminiscent of generator syntax. But it’s far from the only choice.

The other prerequisite is to make Broadcasted objects iterable.

The other prerequisite is to make Broadcasted objects iterable.

Seems vaguely reminiscent of https://github.com/JuliaLang/julia/issues/18618? And since we're opting towards defining broadcast in terms of iteration (https://github.com/JuliaLang/julia/pull/26435, akin to map), this seems reasonable. Is there anything I'm missing (e.g. do we need to triage this issue)?

There are three different aspects of Broadcasted's "iterability." Internal implementation, exposed API, and supported arguments. My current plan is to:

  • Base the Broadcasted internal implementation on indexing — status quo
  • Support an iterable API (iteration over Broadcasted) simply by proxying to getindex — that's #26987.
  • Support iterable arguments simply by collecting them before use — after 0.7 deprecations are removed.

While it feels like we should be able to refactor Broadcasted to work purely based upon iteration alone (and not indexing), I think it'd be really hard and possibly impossible for some combinations of iterable arguments. Doing so performantly is even worse. We'd have to very carefully structure the order of iteration and nested loops so we never need to backtrack up dimensions. Also it _would change the semantics of broadcast_. For example, with a pure iteration implementation, in f.(1:3) .+ g.([1 2]) .+ h.() I'd expect f to execute three times, g to execute twice, and h to execute once. As it stands, all three functions execute 6 times. Folks rely on this behavior for things like A .*= rand.(). So I'm content to leave this alone.
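That call-count semantics is easy to demonstrate (a sketch; the counter and the zero-argument h are made up for illustration):

```julia
# Each broadcasted function runs once per element of the 3×2 *result*
# shape, not once per element of its own arguments.
calls = Ref(0)
h() = (calls[] += 1; 0)

(1:3) .+ [1 2] .+ h.()  # a 3×2 result
calls[] == 6  # true: h ran six times, not once
```

This is the same per-destination-element behavior that makes A .*= rand.() fill A with independent random draws.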

Now that broadcasted objects are iterable, we should bikeshed a syntax for a non-materializing dot call. Just to start things off, maybe a non-binding straw poll of some non-breaking options (which eliminates f..(x)):

  • :thumbsup: for for f.(x) (much like the current x for y in z generator syntax)
  • :tada: for .(f.(x))
  • :heart: for @lazy f.(x) (collides with Lazy.jl) or @lazy f(x) (acting like @.)
  • 😕 for none of the above

Wouldn't for f.(x) collide with the syntax for an actual for loop?

@cossio, no, it would be handled much like the generator syntax. Just like you can currently do sum(x^2 for x in x), you could do sum(for x.^2).

And just like a generator, you would disambiguate in other circumstances (as needed) by using parens, e.g. we currently do x = (y.^2 for y in 1:3), and similarly you could do x = (for y.^2).

sum(for x.^2) and x = (for y.^2) are currently syntax errors, so there would be no breakage.

It doesn't seem to me that (for f.(x)) is a good use of for: in f(x) for x in X, that keyword is used before the variable representing a single entry in the input, but here it would appear before the transformation instead. That's not really logical. If anything, it should rather be (f.(x) for), which is of course weird.

At least @lazy would be obvious, with the drawback that it's verbose. We could also use @view/@views, as the result would be a kind of view of the broadcast operation (and you're likely to use array views at the same time as lazy broadcast if the expression also involves indexing).

I also kind of like the idea of sum:(f.(x)) or sum..(f.(x)), but ..(f.(x)) looks funny in isolation and :(f.(x)) is already taken, so unfortunately it's not really a solution for creating lazy Broadcasted objects in general.

With the for f.(x) syntax I was thinking back to #2675: if you write for X rather than for x in X, it is pretty natural to interpret X as an iterator.

I agree that @lazy has a comprehensibility advantage. One option that might help with its verbosity would be to make @lazy act like @. in implicitly adding dot calls. That is, you would just write sum(@lazy f(x^2)) etcetera and it would "add the dots" for you.

sum:(f.(x)) and sum..(f.(x)) would be breaking changes, so they are non-starters unless we want to wait for Julia 2.0.

By the way, should broadcasted now be exported (to match broadcast, for people who prefer the functional interface)? Should it be exported from Broadcast but not Base (to match Iterators)?

One potential downside for @lazy, is that it's really easy to imagine situations where mixed dotted and lazy calls are wanted, and this syntax choice still punts on that.

I think f.:(x) is available though, isn't it?
.: is a cute little pattern of 3 dots.
A colon has a kind of natural "the thing following this is a list" impression.

I think f:.(x) is available too, though I find that harder to read because the colon is not next to the open bracket.

Is it used in something like Base.:(==)(a::Foo, b::Foo) = a.bar == b.bar?
I feel like it might be, but if so, I think it is unambiguous, since it appears there as a value.

@oscardssmith, if @lazy adds dots to function calls (like @.) you could define it so that literal dot calls are not lazy. i.e. if you did @lazy f(g(h.(x))) then the f and g calls would be fused and lazy, but h.(x) would be an ordinary dot call.

And like @. you could splice calls that you don't want to be affected, e.g. sum(@lazy sqrt($sort(x))) would call sort in the ordinary way.
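This is the same $-splicing that @. supports today; for example:

```julia
# With @., a call marked with $ runs as an ordinary (non-dotted) call:
x = [4.0, 1.0, 9.0]
@. sqrt($sort(x))  # sort(x) runs once; only sqrt is broadcast
# equivalent to sqrt.(sort(x)) == [1.0, 2.0, 3.0]
```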

The other nice thing about @lazy is that it could be implemented now, in a package, and folded into Base in 1.1. (The only annoyance there is the conflict with Lazy.jl. What would the package be called?)

What would the package be called?

Experimental/Future modules have been discussed for this purpose IIRC. This functionality seems like a perfect candidate for such? Best!

Another option could be to reconsider @cossio's earlier comment, i.e. to make the lazy version be the default and to require a collect to materialize. In my opinion, this is the cleanest solution because it requires no new syntax or macros, but of course it's a big change.

By the way, should materialize just be methods of collect at this point?

collect might be the wrong function here, since it is defined to always return an Array while broadcast is defined to return a scalar (2 .+ 1), tuple ((1, 2) .+ (3, 4)) or an Array, with the possibility of other custom types as well.
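The three cases mentioned, for concreteness:

```julia
# Broadcast's result type follows its arguments; collect would force an Array.
2 .+ 1            # 3       — a scalar
(1, 2) .+ (3, 4)  # (4, 6)  — a tuple
[1, 2] .+ [3, 4]  # [4, 6]  — a Vector
```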

Making the lazy version the default would be a huge breaking change. Not going to happen prior to Julia 2.0.

Based on the idea in @dawbarton's _lazy function used for implementing the @lazy macro (see https://discourse.julialang.org/t/19641/20), I just noticed that the Broadcasted can be extracted simply with a dot call, even without a macro:

"""
    air.(broadcasting_expression)

Broadcast without materialization.
"""
air(::Any) =
    error("Function `air` must be called with a dot as in: ",
          "`air.(YOUR .+ BROADCASTING .* EXPRESSION ./ HERE)`")

struct Aired{T}
    value::T
end
Broadcast.broadcasted(::typeof(air), x) = Aired(x)
Base.materialize(x::Aired) = x.value

then:

julia> air.(sin.(1:3))
Base.Broadcast.Broadcasted(sin, (1:3,))

Having said that, a shorter syntax for doing this would be nice...

Yes, though the reason I went for a macro rather than a dot call was because I don't like the obscuring of the meaning of the dot call. For some dot calls to broadcast and others to do something different (e.g., return the Broadcasted object) just seems to be a recipe for confusion/unclear code. Hence my preference for the macro which clearly indicates something different is going on.

This would be nice to have, and these ideas of messing with broadcasting are clever.

For summing these (as in the discourse thread above) note that they aren't as fast as they might be:

julia> using BenchmarkTools, LazyArrays

julia> a = rand(10_000); b = rand(10_000); c = b';

julia> @btime sum($a .+ $b);
  4.263 μs (2 allocations: 78.20 KiB)

julia> @btime sum(air.($a .+ $b)); # or @lazy $a .+ $b
  8.733 μs (2 allocations: 48 bytes)

julia> @btime sum(BroadcastArray(+, $a, $b)); 
  1.471 μs (3 allocations: 80 bytes)

Regarding reductions of air-style lazy broadcasts, I see four (edit: 5) problems/questions:

  1. When do we check for DimensionMismatch? Current air checks on access, but there would be an argument for eagerly checking shapes.
  2. Can we also avoid allocation of the tiny Broadcasted object? This requires a little massaging. Afaict the Broadcasted object is only used on the error paths, so maybe the optimizer can fix this for us? (goal: new statements that are only used in error paths (inverse-dominated by a throw) get duplicated and deferred into the error path)
  3. Type stability (the issue of empty reductions). Our goal is that it is easy for users to write reductions that perform well, and eltype is not available.
  4. Simd.
  5. Reductions over certain dimensions, like sum(air.(a.*b), dims=1)

Regarding performance, the air variant performs pretty well for some custom reductions. We could probably get away with just specializing the most important cases for Broadcasted-objects.

julia> air(::Any) =
           error("Function `air` must be called with a dot as in: ",
                 "`air.(YOUR .+ BROADCASTING .* EXPRESSION ./ HERE)`");
julia> struct Aired{T}
           value::T
       end
julia> Broadcast.broadcasted(::typeof(air), x) = Aired(x)
julia> Base.Broadcast.materialize(x::Aired)=x.value

julia> function ndot(a, b)
           length(a) == length(b) > 0 || error()
           @inbounds s = a[1]*b[1]
           @simd for i = 2:length(a)
               @inbounds s += a[i]*b[i]
           end
           s
       end;

julia> Base.@propagate_inbounds function airsum(x)
           s = x[1]
           @simd for i = 2:length(x)
               @inbounds s += x[i]
           end
           s
       end

julia> airdot(a,b) = airsum(air.(a.*b));
julia> airdot_ib(a,b) = @inbounds airsum(air.(a.*b));
julia> airdotsum(a,b)=@inbounds sum(air.(a.*b))
julia> a=rand(10_000); b=copy(a);
julia> begin
       @btime ndot($a,$b);
       @btime airdot($a,$b);
       @btime airdot_ib($a,$b);
       @btime airdotsum($a,$b);
       end;
  3.784 μs (0 allocations: 0 bytes)
  3.798 μs (2 allocations: 48 bytes)
  3.786 μs (0 allocations: 0 bytes)
  18.308 μs (2 allocations: 48 bytes)

Here is my attempt along the lines of airsum, which also treats sum(air.(a .* b')):

julia> function bsum(bc)
           T = Broadcast.combine_eltypes(bc.f, bc.args)
           tot = zero(T)
           @simd for I in eachindex(bc)
               @inbounds tot += bc[I]
           end
           tot
       end;

Is there a nice way to make sum(air.(a .* b'), dims=1) work? Perhaps this can be number 5 on your list.

(Edit: shortened)

Maybe the right way to do this is for air.(...) to materialize into something <: AbstractArray. And the simplest way to do that would be to re-use the BroadcastArray, which I tried in this PR: https://github.com/JuliaArrays/LazyArrays.jl/pull/21

Can we use a macro @: for non-materializing broadcast? It seems this is syntactically invalid at the moment. But I suppose making it valid is not breaking? It's nice since it's short and does not introduce a new concept in the syntax (I think?).

I created a patch #31088 to construct a broadcasted object with @: to see if adding it does not break the current syntax in Julia. Combined with #31020, we can do sum(@: x .* y) to compute the dot product which is almost as fast as LinearAlgebra.dot but is more flexible.

Thanks for pushing this forward!

Regarding the syntax, I'm not sure I like @:. It looks a bit like @., yet unlike @. it doesn't turn function calls into dot-calls: it only affects existing dot-calls. And : isn't related to broadcasting in Julia, but to ranges. Above it was suggested to use @lazy, which is more explicit. I've also proposed adding this meaning to @views, given that a Broadcasted object is a kind of view on its inputs (i.e. it avoids copies, and modifying the inputs will affect the result).

Thanks for the comments!

And : isn't related to broadcasting in Julia, but to ranges.

I see @: as related to quoting rather than to ranges. That is to say, :(expression) "guards" against evaluation of the expression, and @:(expression) "guards" against materialization. @: has two dots in it, so it's not hard to relate it to dot calls.

Above it was suggested to use @lazy, which is more explicit.

I think explicit is great, but I don't think longer means better. For example, the lowered form of a broadcasting expression is quite long, yet it's harder for humans to understand. I think @: is concise enough not to distract from the important part of the code, yet salient enough to visually signify the "unusual" part.

Also, I think @lazy is too general a term, because there are too many lazy things in programming. It's already been noted that the name clashes with the one from Lazy.jl. It's better to have a more specific notation.

I've also proposed adding this meaning to @views, given that a Broadcasted object is a kind of view on its inputs (i.e. it avoids copies and modifying the inputs will affect the result).

I think this is a great idea because it can be applied to multiple broadcasting expressions in a block. At the same time, it's also nice to have a single-purpose macro like @view.

This is great, thanks for tackling it @tkf! On the name, I find @: to be quite clever. Perhaps it is too clever by half, but I do like it. Adding this behavior to @views would unfortunately be breaking — lots of folks use @views in conjunction with dot fusion to work around the lack of a broadcasted getindex, and this would drastically change the return type to something that's no longer even an AbstractArray.

Some ideas: @,, @gen, @...

FWIW, the three macros in that family @:, @., @views play well together: sum(@: @. @views a[I] * b[I]) works as expected in any of the six permutations.

Adding this behavior to @views would unfortunately be breaking

Good point. It has to be a new macro if we were to do this. But people can experiment with the interface in libraries if we have good supports for lazy broadcast.

Another motivation for this notation is that it is useful for defining "broadcastable call overload": https://discourse.julialang.org/t/best-practice-for-broadcastable-callables/21000

The new syntax for f.(args...) proposed by triage https://github.com/JuliaLang/julia/pull/31088#issuecomment-473030504 is implemented in #31553

Another solution is to implement an ArrayReducer struct which implements the reducing operation, but acts as if it had an appropriate size. See https://discourse.julialang.org/t/is-there-something-like-broadcast-mapreduce/6076/15 (I think this is very clever, is that you @meggart?). Then you can just call the ordinary broadcast! on it. I think it's an option worth considering.

We "just" have to decide the syntax (which is probably always the hardest part...). Unfortunately, it seems that there is no consensus in Triage https://github.com/JuliaLang/julia/pull/31553#issuecomment-479758476. I tried to push the discussion by implementing proof-of-concepts but I'm "out of bullets" ATM.

Regarding the _implementation_ of broadcasted reduction, there is already #31020. I believe specializing mapreduce (and mapfoldl) is the right direction to go (rather than "hacking" broadcast! as done in ArrayReducer, IIUC).

I would second that. The ArrayReducer hack was done before the big 1.0 broadcast changes, which solved my immediate problem so I agree that it would not add anything significant here, and in particular #31020 is much more powerful.

Good luck anyway in finding a syntax for the lazy map, I would be eager to have it in Julia Base as well.

I think we do want something here, but this is a pretty deep feature and I think it doesn't hurt to let the idea simmer in the collective consciousness for a while. I'm just writing this to make it clear that people aren't against this, nor are we stonewalling, this is just the kind of feature that requires some significant rumination so that you don't end up baking in something that is good but not the best.
