Julia: sum(i for i in Vector{Int}()) does not return 0

Created on 25 Jun 2018 · 28Comments · Source: JuliaLang/julia

sum(i for i in Vector{Int}()) raises an error instead of returning 0. The sum of an empty list is zero (mathematically). Moreover in this case the return type (Int) is also clear. Therefore I don't see why this should not return 0 (Int). Of course the same applies to any number type.

Source

cossio

Most helpful comment

This is something that I often trip over as well. +1 for resolving it.

I would be quite comfortable putting much of the burden on the user by prescribing and init keyword argument as for reduce. I.e. if I anticipate that a collection might be empty, then I will call

sum(i for i in Vector{Int}(); init=0)

cortner on 22 Jul 2019

👍4

All 28 comments

Related to #18695 and #18873 (generators don't have an eltype).

The sum of an empty list is the additive identity ("zero"), but the value of the additive identity may not be 0 in general. For example, if sum an empty list of 2×2 integer matrices, the result should be a 2×2 matrix of 0. But I agree that in this case, in principle, the correct type of zero could be inferred.

stevengj on 25 Jun 2018

I guess it should return "zero" (of type T, if that is a matrix or whatever) whenever the type T can be inferred.

cossio on 25 Jun 2018

"For empty collections, it may throw an error or return zero(T), with T being the element type of the collection, depending on the state of the world." I don't think I like that kind of semantic. It's too likely to lure you into believing it works today just because it worked yesterday.

martinholters on 25 Jun 2018

Since sum(Int[]) === 0 it would not be crazy for this to do the same.

StefanKarpinski on 25 Jun 2018

👍2

The root problem here is

julia> eltype(i for i in Vector{Int}())
Any

Inference works in this particular case:

julia> g = (i for i in Vector{Int}())
Base.Generator{Array{Int64,1},getfield(Main, Symbol("##7#8"))}(getfield(Main, Symbol("##7#8"))(), Int64[])

julia> Base.return_types(g.f, Tuple{eltype(g.iter)})
1-element Array{Any,1}:
 Int64

but it may be tricky to use this information without making sum(::Generator) type-unstable in general.

stevengj on 25 Jun 2018

👍3

Indeed. See also discussion about element type of generators at https://discourse.julialang.org/t/can-eltype-deduce-the-element-type-of-a-generator/6429.

nalimilan on 25 Jun 2018

We do generally make a best effort for empty collections based on inferred return types though.

StefanKarpinski on 25 Jun 2018

The problem here is that we'd need zero(Any) for the case where inference doesn't succeed in finding a tight bound.

martinholters on 25 Jun 2018

Sure but why can't the error only happen in that case?

StefanKarpinski on 25 Jun 2018

👍2

Because that would probably make people rely on inference.

Imaginary bug report:
"I have bar(n) = sum(foo(x) for 1:n), and bar(0) used to work, but since I updated my dependency Foo to v0.34, it throws an error. Interestingly, the error goes away if I comment out using Bar in my code"
Reply:
"That's by design. Your function foo is type-unstable and calls bork with a parameter that is inferred as Any. Now, bork returns an Int, which can be inferred and all is fine. But Foo (since v0.34) and Baralso define methods of bork. These, too, return an Int, but inference doesn't know which method is called and there are too many now to infer them all, so the return type is widened to Any."

That does not sound too attractive to me. (Also, figuring all that out probably cost someone a good amount of time.)

Making the element type of empty arrays depend on inference already was a measure of last resort. Deciding whether to throw an error or not depending on inference seems way too risky to me. Although I admit it will work most of the time, encouraging people to write code that works _only most of the time_ is questionable.

martinholters on 25 Jun 2018

👍2

That is true in every case were we use return_type to determine what to do in the empty collection case. Is the argument that this is worse because it's an error whereas choosing an element type doesn't necessarily cause an error—although it might if your code was previously inserting a value where a change to inference causes a tighter type to be inferred.

StefanKarpinski on 25 Jun 2018

Yes. Chances are that the element type of an empty collection does not matter at all. Furthermore, the element type for the empty case may only be wider than in the non-empty case, so inserting an element should usually not be trap. An example where you may get a sudden error would be sum(foo.(x)) for empty x (or sum(collect(foo(i) for i in Int[])), if you try to workaround the issue discussed here).

So for the collect/broadcast reliance on return_type in the empty case, there is a high chance of a change in the inference to Any going unnoticed/only causing a performance penalty. For sum, that chance would be 0.

IMO, a much saner approach would be to let sum take a v0 (or init, ref #27711) parameter one can pass in situations like this: sum(i for i in Vector{Int}(), init=0).

martinholters on 26 Jun 2018

👍2 ❤1

Maybe sum can take a type parameter that does two things:
1) ensures that all elements encountered are of this type (otherwise throws an error)
2) returns zero of this type if the iterator is empty

cossio on 26 Jun 2018

I guess it should return "zero" (of type T, if that is a matrix or whatever) whenever the type T can be inferred.

Even if the type can be inferred, you may not be able to create an appropriate zero since array types don't carry their sizes with them. What is

out = sum(i for i in Vector{Matrix{Float64}())

? I mean, what is size(out)?

pkofod on 26 Jun 2018

Not being able to conjure a zero of some type is a different issue than not knowing what type of zero one wants. That can be a failure of zero(T) instead of a failure of sum.

StefanKarpinski on 26 Jun 2018

To follow up on the lines of thought hinted at by @martinholters and @StefanKarpinski: Perhaps the real problem is the lack of a generic representation of additive identity, which is a well-defined concept. Would it be crazy to have a singleton type to represent this concept in cases where the type is abstract? In relevant operations it could just promote to the appropriate concrete type.

benninkrs on 27 Jun 2018

We already have that: Bool — false is a "strong zero" and true is a "strong one" in the sense that false * x == zero(x) and true * x == x. We could potentially return false in these cases and it wouldn't be too bad since convert(T, false) is defined to give you zero(T) in general.

StefanKarpinski on 27 Jun 2018

The Bool has additional semantics not carried by zero (you can use the Bool in an if, but not the zero). Therefore returning false does not seem consistent.

cossio on 27 Jun 2018

We'd need a "strong zero" in the additive sense, and false + x == x is deprecated for arrays x and probably fails for unitful x.

martinholters on 27 Jun 2018

The idea of a strong zero seems analogous of I, the UniformScaling representing an identity matrix of any size.

cossio on 27 Jun 2018

It seems inconsistent that prod(()) == 1 is fine but sum(()) is an error. Either it returns zero (following prod), or prod(()) should also be an error, IMO.

cossio on 17 Dec 2018

This is something that I often trip over as well. +1 for resolving it.

I would be quite comfortable putting much of the burden on the user by prescribing and init keyword argument as for reduce. I.e. if I anticipate that a collection might be empty, then I will call

sum(i for i in Vector{Int}(); init=0)

cortner on 22 Jul 2019

👍4

That's a nice idea, @cortner 👍

StefanKarpinski on 22 Jul 2019

Would it be crazy to have a singleton type to represent this concept in cases where the type is abstract?

FYI I created https://github.com/tkf/InitialValues.jl to implement this approach for general foldl and reduce. I find it handy if you want to handle "zero" for generic operations.

But, in this case, I'm +1 for sum(...; init) as suggested by @martinholters and @cortner. In general, I think it makes sense to add init for any function that is a thin wrapper of foldl or reduce.

tkf on 22 Jul 2019

👍2

36188 adds `init` to `sum` etc.

tkf on 8 Jun 2020

👍2

36188 is merged so we can use `init` in `sum` now. Can this issue be closed?

tkf on 11 Jun 2020

Thanks @tkf! Will it make it into 1.5?

cossio on 12 Jun 2020

We already had 1.5 feature freeze like a few weeks ago, unfortunately. So, it'll be in 1.6.

tkf on 12 Jun 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Broken booleans as numbers?

TotalVerb · 3Comments

add @callsuper macro

StefanKarpinski · 3Comments

Should `where` be a keyword?

yurivish · 3Comments

Dates.parse(::AbstractString, ::DateFormat) removed without deprecation

omus · 3Comments

isposdef() is incorrect

wilburtownsend · 3Comments

Julia: sum(i for i in Vector{Int}()) does not return 0

Most helpful comment

All 28 comments

36188 adds init to sum etc.

36188 is merged so we can use init in sum now. Can this issue be closed?

Related issues

36188 adds `init` to `sum` etc.

36188 is merged so we can use `init` in `sum` now. Can this issue be closed?