Julia: make numbers non-iterable?

Created on 8 Aug 2014  ·  75Comments  ·  Source: JuliaLang/julia

@StephenVavasis has pointed out some rather confusing behavior of the in operator, including:

julia> VERSION
v"0.3.0-rc2+12"

julia> x = IntSet([3,5])
IntSet([3, 5])

julia> in(3,x)
true

julia> in(x,3)
false

julia> in("abc",19)
false

julia> in(19,"abc")
false

Worse still is this:

julia> 97 in "abc"
true

This issue is to discuss what, if anything, we can do to reduce some of this confusion.

breaking decision help wanted

Most helpful comment

My experience in trying to implement even a small piece of this pre-1.0 (#7903) leads me to believe that changing this would lead to a huge amount of code churn over the whole ecosystem. i.e. it wouldn't be worth it without huge benefits, which I haven't seen anyone articulate beyond "slightly confusing to some newcomers".

All 75 comments

x in y is pretty simple: does x == any element iterated by y.

The issue is not specific to in; 97 == 'a' is true.

I do think it would be good to make Char not an integer type. In theory having numbers not be iterable might be ok, but I think it will be very annoying in practice.

see also #5844

So I guess the problem is not specifically with `in' but with ==. I raised the issue that apparently meaningless uses of 'in' do not generate errors because I made silly blunder with 'in' in my project that did not generate an error message. For a similar reason (ease of debugging) the following uses of == should also generate errors, no?

julia> [3,4,5] == ["x","y"]
false

julia> IntSet() == ["a"=>7]
false

julia> 9 == Int64
false

julia>

Our == is total, defined on all pairs of values. I find this convenient,
and I think it is quite difficult to decide exactly which cases to exclude.
On Aug 7, 2014 8:15 PM, "StephenVavasis" [email protected] wrote:

So I guess the problem is not specifically with `in' but with ==. I raised
the issue that apparently meaningless uses of 'in' do not generate errors
because I made silly blunder with 'in' in my project that did not generate
an error message. For a similar reason (ease of debugging) the following
uses of == should also generate errors, no?

julia> [3,4,5] == ["x","y"]
false

julia> IntSet() == ["a"=>7]
false

julia> 9 == Int64
false

julia>


Reply to this email directly or view it on GitHub
https://github.com/JuliaLang/julia/issues/7903#issuecomment-51549500.

Jeff,

I guess it's OK to make == total, but there also ought to be a restricted
version of == that requires the two operands to have the same type in
order to catch programmer blunders. I would guess that it is rare among
scientific codes to have a need for == with unequal types. Is there such a
restricted version available in Julia? (And is there a similarly
restricted version of 'in'?)

-- Steve

On Thu, 7 Aug 2014, Jeff Bezanson wrote:

Our == is total, defined on all pairs of values. I find this convenient,
and I think it is quite difficult to decide exactly which cases to exclude.
On Aug 7, 2014 8:15 PM, "StephenVavasis" [email protected] wrote:

So I guess the problem is not specifically with `in' but with ==. I raised
the issue that apparently meaningless uses of 'in' do not generate errors
because I made silly blunder with 'in' in my project that did not generate
an error message. For a similar reason (ease of debugging) the following
uses of == should also generate errors, no?

julia> [3,4,5] == ["x","y"]
false

julia> IntSet() == ["a"=>7]
false

julia> 9 == Int64
false

julia>


Reply to this email directly or view it on GitHub
https://github.com/JuliaLang/julia/issues/7903#issuecomment-51549500.


Reply to this email directly or view it onGitHub.[7881863__eyJzY29wZSI6Ik5ld3NpZXM6QmVhY29uIiwiZXhwaXJlcyI6MTcyMzA3NzY0NywiZGF0YSI6eyJ
pZCI6MzkxNDA5MjB9fQ==--26abb0bc2f6cb83bf6826a0450d494ff66b93a18.gif]

Three =:

julia> 5 == 5.0
true

julia> 5 === 5.0
false

However that is for object identity, not equality. I.e. (zeros(3) ===
zeros(3)) is false

On Thursday, August 7, 2014, Tim Holy [email protected] wrote:

Three =:

julia> 5 == 5.0true
julia> 5 === 5.0false


Reply to this email directly or view it on GitHub
https://github.com/JuliaLang/julia/issues/7903#issuecomment-51553058.

One option is you can pick your favorite unicode equality-resembling symbol from this list https://github.com/JuliaLang/julia/blob/31eb8d432bc69f617091fbc0406d3aab442360e9/src/julia-parser.scm#L13 that is parsed with the same precedence as == but open for user definitions, and make that operator error on inputs of different types.

Earlier I wrote that it seems OK for == to work for all possible operands, but now I changed my mind. The (small) increase in expressive power does not offset the potential for enabling programmer blunders. One important mission of a programming language is to help prevent the programmer from shooting himself/herself in the foot, and in this case Julia needlessly fails to close a loophole.

About 25 years ago when I was a CS prof at Cornell, the issue came up (again) whether to switch our introductory programming course to C, and our faculty unanimously rejected the idea (again) for many reasons. One of the reasons was that we did not savor the possibility of undergrads lined up outside the TA office for help with their assignments because they wrote
"if (x=0)" in their homework (instead of "if (x==0)") . The same rationale still applies to Julia 25 years later; it should not be possible to compile a program with in(x,3) where x is an IntSet, nor a program with x==t, where x is an Int and t is a Dict.

It seems it would be better to add restrictions to in than to ==. For example, in could require that the second argument is a collection rather than a scalar.

I also agree that Char shouldn't be considered as an integer type. There's Uint8 for this use case.

I find the case for restricting in much more convincing than for ==.

With ==, people will write code to look for a certain value, like x == 0. In a very generic context, you might not know what x could be. This applies even more for sentinel values, e.g. x == :none. It would be too much of a gotcha to require also checking that the comparison will be valid, e.g. if isa(x,Symbol) && x == :none. Also imagine a Dict{Any,Any}, where arbitrary keys need to be compared.

In other words, it would be too difficult to get back the current behavior. You would need something like if comparable(x,y) && x==y, but it's not clear how to write comparable. In contrast, it is currently fairly easy to add extra restrictions if you want.

Another (unrelated) issue in() seems to have is that in(x, s::Set) calls haskey to check equality (see set.jl:16), ignoring user-defined == and isequal.

Please let me know what you think about this, and feel free to move this elsewhere if appropriate.

Here's an example:

julia> VERSION
v"0.2.1"

julia> immutable Edge # edges of an undirected graph
       a :: Integer
       b :: Integer
       end

julia> ==(a::Edge, b::Edge) = (a.a == b.a && a.b == b.b) || (a.a == b.b && a.b == b.a)
== (generic function with 47 methods)

julia> import Base.isequal

julia> isequal(a::Edge, b::Edge) = a == b
isequal (generic function with 34 methods)

julia> edges = [Edge(1,2), Edge(2,3), Edge(3,1), Edge(1,3)]
4-element Array{Edge,1}:
 Edge(1,2)
 Edge(2,3)
 Edge(3,1)
 Edge(1,3)

Now, compare

julia> in(Edge(2,1), edges)
true

and

julia> in(Edge(2,1), Set(edges))
false

I did not expect this.

This also breaks unique (try unique(edges)), which uses in(x, s::Set) internally.

In this particular case a simple workaround is to define the (inner) constructor

Edge(a::Integer, b::Integer) = a < b ? new(a,b) : new(b,a)

and use default == and isequal, but a solution like this one may not be possible in for other user-defined types. (This reminds me of a discussion of the relationship between "a == b" and "hash(a) == hash(b)".)

PS: As far as I can tell recent code in the master branch should show the same behavior.

You also need to implement hash for sets and dicts to work properly with user-defined ==.

Hm, perhaps Set should be called HashSet to preserve the mathematical meaning of "Set"? Overall, the mathy types aren't fully internally consistent, IMO.

Could you elaborate on what is a "mathy" type, and what the problems are?

Let me make the following proposal: instead of redefining ==, how about if you redefine isequal(.,.) so that it is valid only when its operands are the same type (of course extensible by the programmer if necessary). Then you redefine 'in' to apply isequal instead of ==. Finally, you make it clear in the documentation that isequal(.,.) is specifically intended for use with containers (the documentation already says that it is useful for sorting). Furthermore, containers that use isequal might impose additional restrictions on keys. For example, Dict() and Set() impose the restriction that isequal(a,b) implies isequal(hash(a),hash(b)). For the containers I am developing in my current project (OrderedDict, MultiMap and OrderedSet), the restriction is that isequal(x,y) is true if and only if isless(x,y) and isless(y,x) are both false (i.e., isless defines a total order on the keys).

Isn't == defined in terms of isequal()? Forcing same type means you lose the ability to compare numbers of differing types but the same numeric values as equal.

isequal is specifically for hashing, which has to support cross-type comparisons, so that's a nonstarter.

We have actually put a lot of time and thought into equality, and we've tried a couple variants. Having too many different kinds of equality (common lisp has at least 5) gets difficult to manage. Stefan developed a clever hash function that's able to efficiently hash equal numbers equally, even if they're of different types. Since then we've enjoyed the luxury of == and isequal being identical except for NaN and -0.0. isequal used to have various different behaviors, but at this point it's a bit difficult to remember what they were, since life is so much simpler now.

Steven,

It would actually be pretty trivial to implement what you're after if you'd
like it for you own code at least. I admit that though at first realizing
different types could be equal was a little jarring, it's made a difference
surprisingly little in actual code I've written. My two cents.

typesequal{T}(x::T, y::T) = x == y
typesequal(x, y) = error("Types of $x and $y don't match")

-Jacob

On Sat, Aug 9, 2014 at 10:00 PM, Jeff Bezanson [email protected]
wrote:

We have actually put a lot of time and thought into equality, and we've
tried a couple variants. Having too many different kinds of equality
(common lisp has at least 5) gets difficult to manage. Stefan developed a
clever hash function that's able to efficiently hash equal numbers equally,
even if they're of different types. Since then we've enjoyed the luxury of
== and isequal being identical except for NaN and -0.0. isequal used to
have various different behaviors, but at this point it's a bit difficult to
remember what they were, since life is so much simpler now.


Reply to this email directly or view it on GitHub
https://github.com/JuliaLang/julia/issues/7903#issuecomment-51703834.

Jacob,

I understand that I could implement this myself, but that is not the point I'm trying to make. I'm trying to say that Julia is flawed if the statement in(x,3) does not cause an error message, where x is an IntSet, because it is failing its mission to catch programmer blunders. (See my earlier posting on this thread.)

I am working on a Julia project and would like to 'buy in' to Julia, but obviously I am concerned that it might not catch on, especially at the level of university instruction, if it has flaws like this.

Furthermore, I think that this conversation reveals a problem with Set and in(): they are trying to solve too many problems at once. The application of Set mentioned by Constantine in an earlier message in this thread is more likely to appear in a scientific code than an application in which Set holds objects of varying types. Therefore, either Set should be redesigned to be more appropriate for Constantine's example, or else it should be split into two separate container types, say ArbitrarySet and TypedSet.

Blowing this one relatively minor issue up to the success or failure of the entire project is a bit hyperbolic. But yes, it would be nice to catch common usage mistakes better here – that's why I opened the issue to discuss it. Making isequal error on different types is not good, but I think that throwing an error for x in C unless isa(x, eltype(C)) might be acceptable. Unfortunately, we can't rely on all possible iterable things having an eltype method, so you can't just do that. You can do something like what collect does where you check applicability of eltype first and only do the test if it's applicable. I know Jeff hates that kind of thing though.

Checking isa(x, eltype(C)) is a decent idea. A while ago we added a fallback for eltype so it is always at least Any. It's difficult to express just how in should be restricted, if at all, so this suggestion has merit.

However, in my view it's always at least valid to _ask_ whether an item is in some collection. It shouldn't be ok to ask 1 in {1.0,2.0}, but not 1 in Float64[1.0,2.0].

Python doesn't even have typed containers (except in numpy), and it has certainly caught on for university teaching. It deals with x in 3 by making integers not iterable, which is a reasonable choice.

What happened in julia is that integers can be used as indexes, of course, and we describe indexing as iterating over all of the indexes. Therefore integers became iterable. The rest is a natural consequence. If I had a value n that I knew was being used as an index, and I wanted to know whether it would touch position i, I could ask i in n. In that case i in 3 makes sense. So these things are not necessarily random mistakes, but tend to be consequences of certain very general ideas.

Therefore any realistic "fix" for these things needs to go back and revise the underlying ideas. For example, we could entertain a proposal where numbers are not iterable, and indexing works on some other principle. Maybe scalars should be promoted to 0-d arrays before being used as indexes, or maybe we iterate over not an index itself, but over an iterator returned by toindex(x), or something of that ilk. These approaches have so far struck me as more complex, and possibly less efficient, than just making numbers iterable. But I'm open to suggestions.

Stefan,

Yes, I agree, I exaggerated the significance of one issue. But it is not an exaggeration to say that preventing programmer blunders is a more important goal for a programming language than maximizing generality.

Second, I would like to point out that the following test appears to indicate that Julia is following the isequal rule rather than the == rule for set membership:

julia> s = Set([5.0, -0.0])
Set{Float64}({-0.0,5.0})
julia> in(5.0,s)
true
julia> in(0.0,s)
false
julia> 0.0 == -0.0
true
julia> isequal(-0.0,0.0)
false

Finally, let me put forward yet another proposal for fixing this problem: make an additional, more complicated constructor for Set that takes a function, like this:

Set(my_isequal, [ <initial entries> ])

or perhaps even two functions:

Set(my_isequal, my_hash, [<initial entries>])

C++ lets you do something analogous with containers. Then IntSet can behave like a Set where my_isequal is specified and would accept only Int arguments.

The in function can use the my_isequal associated with the container that is its second argument.

[edit formatting: @StefanKarpinski]

Sets and Dicts with custom comparison and hash functions would be a fine feature to have.

Sets and Dicts with custom comparison and hash functions would be a fine feature to have.

I disagree. Getting the comparison and hash functions right is incredibly hard. If someone wants to do some different kind of comparison, the way to do it is to transform the keys explicitly before hashing or storing them and then use the normal comparison and hashing. This is simpler, more explicit, and doesn't fail in subtle and confusing ways, which is what would happen with custom comparison and hashing functions.

I agree that transforming keys is better than using a custom comparison function and I would tend to steer people towards that, but I'm not totally against the custom comparison approach. Maybe it could go in a package, along with things like OrderedDict.

A subtler check for the element type is similar to what we do to check if a key is value for a typed Dict – check that isequal(x, convert(eltype(C), x)). This catches both the IntSet([3,5]) in 3 and the "abc" in 19 cases. It doesn't help with characters being considered integers, but that's a different issue. We'd want to catch no method errors for the conversion and render them as type mismatches.

Using the same check that Dict uses is a good thought. However Dicts only use this on assignment, not on lookup.

Maybe a more surgical approach is needed: only allow the second argument to in to be a Number if the first argument is too. We might also want to allow only in(::Char,::String) for strings.

Taken a bit further, this could argue for removing the fully-generic in, or adding a Collection type and having the generic in require it for the second argument.

Dicts could also throw an error if you use a nonsensical key – i.e. do that same check on lookup.

I think it still might be too strict. It's also too container-type-dependent: a Dict with only Int keys should be representable as a Dict{Int,_} without causing problems, at least in the absence of mutation.

In an earlier message in this thread, Stefan wrote that it is 'incredibly hard' to write comparison and hash functions. At least for the common cases, it seems relatively straightforward to me. In the case of IntSet, the default comparison and hash work fine except with their arguments limited to Int. In the case of Edge in Constantine's earlier message, it also seems straightforward; Constantine already showed us the isequal method in his posting, and the hash method would either hash (a,b) or (b,a), depending on whether a<=b, using the built-in hash function.

It wouldn't be incredible if it was easy to believe how hard it is ;-)

It's fairly straight forward if you're comparing values within a given type – any deterministic function can serve as a hash; as long as your equality relation is compatible with that hash, things are fine. It's when you introduce cross-type comparisons that life gets complicated. Consider writing a cross-type == for Ints and Float64s that's actually an equivalence relation. Hint: converting to Float64 and then checking equality is not transitive. Given a cross-type equivalence relation, how do you compute hash values that are consistent with that relation? This requires a type-agnostic (i.e. strictly value-based) hashing algorithm – but it also needs to be reasonably fast. Maybe these concerns are too esoteric for user-defined hashing and comparisons, but if they aren't handled, the functions will be subtly broken.

Stefan,

Indeed, it is impressive that you figured out how to implement a set that can contain elements of every possible Julia object and still have fast look-up with hashing. So I guess I retreat to my earlier proposal that there should be two set types, one concrete type like the current Set that can contain anything, and another type of set (an abstract type) whose contents are restricted according to type. This latter type of set is important for catching errors with 'in' and solving Constantine's problem. IntSet should be a concrete implementation of the latter type.

With regard to your earlier suggestion about checking for errors via the 'eltype' function, would this check take place at compile time, or at the time that 'in' is executed? It seems like the former is preferable for performance reasons.

Thanks, @StephenVavasis. The eltype check approach would semantically occur at runtime, but the check would be eliminated a lot of the time by type inference. For example, if x::eltype(C) and type inference can prove it, then the convert(eltype(C), x) is a no-op. It's harder to eliminate the isequal check, but I suspect that when the conversion is a no-op, the whole check will disappear. In these simple cases, the check is completely eliminated:

julia> ck(x,C) = isequal(x, convert(eltype(C), x))
ck (generic function with 1 method)

julia> @code_llvm ck(1, [1,2,3])

define i1 @"julia_ck;22392"(i64, %jl_value_t*) {
top:
  ret i1 true, !dbg !8590
}

julia> @code_llvm ck(1.5, [2.5, 3.5])

define i1 @"julia_ck;22405"(double, %jl_value_t*) {
top:
  ret i1 true, !dbg !8629
}

julia> @code_llvm ck('x', "abc")

define i1 @"julia_ck;22399"(i32, %jl_value_t*) {
top:
  ret i1 true, !dbg !8611
}

I guess to evaluate this, we'd need to try removing start, next, and done for numbers and see what breaks.

@mbauman, have you merged the changes you had that make array indexing not rely on this?

Yes, array indexing no longer requires iterating over numbers — but it does index into them.

I think there's another issue somewhere where I started looking into deprecating this… but quickly hit lots of comprehensions and loops like [x[i,j] for i in 1, j in 1:size(x, 2)]. I think it's fairly easy to do now, it'd just take some janitorial work.

Edit: https://github.com/JuliaLang/julia/pull/10331#issuecomment-77630144

I only found a handful of examples of that:

base/abstractarray.jl
611:hcat{T}(X::T...)         = T[ X[j] for i=1, j=1:length(X) ]
612:hcat{T<:Number}(X::T...) = T[ X[j] for i=1, j=1:length(X) ]

base/arraymath.jl
380:transpose(x::AbstractVector) = [ transpose(v) for i=1, v in x ]
381:ctranspose{T}(x::AbstractVector{T}) = T[ ctranspose(v) for i=1, v in x ] #Fixme comprehension

base/range.jl
372:ctranspose(r::Range) = [x for _=1, x=r]

After some discussion on a triage call, it's been decided that we should take a crack at this, but everyone has too many things on their plate to do it right now. So if someone wants to do this, it's fairly straightforward: delete all the iterable interface methods for numbers and see what breaks; try fixing stuff that breaks and see how bad it is. Make a WIP PR to report progress and get feedback.

As I noted in #15032, I think broadcast still needs to work for scalar arguments (since it _has_ to work for mixed scalar/array arguments anyway), but that can be preserved independent of whether scalars are iterable.

And size(x::Number) should still return ().

Made a quick attempt to see what happened. Concatenation of some array literals has issues when you try to remove start(::Number), like here https://github.com/JuliaLang/julia/blob/e0e93fc85e899105351860172d37b18f51d29183/base/base64.jl#L35 or the following line from uv_constants.jl

const uv_handle_types = [:UV_ASYNC, :UV_CHECK, :UV_FS_EVENT, :UV_FS_POLL, :UV_HANDLE, :UV_IDLE, :UV_NAMED_PIPE, :UV_POLL, :UV_PREPARE, :UV_PROCESS, :UV_STREAM, :UV_TCP, :UV_TIMER, :UV_TTY, :UV_UDP, :UV_SIGNAL, :UV_FILE]
# ...
    handles = [:UV_UNKNOWN_HANDLE; uv_handle_types; :UV_HANDLE_TYPE_MAX; :UV_RAW_FD; :UV_RAW_HANDLE]

edit: whoops, just needed

--- a/base/abstractarray.jl
+++ b/base/abstractarray.jl
@@ -169,6 +169,11 @@ function copy!(dest::AbstractArray, src)
     return dest
 end

+function copy!(dest::AbstractArray, src::Number)
+    dest[1] = src
+    return dest
+end
+
 # if src is not an AbstractArray, moving to the offset might be O(n)
 function copy!(dest::AbstractArray, doffs::Integer, src)
     doffs < 1 && throw(BoundsError(dest, doffs))

Why would this affect arrays of symbols?

Probably nothing specific to the symbol eltype, those are just needed during bootstrap. vcat needs length and copy! on its arguments https://github.com/JuliaLang/julia/blob/e0e93fc85e899105351860172d37b18f51d29183/base/abstractarray.jl#L163-L170, see my edit above for a new method that gets past that

I'm still confused; Symbols don't have length or iteration, so how could they have worked with vcat if that's the case?

Found it. collect(1) here: https://github.com/JuliaLang/julia/blob/e0e93fc85e899105351860172d37b18f51d29183/base/abstractarray.jl#L706

Another issue is https://github.com/JuliaLang/julia/blob/e0e93fc85e899105351860172d37b18f51d29183/base/strings/util.jl#L132 calling first on an integer result from search, I guess we could leave first and last alone, I was trying to get greedy.

Defining collect(x::Number) = [x] can also get through bootstrap. See https://gist.github.com/5cc35991da822df921d4743b9799eadf for a diff that bootstraps and a log of all the test failures. Turns out to be pretty deep in a lot of places.

Some of those are easy to fix. I spot:

  • There's a copy method in lu.jl that calls copy on a struct member that is literally always a BlasInt
  • A handful of uses of for i=1 in comprehensions
  • The call to copy in power_by_squaring that I have famously railed against in past threads on aliasing

Since it seems this is going to be non-trivial to remove, and it's not blocking anyone from using Julia, I'm going to reassign this for v0.6 so we can focus on real issues.

I'm fine with that.

To give an opposite standpoint I think that if a number has size(), it should also be iterable and possible to collect a number but I guess that I am too late to the party.

The need to support scalars in broadcast makes me second-guess the whole "numbers should not be iterable" argument.

I don't think broadcast actually uses iteration. I don't see any broadcast problems in the log Tony posted. It almost certainly uses size and indexing, though.

It's a really fine line to walk. It definitely feels right to remove collection-like behavior from numbers, but we also often want numbers to behave like Array{T, 0} — which is a collection.

Would it make sense to have something like an immutable ImmutableSingleton{T} <:AbstractArray{T,0} x::T end (with all the necessary methods implemented) around? Everywhere a number should behave like Array{T, 0} (except being immutable) it could then be wrapped with IIUC zero runtime overhead.

@martinholters, then to do atan2.(x,0.3) you would need atan.(x, ImmutableSingleton(0.3))? This seems crazy.

The whole point of having numbers act like Array{T,0} is to enable generic code. If you force people to explicitly convert to another type, you lose that benefit.

I think the point is that the broadcast implementation would wrap the singletons for you. Not sure how helpful it would be, but that's how I interpreted @martinholters' suggestion.

@StefanKarpinski I was just trying phrase exactly that, but you were faster.

The point would be that number would not be iterable by default, and functions that benefit from iterable numbers have to take care of that themselves, instead of the opposite. I'm doubtful about the usefulness myself, I just wanted to point out the option.

One part of this seems to be easier to settle: to consider removing both Array{T,0}-like and collection-like properties of Char separately. Following @mbauman that one is _not_ a fine line to walk, because anyway Char is not a Number type anymore and, well, why should Chars behave like Array{T,0}s at all?

(Note that broadcast no longer requires size etc. to work for numbers.)

As I wrote on the mailing list, I suspect that a lot of the need for iterable/indexable numbers should be gone now with 0.5's dot-call syntax. In the cases where you would previously have written a generic vector/scalar function, you should now just write the scalar function f(x), and then apply it to arrays A with f.(A). This is not only easier, it is also faster because it can fuse with other elementwise operations and the result can be assigned in-place with .=.

It's instructive to try to patch Base to make numbers non-iterable. I'm finding various cases where removing iterability requires much uglier code. For example:

  • In split, it calls r = search(string, splitter). If splitter is a string, this returns a range, but if splitter is a char, it returns an integer. Being able to call first(r) and last(r) in both cases makes the same code work for both.

  • In the code generation for multidimensional array indexing, it calls _nloops to generate nested loops over the indices in expressions like a[i, 3:4]. By being able to do for j in i, the same generated code can handle i::Int and i::AbstractVector{Int}. (On the other hand, this may be suboptimal, since it doesn't look at first glance like LLVM can eliminate the loop in the Int case.)

  • In the FFT code, you can pass any iterable of dimensions to be transformed. This allows you to pass a single dimension (integer) and have it be handled with the same code.

On the other hand, making numbers non-indexable (removing size, getindex, etcetera), seems much less disruptive ... it looks like almost no changes are required in Base.

The converse argument: if it is so useful to make numbers iterable, maybe everything should be iterable? i.e. just define fallback start etc. methods for Any.

In split, it calls r = search(string, splitter). If splitter is a string, this returns a range, but if splitter is a char, it returns an integer. Being able to call first(r) and last(r) in both cases makes the same code work for both.

This would be fixed by https://github.com/JuliaLang/julia/issues/10593 (see this Julep): you'd call findseq or searchseq, which would return an index range for both string and char arguments. The previous behavior returning a single index when passing a char would be obtained via findeq/searcheq (which wouldn't work to find a substring). So one less reason not to do this!

And note that we could use #19730 to wrap all numbers in a specialized AbstractArray{T,0} before they're used in non-scalar indexing within to_indices. That'd give them iteration, indexing, and shape without much hand-wringing… and that could actually remove a few methods. I'm not sure if there'd be a performance impact, however, and I'd like to keep that patch as conservative as possible for now. It's already pretty big.

I increasingly think we're not going to do this. We could make a lint warning that pesters you if you write for i = x where x is not a range expression. That would catch the cases where someone meant to write for i = 1:n and accidentally wrote for i = n instead. We could even go so far as to make that a syntax error, but that seems too draconian.

write for i = 1:n and accidentally wrote for i = n instead

I often get the phenomena many times. Since it does not raise error e.g. syntax error, it is hard to find
for i = n should have wrote for i in 1:n 🐛

Yes, that was one of the motivations cited in this issue when it was opened.

I've just posted my question at Julia discourse (that is why i mentioned a comment at this issue).

https://discourse.julialang.org/t/question-why-for-n-in-10-show-n-end-is-valid-rather-than-getting-error/33895

It was not clear for me why number e.g. 10 is iterable.
But, by reading a discussion here https://github.com/JuliaLang/julia/pull/19700#issue-99274797. I've found it is difficult choice to decide.

Was the conclusion that this definitely isn't happening even for a Julia 2.0? I've seen several complaints/confusions about it in various places over the past few months.

It's gotta be rather convincing. We tried removing both iterability and indexability pre-1.0, but:

Removing iterability: I'm finding various cases where removing iterability requires much uglier code.

and

Removing indexability: Con: eliminating this functionality doesn't actually save us much code in Base, and it might make some kinds of generic functions more annoying to write, especially since numbers are still iterable. Is it worth it?

Some of these things have indeed changed, so it's certainly possible that the balance has shifted... but has it shifted enough? I'd bet not. It's quite a bit of churn.

I'm generally a supporter of iterability of numbers, but I have seen people get bit by it. To play devil's advocate, would it be so bad to change for d in dims to for d in iterable(dims) with

iterable(x) = x
iterable(x::Number) = (x,)

?

... or just toss whatever wrapper we end up using for broadcasting on a number and iterate that

Right, that's what makes this different — that iterable function is essentially a narrower form of broadcastable. We now have an entire architecture built up for this sort of thing.

My experience in trying to implement even a small piece of this pre-1.0 (#7903) leads me to believe that changing this would lead to a huge amount of code churn over the whole ecosystem. i.e. it wouldn't be worth it without huge benefits, which I haven't seen anyone articulate beyond "slightly confusing to some newcomers".

Was this page helpful?
0 / 5 - 0 ratings