Once #4883 is fixed, this would enable things like:
```julia
x = ["1", "2"]
y = parse.(Int64, x)
z = convert.(Float64, y)
```
These would be super handy when one is e.g. converting types of a DataFrame etc.
x-ref https://groups.google.com/forum/#!topic/julia-users/BM1QFPoG_24
I don't like this behaviour at all. This will be terribly incompatible with `map`, which doesn't do this. Currently we throw an error, which is at least explicit. Think of the behaviour on collection-like scalars, such as `String` or `Tuple`. It would be horrible for `length.(([1, 2], [3, 4], [5, 6]))` to give `3`, or even worse, `[3]`!
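For comparison, here is a minimal sketch of how these calls behave under Julia 1.x semantics. That is an assumption on my part: those releases postdate this thread, and there a tuple argument is broadcast over as a collection rather than promoted to a scalar.

```julia
# Assumes Julia 1.x semantics, which postdate this discussion.
vs = ([1, 2], [3, 4], [5, 6])

# map iterates the tuple and applies length to each vector.
mapped = map(length, vs)

# Broadcasting also treats the tuple as a collection, so the results agree.
dotted = length.(vs)

# Calling length without the dot measures the tuple itself.
whole = length(vs)
```

Under that assumption the dotted call and `map` agree, and only the undotted call reports the length of the tuple.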
Why not something like `cell(x) = setindex!(Array(typeof(x)), x)` so we can explicitly do `broadcast(cell(Int64), x)`? `cell` is deprecated now so it can return by 0.6. It could be made into a type that behaves like the zero-dimensional array but is much faster.
I don't like using the `cell` name for that. Doesn't `Ref` already do that?
But `Ref` does not behave like an array type, and I'm not sure it's a good idea to overload its meaning that way.
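A rough sketch of the two wrappers being compared, assuming Julia 1.x (which postdates this thread). The `cell` helper here is hypothetical: it is built on `fill(x)`, which returns a zero-dimensional array, and is not an existing Base function.

```julia
# Assumes Julia 1.x. `cell` is a hypothetical helper named after the suggestion above.
cell(x) = fill(x)            # zero-dimensional Array holding x

haystack = [1, 2, 3]
needles = [1, 4]

# Both wrappers make broadcast see `haystack` as a single value.
via_cell = in.(needles, cell(haystack))
via_ref = in.(needles, Ref(haystack))

# The zero-dimensional array is an AbstractArray; a Ref is not.
cell_ndims = ndims(cell(haystack))
ref_is_array = Ref(haystack) isa AbstractArray
```

Both forms give the same broadcast result, and the difference raised above shows up only in the last line: a `Ref` is not an `AbstractArray`.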
@TotalVerb `map` doesn't broadcast, so it's logical that it shouldn't work. I don't see the problem. And what's the relation with `length.(([1, 2], [3, 4], [5, 6]))`?
Isn't it better for that to throw an error than to return 3?
Why would it return `3`?
Because `Tuple` is not an `AbstractArray`, so it is "scalar-like" and will be auto-promoted to an array of tuples. Otherwise we would have to special-case `Tuple` as "vector-like". The same goes for `String`, or `Set`, or `Dict`, etc.
But I don't think anybody argued that tuples should be treated as scalars.
Then what should be treated as a scalar? This is not clear cut. What is a scalar sometimes is not a scalar other times.
For instance, are strings scalars? What should `broadcast(*, ["Hello", "Goodbye"], " World!")` produce? Or are strings vectors? What about `broadcast(+, 1, "Hello")`?
Are sets scalars? What of `broadcast(union, Set([1, 2, 3]), [Set([4]), Set([5])])`? How about `broadcast(x -> x^2, Set([1, 2, 3]))`?
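As a point of reference, this sketch shows how some of these calls come out under Julia 1.x semantics. That is an assumption: those semantics were settled after this discussion. Strings are treated there as single values, and the set is wrapped in `Ref` to mark it explicitly as a single value.

```julia
# Assumes Julia 1.x semantics, which postdate this discussion.

# A string argument is treated as a single value, not as a vector of characters.
greetings = broadcast(*, ["Hello", "Goodbye"], " World!")

# Integer plus string has no method, so this throws instead of iterating the string.
plus_fails = try
    broadcast(+, 1, "Hello")
    false
catch err
    err isa MethodError
end

# Wrapping a set in Ref marks it explicitly as a single value.
unions = broadcast(union, Ref(Set([1, 2, 3])), [Set([4]), Set([5])])
```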
My point is that whether something is a scalar or not is not a property of the type. Any division of types into "scalar" and "tensor" is going to be inaccurate in some cases. In `x + y` both `x` and `y` are scalars, even if they are of type `Vector`. In `x .+ y`, they are both tensors, even if they are of type `Number` (since `Number` implements the tensor interface). And that is despite these being the same operation.
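A small sketch of that point, using only Base functions: a `Number` answers the same shape queries as a zero-dimensional container, and the same pair of vectors can be combined either as whole values or element by element.

```julia
# A Number answers the shape queries of a zero-dimensional container.
n = 5
number_shape = (ndims(n), size(n), length(n), n[1])

x = [1, 2]
y = [3, 4]

# `+` adds the two vectors as whole values; `.+` adds them element by element.
whole_sum = x + y
elementwise_sum = x .+ y
```

For vectors the two results coincide, which is what makes the scalar and tensor readings hard to tell apart from the type alone.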
Unfortunately, we don't have traits yet, but there's a section of the manual about collections, and everything listed there shouldn't be treated as a scalar: basically, everything that supports the iteration protocol or `getindex`. That clearly includes sets, tuples, dictionaries, and strings. Strings are a weird case since I'd often like to treat them as scalars, but I think for consistency we need to consider them as collections everywhere.
There's a pretty solid argument to be made for making strings scalar and requiring `chars(str)` to iterate the characters of a string (similarly `graphemes(str)`, etc.).
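A sketch of what that explicit opt-in might look like, assuming Julia 1.x. The `chars` function is hypothetical (defined here as a thin wrapper over `collect`), while `graphemes` is taken from the `Unicode` standard library.

```julia
using Unicode   # standard library; provides graphemes

# `chars` is hypothetical: it stands in for the explicit iterator proposed above.
chars(str) = collect(str)

s = "e\u0301a"   # 'e', a combining acute accent, then 'a'

# Characters and graphemes are different ways of splitting the same string.
char_count = length(chars(s))
grapheme_count = length(collect(graphemes(s)))

# Character-wise broadcasting happens only when requested explicitly.
upper = uppercase.(chars("abc"))
```

The string above has three characters but two graphemes, one reason for making the caller say which unit of iteration is meant.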
Could someone triage this? I.e. assign some labels and maybe a milestone to this? #16986 seems to solve this, but neither one is assigned any label/milestone, so probably both are being missed in any release planning process.
What label do you propose? I don't think this is release-blocking.
Definitely not release blocking, I'd say.
The current system of labels and milestones used in the repo is a bit of a mystery to me, so it is hard to tell what this should be assigned. But having nothing assigned seems less than ideal for something that is actively being worked on and therefore might or might not go into a 0.5 release.
It would be nice to get this resolved for #16285, because in Julia 0.6 we were planning to make `a .+ b` etcetera be synonyms for `(+).(a,b)` and hence be a fusing broadcast operation. When I tried to implement this, one of the first problems that arose was the `.==` operator, because there is existing code that uses e.g. `symbolarray .== symbol`.
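A minimal sketch of the two patterns mentioned here, assuming Julia 1.x (where a `Symbol` argument is treated as a single value and dotted operators are broadcast calls). The variable names are illustrative only.

```julia
# Assumes Julia 1.x semantics, which postdate this discussion.
symbolarray = [:a, :b, :a]

# The lone Symbol is compared against every element of the array.
matches = symbolarray .== :a

a = [1, 2, 3]
b = [10, 20, 30]

# The infix dotted operator and the dotted call syntax are the same operation.
infix = a .+ b
call = (+).(a, b)
```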
My first inclination is that `broadcast` should default to treating things as scalars, and only for certain exceptions like `AbstractArray` and perhaps `Tuple` should it iterate.
I don't think that we necessarily want `map` and `broadcast` to be consistent here. (If their behavior is identical, why even have both functions?) `broadcast` should be for "iterables with shapes" and should default to treating arguments as scalars, and `map` should be for "iterables without shapes" and should default to treating arguments as iterables.
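The distinction can be sketched as follows, assuming Julia 1.x behavior (which postdates this thread): the same string argument is one value to `broadcast` but a sequence of characters to `map`, and only `broadcast` combines arguments by shape.

```julia
# Assumes Julia 1.x semantics, which postdate this discussion.

# broadcast: the string counts as one value and is paired with every element.
by_broadcast = string.("ab", [1, 2])

# map: the string is iterated character by character alongside the vector.
by_map = map(string, "ab", [1, 2])

# broadcast uses shapes: a column and a row expand into a matrix.
outer = [1, 2, 3] .* [10 20]
```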
That seems like a solid distinction to me and well expressed.
Would it be crazy to propose an alternative to `.` as an operator for map?
I quite like using a binary operator for map in my code (usually a variant of the compose circle). I don't think we need more dot-like syntax.
I'm just thinking that slightly expanding this broadcasting idea to a few other operators could lead to some really slick notation, not just for mapping but also for doing embarrassingly parallel/multithreaded operations if combined with some ideas from JuliaLang/julia#17887. We're running out of characters that would look nice, but a version of `A.*B.+sin.(C)` which automatically fuses the operations and applies a parallel map would be the sickest feature ever. And that would all be possible if there was just a way to specify it being `map`, `pmap`, `pbroadcast`, etc. (Or a macro could be used to specify the "apply" function that the `.` operator uses. That could be a cheap fix that would be easy to implement in a package.)
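To illustrate the idea, here is a sketch (assuming Julia 1.x) in which the dotted expression is written out as a single kernel function, and the "apply" function is then chosen by hand. The `kernel` name is hypothetical. A parallel variant such as `pmap` from the `Distributed` standard library could be passed the same kernel, but that is not shown here.

```julia
# Assumes Julia 1.x, where the dotted expression is fused into one broadcast.
A = [1.0, 2.0, 3.0]
B = [4.0, 5.0, 6.0]
C = [0.0, pi / 2, pi]

fused = A .* B .+ sin.(C)

# The same computation as an explicit kernel (hypothetical name), with the
# apply function passed by hand instead of being implied by the dots.
kernel(a, b, c) = a * b + sin(c)
via_broadcast = broadcast(kernel, A, B, C)
via_map = map(kernel, A, B, C)
```

For equally sized vectors like these, both apply functions give the same values as the dotted form. They would differ once shapes have to be expanded.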
@ChrisRackauckas, I agree; see the discussion at the end of #16285. The trick is getting the details right.
Closed by #16986 (@pabloferz), I believe.