I just did a quick search on github of for other issues asking for this feature. I didn't find any, so I'm filing this issue.
Given we have mean(f, x)
, count(f, x)
etc. It would be nice if sort(f, x)
also worked.
Currently I can use sort(x, by = f)
. But for convenience and consistency this method would be nice.
The counterpoint to this is that both by
and lt
are possible function arguments, but I realized that we can actually look at the function that's passed in and see if it has two or one argument methods and determine whether it's a transformation or a comparison function based on that. So 👍 would be a nice syntax.
I would say that with a docstring + the existing convention of mean(f, x)
etc. Having the first argument just refer to by
would probably be fine. But your approach also makes sense
The counterpoint to this is that both
by
andlt
are possible function arguments, but I realized that we can actually look at the function that's passed in and see if it has two or one argument methods and determine whether it's a transformation or a comparison function based on that. So 👍 would be a nice syntax.
I think a problem with this is that functions can have methods for both single and two arguments.
Is the proposal to make this equal to sort(x, by=f)
, or to sort(map(f, x))
? I would have guessed the latter, since I thought the rule was that outer(f, data)
is always a shortcut for outer(f.(data))
. This is true for mean
/maximum
/unique
, which otherwise return something the same eltype as the input, and for all
/count
/findall
which don't.
Edit: The issue I was thinking of is #27613. The new method there findmax(f, domain) -> (f(x), x)
doesn't precisely follow the broadcast rule, since the one-argument findmax(f.(domain)) -> (f(x), i)
regards its domain as being the indices. But it does apply f
to the data in what's returned, rather than (say) using it to decide the largest, and then discarding what f(x)
. That's the behaviour I'd expect from findmax(domain; by=f) -> (x, i)
.
but I realized that we can actually look at the function that's passed in and see if it has two or one argument methods and determine whether it's a transformation or a comparison function based on that.
:-1: to that. That type of function reflection should not be used for a public API imo. Just trying to document this becomes kind of awkward:
If the function only has methods that accept one argument then it means this thing, if the function only has methods that accept two arguments, then it means this completely other thing, if the function has methods that accept a mixed number of arguments, then it errors...
Is the proposal to make this equal to
sort(x, by=f)
, or tosort(map(f, x))
? I would have guessed the latter, since I thought the rule was thatouter(f, data)
is always a shortcut forouter(f.(data))
. This is true formean
/maximum
/unique
, which otherwise return something the same eltype as the input, and forall
/count
/findall
which don't.
I'm proposing to make it equal to sort(x, by = f)
. I get what you mean about sort(map(f, x))
though. That's a valid point.
Turns out I'm wrong about how unique
acts, and surprised. (It's not new, added in 2015.) What is the rule here?
julia> rn = rand(Complex{Int8}, 5);
julia> sort(rn, by=real) # very clear
5-element Vector{Complex{Int8}}:
-110 + 5im
42 - 46im
58 + 62im
65 + 31im
110 - 124im
julia> unique(real, rn) # not what I expected
5-element Vector{Complex{Int8}}:
65 + 31im
-110 + 5im
42 - 46im
58 + 62im
110 - 124im
julia> Base.sort(f, itr) = sort!(collect(f(x) for x in itr)) # what I expected
julia> sort(real, rn)
5-element Vector{Int8}:
-110
42
58
65
110
Most helpful comment
:-1: to that. That type of function reflection should not be used for a public API imo. Just trying to document this becomes kind of awkward: