`findmin` and `findmax` return a tuple with the index of the minimum/maximum and the corresponding value. This behavior is kind of inconsistent with `find`, `findfirst`, `findlast`, `findnext` and `findprev`, which return indices without values. On the other hand, `indmin` and `indmax` return only the indices.
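To make the difference in return shapes concrete, here is a small sketch. It assumes Julia 1.x, where `indmin`/`indmax` are spelled `argmin`/`argmax` (see the rename later in this thread); on 0.6 the names differ.

```julia
v = [3, 1, 2]

# findmin/findmax return a (value, index) tuple
val, idx = findmin(v)             # val == 1, idx == 2

# the rest of the find* family returns indices only
first_idx = findfirst(==(1), v)   # 2

# argmin (formerly indmin) also returns only the index
min_idx = argmin(v)               # 2
```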
It would be more consistent to rename `indmin`/`indmax` to `findmin`/`findmax`. Of course this is not possible in a single deprecation cycle. I can see two solutions:
- Deprecate `findmin` and `findmax` in 0.7, and rename `indmin` and `indmax` to `findmin` and `findmax` in 1.0. This has the drawback of breaking things between 0.7 and 1.0.
- Rename `indmin` and `indmax` to `findmin` and `findmax` in 0.7. This introduces breaking changes in 0.7, but at least there wouldn't be any changes in 1.0.

Finally, it's not clear to me whether we really need the current behavior of `findmin`/`findmax`, compared with getting the index and then extracting the corresponding element manually. The performance gain should only be significant for very small collections AFAICT.
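For ordinary arrays the manual alternative is a two-step lookup. A minimal sketch, again assuming Julia 1.x names (`argmin` for the former `indmin`):

```julia
v = [4.0, 0.5, 2.5]

# one call: value and index together
val, idx = findmin(v)

# two steps: index first, then an explicit lookup
i = argmin(v)
manual = (v[i], i)   # same (value, index) pair as findmin(v)
```

Whether the extra lookup matters depends on how expensive indexing into the collection is.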
Isn't this basically what is usually called argmin and argmax (although those are not great names)?
> Isn't this basically what is usually called argmin and argmax (although those are not great names)?
And indeed these were the first proposed names for these functions: https://github.com/JuliaLang/julia/pull/1655.
EDIT: see also discussion at https://github.com/JuliaLang/julia/pull/7327.
I still like `argmin` and `argmax` personally.
> I still like `argmin` and `argmax` personally.
For the method returning just the index (current `indmin` and `indmax`), or for that returning both the index and the value (current `findmin` and `findmax`)? What name do you suggest for the other pair of methods?
Do those need to be separate methods? Is there a cost to always just returning both the index and the value?
As I noted in the OP, there's a "consistency cost" (if that's a thing) since `find`, `findfirst`, `findnext`, `findlast` and `findprev` all return a single index. That's not the end of the world, but IMHO it's better to use similar names for functions behaving similarly to make the language more predictable.
I think we may have to hammer this one out with @JeffBezanson when he gets back as part of https://github.com/JuliaLang/julia/issues/10593. I hate to break the deadline, but I think this needs his input. It will be easier to work through after we've gotten all the other stuff out of the way.
I'm ok with renaming `indmin` to `findmin`, but I would really like to keep the functionality of returning both a minimum and its index. There are plenty of structures where an extra lookup could be expensive (e.g. a database). We also have `findmin!` and N-d versions, so the `findmin` functions actually include quite a bit of functionality we don't want to drop.
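As an illustration of the N-d functionality mentioned here, this sketch assumes the Julia 1.x keyword form `findmin(A; dims=...)`; older releases spelled this differently. It reduces along one dimension and returns values and indices together:

```julia
A = [1 2; 3 5]

# minimum of each column, together with the position of each minimum
vals, inds = findmin(A; dims=1)

# vals is a 1×2 matrix of column minima;
# inds is a 1×2 matrix of CartesianIndex values pointing back into A
A[inds]   # indexing with inds recovers vals
```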
Maybe we could call it `minandind`? That's a bit hard to read but I don't have a better idea yet.
I also think this is low-urgency and doesn't need to be done for 1.0.
FWIW, we have a series of functions with "and" in StatsBase: `mean_and_std`, `mean_and_var` and `mean_and_cov` (not sure about the underscores). Though they are a bit different since they compute two quantities rather than returning the index.
We could also find a different term to indicate the action of returning both the value and the index: `fetch`, `pick`, `take`, `select`?
Or `minpair`, to borrow terminology from the usage of `Pair`s elsewhere.
> but I would really like to keep the functionality of returning both a minimum and its index.

I'd prefer that we generally move in that direction and return the value in the `find` family of functions. Returning the value together with the index should be really cheap in general.
That's a possibility, but that's going to be much more breaking than what I already have as part of #10593. That would also introduce an inconsistency with `find` (for which returning an array of values would be quite costly), and an inconvenience in the relatively frequent case where you don't need the value (in particular when using the `equalto` predicate, for which it's kind of ridiculous to return the value).
I think it would be simpler to have the `find*` functions return only indices as they do now (except for `findmin`/`findmax`), and later introduce new functions which would return both the index and the value. We could then reorganize the internal implementation as appropriate if needed.
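The layering proposed here could look roughly like the following. `findfirst_pair` is a hypothetical name used only for illustration, not an existing or planned function; the sketch assumes Julia 1.x, where `findfirst` returns `nothing` when there is no match.

```julia
# Hypothetical helper: an index-and-value variant built on top of the
# index-only findfirst, which itself stays unchanged.
function findfirst_pair(pred, collection)
    i = findfirst(pred, collection)
    i === nothing && return nothing   # no element satisfies pred
    return i => collection[i]         # index => value
end

findfirst_pair(iseven, [3, 5, 8, 10])   # 3 => 8
findfirst_pair(iseven, [1, 3])          # nothing
```

Callers that only need the index keep using `findfirst` and pay nothing for the value lookup.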
From triage: does not need to be in 1.0.
https://github.com/JuliaLang/julia/pull/25654 renames `indmin` and `indmax` to `argmin` and `argmax`.
Julia user suggestion: please save others time by mentioning the `ind2sub` function in the docs for `argmin`/`argmax`.
```julia
A = [1 2; 3 5]
indmin(A) # 1
# but I want the rows and columns!
ind2sub(A, indmin(A)) # (1, 1) what I wanted
```
Something like "For multi-dimensional arrays, the `ind2sub()` function will convert the index to a tuple."
`argmin` now returns a `CartesianIndex` in this case, so that behavior has basically become the default.
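A quick sketch of that behavior, assuming Julia 1.x:

```julia
A = [1 2; 3 5]

idx = argmin(A)          # CartesianIndex(1, 1)
row, col = Tuple(idx)    # (1, 1), the tuple ind2sub used to give

A[idx]                   # 1; a CartesianIndex can index the array directly
```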
Note that `ind2sub(dims, ind)` has been deprecated to `(CartesianIndices(dims))[ind]` (which has the advantages that `CartesianIndices(dims)` is an immutable object which can be created once instead of for each scalar value of `ind` and that it can be used in indexing expressions).
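The replacement can be sketched as follows (Julia 0.7 and later); the array and variable names are illustrative only:

```julia
A = [1 2; 3 5]

# build the conversion object once and reuse it for many linear indices
cart = CartesianIndices(size(A))

cart[3]          # CartesianIndex(1, 2): linear index 3 is row 1, column 2
Tuple(cart[3])   # (1, 2), matching the old ind2sub((2, 2), 3)

# the result can be used directly in an indexing expression
A[cart[3]]       # 2
```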
Seems like we settled on something, given that 1.0 is not accepting deprecations anymore. If someone disagrees, post a comment on the further change needed.