Julia: Rename findmin and findmax?

Created on 30 Nov 2017 · 19 Comments · Source: JuliaLang/julia

findmin and findmax return a tuple with the index of the maximum/minimum, and the corresponding value. This behavior is kind of inconsistent with find, findfirst, findlast, findnext and findprev, which return indices without values. On the other hand, indmin and indmax return only the indices.
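To make the difference concrete, here is a minimal sketch. It assumes Julia 1.0 or later, where `indmin`/`indmax` are spelled `argmin`/`argmax` and `find` is spelled `findall`; the vector `v` and the variable names are made up for illustration.

```julia
# Assumes Julia 1.0+ (argmin/argmax instead of indmin/indmax).
v = [3, 1, 2]

# The find* family returns indices only.
first_hit = findfirst(isequal(1), v)   # index of the first element equal to 1
all_hits  = findall(x -> x > 1, v)     # indices of every element greater than 1

# findmin returns a (value, index) tuple rather than a bare index.
val, idx = findmin(v)

# argmin returns the index alone.
only_idx = argmin(v)
```

Note that the tuple returned by `findmin` puts the value first and the index second.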

It would be more consistent to rename indmin/indmax to findmin/findmax. Of course this is not possible in a single deprecation cycle. I can see two solutions:

  • Deprecate findmin and findmax in 0.7, and rename indmin and indmax to findmin and findmax in 1.0. This has the drawback of breaking things between 0.7 and 1.0.
  • Rename indmin and indmax to findmin and findmax in 0.7. This introduces breaking changes in 0.7, but at least there wouldn't be any changes in 1.0.

Finally, it's not clear to me whether we really need the current behavior of findmin/findmax, compared with getting the index and then extracting the corresponding element manually. The performance gain should only be significant for very small collections AFAICT.
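The manual alternative mentioned here can be sketched as follows, again assuming Julia 1.0+ names (`argmax` rather than `indmax`). The data is illustrative only.

```julia
# Assumes Julia 1.0+.
v = [0.7, 0.2, 0.9]

# Two steps: find the index, then look the element up.
i = argmax(v)
m = v[i]

# One step: findmax returns the same information as a (value, index) tuple.
m2, i2 = findmax(v)
```

For a plain array the extra lookup `v[i]` is cheap, so the two forms differ mainly in convenience; for containers where indexing is costly, the one-step form may matter more.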

search & find

All 19 comments

Isn't this basically what is usually called argmin and argmax (although those are not great names)?

Isn't this basically what is usually called argmin and argmax (although those are not great names)?

And indeed these were the first proposed names for these functions: https://github.com/JuliaLang/julia/pull/1655.

EDIT: see also discussion at https://github.com/JuliaLang/julia/pull/7327.

I still like argmin and argmax personally.

I still like argmin and argmax personally.

For the method returning just the index (current indmin and indmax), or for that returning both the index and the value (current findmin and findmax)? What name do you suggest for the other pair of methods?

Do those need to be separate methods? Is there a cost to always just returning both the index and the value?

As I noted in the OP, there's a "consistency cost" (if that's a thing) since find, findfirst, findnext, findlast and findprev all return a single index. That's not the end of the world, but IMHO it's better to use similar names for functions behaving similarly to make the language more predictable.

I think we may have to hammer this one out with @JeffBezanson when he gets back as part of https://github.com/JuliaLang/julia/issues/10593. I hate to break the deadline, but I think this needs his input. It will be easier to work through after we've gotten all the other stuff out of the way.

I'm ok with renaming indmin to findmin, but I would really like to keep the functionality of returning both a minimum and its index. There are plenty of structures where an extra lookup could be expensive (e.g. a database). We also have findmin! and N-d versions, so the findmin functions actually include quite a bit of functionality we don't want to drop.
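As a rough illustration of the N-d functionality referred to here, the sketch below uses the `dims` keyword form from Julia 1.0+ (an assumption about the version; older releases passed the region positionally and returned linear indices). The matrix is made up.

```julia
# Assumes Julia 1.0+, where reductions take a `dims` keyword.
A = [4 2;
     1 5]

# Minimum of each column, together with where it was found.
vals, inds = findmin(A, dims=1)

# `inds` holds CartesianIndex values, so it can index A directly.
same_vals = A[inds]
```

Getting both arrays from a single pass is the part that would be awkward to reproduce with an index-only function.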

Maybe we could call it minandind? That's a bit hard to read but I don't have a better idea yet.

I also think this is low-urgency and doesn't need to be done for 1.0.

FWIW, we have a series of functions with "and" in StatsBase: mean_and_std, mean_and_var and mean_and_cov (not sure about the underscores). Though they are a bit different since they compute two quantities rather than returning the index.

We could also find a different term to indicate the action of returning both the value and the index: fetch, pick, take, select?

Or minpair, to borrow terminology from the usage of Pairs elsewhere.
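For what it's worth, a function along those lines could look like the sketch below. `minpair` is purely hypothetical (it is not in Base), and the choice of `index => value` ordering is an assumption made for illustration; it is written against Julia 1.0+.

```julia
# Hypothetical helper, not part of Base: returns `index => value`
# for the minimum element, built on top of findmin.
function minpair(itr)
    val, idx = findmin(itr)
    return idx => val
end

p = minpair([3, 1, 2])
```

The result can then be taken apart with `p.first` (the index) and `p.second` (the value), or destructured as `i, x = p`.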

but I would really like to keep the functionality of returning both a minimum and its index.

I'd prefer that we generally move in that direction and return the value in the find family of functions. Returning the value together with the index should be really cheap in general.

That's a possibility, but that's going to be much more breaking than what I already have as part of #10593. That would also introduce an inconsistency with find (for which returning an array of values would be quite costly), and an inconvenience in the relatively frequent case where you don't need the value (in particular when using the equalto predicate, for which it's kind of ridiculous to return the value).

I think it would be simpler to have the find* functions return only indices as they do now (except for findmin/findmax), and later introduce new functions which would return both the index and the value. We could then reorganize the internal implementation as appropriate if needed.

From triage: does not need to be in 1.0.

https://github.com/JuliaLang/julia/pull/25654 renames indmin and indmax to argmin and argmax.

Julia user suggestion: please save others time by mentioning the ind2sub function in the docs for argmin/argmax.

A = [1 2; 3 5]
indmin(A) # 1
# but I want the rows and columns!
ind2sub(A, indmin(A)) # (1, 1) what I wanted

Something like "For multi-dimensional arrays, the ind2sub() function will convert the index to a tuple."

argmin now returns a CartesianIndex in this case, so that behavior has basically become the default.
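A small sketch of that behavior, assuming Julia 1.0+ and reusing the matrix from the example above:

```julia
# Assumes Julia 1.0+.
A = [1 2; 3 5]

ci = argmin(A)          # a CartesianIndex, not a linear index
row, col = Tuple(ci)    # unpack into row and column
smallest = A[ci]        # a CartesianIndex can be used directly for indexing
```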

Note that ind2sub(dims, ind) has been deprecated to (CartesianIndices(dims))[ind] (which has the advantages that CartesianIndices(dims) is an immutable object which can be created once instead of for each scalar value of ind and that it can be used in indexing expressions).
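The replacement can be sketched like this, assuming Julia 1.0+. The `LinearIndices` line goes the other way and is included only for comparison; it is not mentioned in the comment above.

```julia
# Assumes Julia 1.0+.
A = [1 2; 3 5]

# Build the lookup object once, then reuse it for any linear index.
cart = CartesianIndices(size(A))
third = cart[3]              # linear index 3 in column-major order

# The reverse mapping, from row/column back to a linear index.
lin = LinearIndices(A)
back = lin[third]
```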

Seems like we settled on something, given that 1.0 is not accepting deprecations anymore. If someone disagrees, post a comment on the further change needed.
