Julia: Renaming findn and findnz?

Created on 4 Dec 2017  路  14Comments  路  Source: JuliaLang/julia

As discussed in #24724 these behave slightly different from other functions so it might be worth renaming them as part of the general work described in https://github.com/JuliaLang/Juleps/blob/master/Find.md

search & find

Most helpful comment

I've just realized the definition of findnz is totally inconsistent currently. For SparseVector it returns entries which are actually different from 0 (like findn did), but for SparseMatrix it just returns stored entries (zero or not). This confusion is probably due to the poor naming choice.

So I can see two scenarios:

  • findnz is really about stored entries: move it to SparseArrays now, and rename it at some point to change "nz" into "stored" or "filled" (there are several functions with "nz" that in that module). Fix the SparseVector method to keep stored zeros.
  • findnz(x) is the equivalent of find(!iszero, x) except that it also returns the values. So it's really the same naming question as for findmin/findmax (https://github.com/JuliaLang/julia/issues/24865). Either deprecate it in favor of find(!iszero, x) just like we just did for findn(!iszero, x), and reintroduce something with a different name if people complain, or find a name for this pattern immediately.

EDIT: actually the inconsistency has been recently introduced by https://github.com/JuliaLang/julia/pull/24724, which forgot to fix SparseVector. So it appears we have chosen the first scenario.

All 14 comments

As noted at #24724, findn wouldn't be needed if find returned CartesianIndexes for matrices, as https://github.com/JuliaLang/julia/pull/24774/ does. findn(x) could then be find(!iszero, x), with a specialized sparse matrix method for efficiency.

findnz could be renamed to stored or filled, which would have the advantage of moving away from the too restrictive "zero"/"nonzero" terminology. nnz should also be renamed at the same time.

findnz could be renamed to stored or filled, which would have the advantage of moving away from the too restrictive "zero"/"nonzero" terminology. nnz should also be renamed at the same time.

馃憤 Revising the names of sparse array type fields and associated functions to better reflect present semantics (and possibly enable SparseVector-SparseMatrixCSC unification) is on my list of hypothetical post-1.0 projects, so this direction sounds great on this end :).

@Sacha0 What concrete changes do you think should happen before 1.0 here? Renaming findnz to stored and nnz to something else?

How about Sparse.stored and Sparse.nstored?

@Sacha0 What concrete changes do you think should happen before 1.0 here? Renaming findnz to stored and nnz to something else?

The findn(x) -> find(!iszero, x) change sounds lovely and realistic for 1.0. Is that change pending or has it already gone through?

findnz(A) generates a COO (coordinate) representation of A from its nonzeros, independent of what A stores. A name reflecting that behavior seems best? (stored does not quite capture that behavior, and perhaps should be reserved for other uses in a future overhaul of sparse arrays.)

nnz -> numstored is what I had in mind for a future overhaul; I prefer numstored's clarity to nstored :). Best!

The findn(x) -> find(!iszero, x) change sounds lovely and realistic for 1.0. Is that change pending or has it already gone through?

It's still pending, blocked by https://github.com/JuliaLang/julia/pull/24774. I think we should merge that PR anyway and take care of performance later.

findnz(A) generates a COO (coordinate) representation of A from its nonzeros, independent of what A stores. A name reflecting that behavior seems best? (stored does not quite capture that behavior, and perhaps should be reserved for other uses in a future overhaul of sparse arrays.)

OK. Do we really need both findn (or it's replacement) and findnz? I'm not very clear about their differences.

nnz -> numstored is what I had in mind for a future overhaul; I prefer numstored's clarity to nstored :).

I prefer nstored, since "num" evokes "numeric" and because we already have e.g. nfield. But that's not a big deal.

To be honest, I wouldn't mind some help for issues which are not blocked by progress on my other PRs. :-)

It's still pending, blocked by #24724.

That's merged already though...

OK. Do we really need both findn (or it's replacement) and findnz? I'm not very clear about their differences.

Insofar as I am concerned, no. I can't remember the last time I used findnz; usually a better way to achieve whatever you want to achieve with it exists :).

That's merged already though...

Woops, sorry, I meant https://github.com/JuliaLang/julia/pull/24774.

https://github.com/JuliaLang/julia/pull/25532 deprecates findn. findnz could either be also deprecated, or moved to the new SparseArrays stdlib module and possibly renamed.

Only findnz remains at issue with #25532 merged? Thoughts regarding the best course of action with findnz? Best!

Thoughts regarding the best course of action with findnz?

Kill it with fire. Well, at least rename it 馃槢 鈥撀爐he name doesn't at all suggest what it does.

I've just realized the definition of findnz is totally inconsistent currently. For SparseVector it returns entries which are actually different from 0 (like findn did), but for SparseMatrix it just returns stored entries (zero or not). This confusion is probably due to the poor naming choice.

So I can see two scenarios:

  • findnz is really about stored entries: move it to SparseArrays now, and rename it at some point to change "nz" into "stored" or "filled" (there are several functions with "nz" that in that module). Fix the SparseVector method to keep stored zeros.
  • findnz(x) is the equivalent of find(!iszero, x) except that it also returns the values. So it's really the same naming question as for findmin/findmax (https://github.com/JuliaLang/julia/issues/24865). Either deprecate it in favor of find(!iszero, x) just like we just did for findn(!iszero, x), and reintroduce something with a different name if people complain, or find a name for this pattern immediately.

EDIT: actually the inconsistency has been recently introduced by https://github.com/JuliaLang/julia/pull/24724, which forgot to fix SparseVector. So it appears we have chosen the first scenario.

https://github.com/JuliaLang/julia/pull/25641 moves findnz to the SparseArrays module.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

musm picture musm  路  3Comments

manor picture manor  路  3Comments

iamed2 picture iamed2  路  3Comments

sbromberger picture sbromberger  路  3Comments

i-apellaniz picture i-apellaniz  路  3Comments