In #33054 I proposed exposing an AbstractSparseMatrixCSC interface so that package developers can easily write SparseMatrixCSC-like custom matrices (without having to implement all the complex functions, including the broadcasting machinery). @ViralBShah asked in https://github.com/JuliaLang/julia/pull/32953#issuecomment-524173844 whether it makes sense to deprecate field access. The rationale is that it would push package authors toward the AbstractSparseMatrixCSC interface, so that their code works with subtypes other than SparseArrays.SparseMatrixCSC. This is technically OK since those fields are not part of the public API. (Edit: They may not be considered completely private. See: https://github.com/JuliaLang/julia/issues/33056#issuecomment-524583635)
Since accessors like nonzeros have existed for more than 4 years, since Julia 0.4 (#8720), I'd assume that all existing, maintained code bases have already migrated to the accessor methods. If that's the case, this deprecation would not be very destructive (although its effect would also be small).
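For readers who have not used the accessors, here is a minimal sketch of field-free code. It computes a matrix-vector product using only `nonzeros`, `rowvals` and `nzrange`, which SparseArrays exports; the function name `accessor_matvec` is made up for illustration.

```julia
using SparseArrays

# Hypothetical helper: y = A * x written against the accessor functions
# rather than the `colptr`, `rowval` and `nzval` fields.
function accessor_matvec(A::SparseMatrixCSC, x::AbstractVector)
    rows = rowvals(A)    # row index of each stored entry
    vals = nonzeros(A)   # value of each stored entry
    y = zeros(promote_type(eltype(A), eltype(x)), size(A, 1))
    for j in 1:size(A, 2)
        for k in nzrange(A, j)   # positions of the stored entries of column j
            y[rows[k]] += vals[k] * x[j]
        end
    end
    return y
end

A = sparse([1, 2, 3, 1], [1, 1, 2, 3], [10.0, 20.0, 30.0, 40.0], 3, 3)
x = [1.0, 2.0, 3.0]
y = accessor_matvec(A, x)
```

Code written this way never touches a field name, so it is unaffected by whatever is decided about field access.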
As a side note, I added getproperty definitions for SparseMatrixCSC and SparseVector to the test suite in #32953 so that we can enforce a don't-use-fields rule in future development of SparseArrays.jl itself. So, enforcing the rule inside SparseArrays.jl is not a strong enough argument for banning field access for users.
(ping @fredrikekre as you reacted in the previous discussion)
Thinking about this a bit after the previous discussion https://github.com/JuliaLang/julia/pull/32953#issuecomment-524173844, I'm now somewhat against deprecating or banning field access. It is trivial for new AbstractSparseMatrixCSC subtypes to implement a SparseMatrixCSC-compatible getproperty. So, making an old code base compatible with (SparseMatrixCSC-compatible) AbstractSparseMatrixCSC subtypes is as easy as relaxing the type constraints. Considering that an unknown number of public and private code bases use SparseMatrixCSC, I don't think it makes sense to break them just to notify them that we now have AbstractSparseMatrixCSC.
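To make the "trivial getproperty" claim concrete, here is a rough sketch. It assumes a Julia version where `SparseArrays.AbstractSparseMatrixCSC` and `SparseArrays.getcolptr` exist as proposed in #33054; the type `MyCSC`, its field names and `legacy_nnz` are hypothetical.

```julia
using SparseArrays

# Hypothetical custom type whose field names differ from SparseMatrixCSC's.
struct MyCSC{Tv,Ti<:Integer} <: SparseArrays.AbstractSparseMatrixCSC{Tv,Ti}
    nrows::Int
    ncols::Int
    colpointers::Vector{Ti}
    rowindices::Vector{Ti}
    values::Vector{Tv}
end

# Accessor-style interface (assumed to be what #33054 asks subtypes to provide).
Base.size(A::MyCSC) = (getfield(A, :nrows), getfield(A, :ncols))
SparseArrays.getcolptr(A::MyCSC) = getfield(A, :colpointers)
SparseArrays.rowvals(A::MyCSC) = getfield(A, :rowindices)
SparseArrays.nonzeros(A::MyCSC) = getfield(A, :values)

# SparseMatrixCSC-compatible property access: map the classic names to our fields.
function Base.getproperty(A::MyCSC, name::Symbol)
    name === :m      && return getfield(A, :nrows)
    name === :n      && return getfield(A, :ncols)
    name === :colptr && return getfield(A, :colpointers)
    name === :rowval && return getfield(A, :rowindices)
    name === :nzval  && return getfield(A, :values)
    return getfield(A, name)
end

# "Old" field-based code: the only change is the relaxed type constraint
# (previously `A::SparseMatrixCSC`).
legacy_nnz(A::SparseArrays.AbstractSparseMatrixCSC) = A.colptr[A.n + 1] - 1

S = sparse([1, 2, 3], [1, 1, 3], [1.0, 2.0, 3.0], 3, 3)
M = MyCSC(S.m, S.n, S.colptr, S.rowval, S.nzval)
```

With this in place `legacy_nnz` accepts both `S` and `M` without any change to its body.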
I would personally be in favour of deprecating and generally having everyone use accessor methods for all sparse data structures going forward.
But, at the same time, we probably have many other things to do in sparse matrix land and this is perhaps not the most pressing issue. While the implementation of this is straightforward, it may cause quite a bit of follow-up work as it propagates through the packages.
How about we leave this issue open to collect comments for now (while we focus our attention on other things)?
+1 for leaving it open and focusing on other issues. We can come back to this anytime later.
Actually, the fields of SparseMatrixCSC are mentioned in the documentation
It explicitly says "internal representation" so I don't know if it is considered a public API though.
I believe those are documented for the purpose of explanation. Internal representation certainly means "do not count on these".
The only way to use and modify SparseMatrixCSC efficiently in many situations is to use these fields, so they are de facto public. Just removing them or changing their names is likely to be very disruptive to the ecosystem (grepping through packages for .colptr etc. should make this clear).
There are like five issues/PRs discussing changing the sparse array APIs at this point. And the one that Simon opened with the clearest plan has been closed in favor of a PR with an unclear plan and breaking changes that we definitely cannot make. Can we reopen the original plan issue and actually come up with a coherent plan before making any more half-baked PRs changing things?
The one that Simon had was not about API but rationalization of field names, which is something we are clearly not doing. I don't see the point about opening up that discussion again.
What half-baked PRs are you talking about? It can be confusing to follow along since the work is spread out across several PRs, but I don't see how you can label these contributions as half-baked.
> There are like five issues/PRs discussing changing the sparse array APIs at this point.
That's why I closed #33050 in favor of #33054 (see https://github.com/JuliaLang/julia/pull/33050#issuecomment-524616744). I thought keeping this issue open made sense, as #33054 is about API addition, not deletion.
> Simon opened with the clearest plan has been closed in favor of a PR with an unclear plan and breaking changes that we definitely cannot make.
There is nothing breaking in #33054 at all. Or at least that has been my intention. Please comment in #33054 if you find anything breaking.
I apologize about calling your PRs half-baked. It was uncalled for and shitty of me. I appreciate all your work on these things and don't want to discourage it. I'm frustrated about a few things:
My proposal: either reopen Simon's issue proposing better names or create a new issue (not a PR) to discuss what to do about the sparse APIs; come to some kind of agreement on that issue and only then set about executing the plan.
I do not think that we can break existing code that accesses fields, but we can rename fields and provide getproperty methods that allow the old names to continue to work. So pick the names you want, change them to those, make the old ones work, update the docs.
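A rough sketch of what "rename the fields but keep the old names working" could look like, shown on a toy struct rather than the real SparseMatrixCSC. The type `RenamedCSC`, the new field names and the `OLD_TO_NEW` table are all hypothetical; no names have been agreed on in this thread.

```julia
# Toy stand-in for a CSC container after a hypothetical field rename.
struct RenamedCSC{Tv,Ti<:Integer}
    m::Int
    n::Int
    colpointers::Vector{Ti}   # formerly `colptr`
    rowindices::Vector{Ti}    # formerly `rowval`
    values::Vector{Tv}        # formerly `nzval`
end

# Old name => new name. Anything not listed is looked up unchanged.
const OLD_TO_NEW = (colptr = :colpointers, rowval = :rowindices, nzval = :values)

# Forward old property names to the renamed fields so existing code keeps working.
function Base.getproperty(A::RenamedCSC, name::Symbol)
    return getfield(A, get(OLD_TO_NEW, name, name))
end

# Advertise both the new and the old names (e.g. for tab completion).
function Base.propertynames(A::RenamedCSC, private::Bool = false)
    return (fieldnames(typeof(A))..., keys(OLD_TO_NEW)...)
end

R = RenamedCSC(2, 2, [1, 2, 3], [1, 2], [5.0, 6.0])
```

Here `R.colptr` and `R.colpointers` return the same array, so code using either spelling behaves identically. A real migration might also want a deprecation warning on the old names, which this sketch leaves out.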
I have reopened https://github.com/JuliaLang/julia/issues/25118 that @simonbyrne opened about fieldnames for SparseMatrixCSC.
@StefanKarpinski I think I understand your frustration about the linear algebra / sparse arrays code base. I imagine many maintainers and contributors share similar feelings. (Off topic, but it would be nice if these stdlibs could move to a Pkg-like development mode, which may mitigate some of the issues.)
I summarized the recent activity related to SparseMatrixCSC in the issue opened by @simonbyrne: https://github.com/JuliaLang/julia/issues/25118#issuecomment-524654446. It would be nice if you could share your thoughts about it there.