Julia: Document privacy of type-fields

Created on 8 Jul 2015 · 85 Comments · Source: JuliaLang/julia

It is not documented that fields of types are considered private (at least by some people), as @nalimilan mentioned over in julia-users. It's probably worth giving some guidelines in the documentation on when fields are considered private, since that is probably not always true, for instance when a type is used just to store data.

Also, a leading underscore is sometimes used to denote privacy of methods or (extra?) privacy of fields. When should _ be used vs. just assuming privacy of fields? When should methods use a _?

It is probably worthwhile to reach some sort of consensus about this and document it before going forward with field overloading (#1974 and PR #5848).

doc

All 85 comments

Tagging all fields you want to keep private with a _ is a lot more of a bother than simply adding public/private keywords to the definitions where you can actually get a warning or error if misused.
If the default is that fields of types should be considered private (something I've yet to see in the Julia codebase or packages), then maybe using a "public" keyword would be better.
Simply documenting something that is not followed in practice is not really that helpful.

@ScottPJones: I think that is a separate issue: language feature vs documentation/convention. (Not sure whether one has been opened yet).

Field overloading would give the same flexibility to the dot notation as the @property annotation in Python, right? In that case we could just adopt the same convention as Python, fields are considered public by default unless they start with an underscore. Direct field access is no longer problematic since the implementation can now change in a backwards compatible way by overloading the field access.
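As a rough illustration of the backwards-compatibility argument above, here is a minimal sketch written for Julia 1.x. It assumes `Base.getproperty`, which is how dot-access overloading eventually shipped (Julia 0.7 and later), rather than the `getfield` overloading being discussed in this thread. The `Circle` type and its fields are hypothetical.

```julia
# Hypothetical type: an earlier version stored `radius`; this version stores `diameter`.
struct Circle
    diameter::Float64
end

# Keep `c.radius` working for existing callers by computing it on access.
function Base.getproperty(c::Circle, name::Symbol)
    if name === :radius
        return getfield(c, :diameter) / 2
    else
        return getfield(c, name)   # all real fields pass through unchanged
    end
end

c = Circle(4.0)
println(c.radius)    # computed from the stored diameter
println(c.diameter)  # real field
```

Callers that wrote `c.radius` against the old layout keep working, which is the same role `@property` plays in Python.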

Yes, that should be opened as well. I still stand by my last point, that documenting something that people simply don't follow, or feel is incorrect (see @tknopp's comments about immutables in the same julia-users discussion) isn't going to do much good. The horse has already left the barn.

I'm in favor of a _ convention as well. It would also be easy enough to add it to Lint.jl. We might also want to use the same convention for functions that are private to a file. @ScottPJones: The public/private syntax always seemed a bit verbose to me, but I could be convinced otherwise.

If the default for all fields of a type is private, then you are simply adding 7 characters "public " to those fields. On the other hand, if you use a _ convention (which doesn't really buy you any assurance at all that people won't use your internals!), you'll be adding a _ for most every field, and most every field reference. I think _ is a _lot_ more verbose in practice.

Also, for 29 years I worked on a language with probably >100,000 programmers using it, and before we added public/private (default private for functions in a module) (which was about 18 years ago, IIRC), we constantly had problems with programmers using internal interfaces, and then complaining that we broke things they depended on. I'd like the programmer community of julians to get to be as large or larger, and I'd hope that julia does not repeat mistakes I've had to painfully deal with in the past.

Why do you only consider the PoV of the writer of a library (/piece of code) ?

I've been in the opposite situation many times when I'm using someone's code and they made some things private because, after all, it's "good practice" to hide most of your internals. However, I know what I'm doing, I read the source, the compiler could do it but refuses because of some keyword ?

In other words, it's not only mindless idiots using your pristine code, sometimes you are also using some mindless idiot's code ;-)

As currently envisioned, I think getfield/setfield overloading would facilitate a privacy convention for people who want it, without any other changes to the language. To @carnaval's point, fields would still technically be accessible using Core.getfield (or whatever it ends up being called), but anyone who does that will own their breakage.

No, I'm actually considering both. As a user of somebody else's code, I'd like to know just what the real API is, what the contract I have with the module / library / package.
That way I don't waste any of my time when the owner of that package decides to rewrite everything.

If you see that the code you are using does not provide some needed functionality in its API, then what you do depends on whether it is open source. For closed source, you'd file an enhancement request (that's what I would get from customers); for open source, you'd raise an issue (if you don't know how to fix it yourself), or submit a PR (which a smart guy like you would probably do!). For open source, you'd even have the option of forking the darn thing if the author(s) are not responsive (or died, or whatever), and people could start using your new improved version that does have the functionality you want.

Mucking around in the internals only helps you, makes your code more fragile, and doesn't help anybody else. Doing the above helps the entire community.

I've been on both sides of the fence throughout my career, and I've had to deal with my own share of mindless idiots! (luckily, I haven't run into any yet in the julia community [we may disagree, yes, but I do know they are brilliant])

Maybe this could be handled like deprecations. privatewarn == 0 means no warnings,
privatewarn == 1 means a single warning, privatewarn == 2 means give an error if some mindless idiot is mucking about in my beautiful code! :grinning:
Would that make this not such a bother to you?
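The three levels proposed above could look roughly like the sketch below. Everything in it is hypothetical: `PRIVATEWARN`, `ALREADY_WARNED`, and `check_private_access` are illustrative names, not an existing Julia flag or API, and "a single warning" is interpreted here as one warning per name.

```julia
# Hypothetical setting mirroring the 0/1/2 levels proposed above; not a real Julia flag.
const PRIVATEWARN = Ref(1)
const ALREADY_WARNED = Set{Symbol}()

# Called (hypothetically) whenever code outside a module touches a private name.
function check_private_access(name::Symbol)
    level = PRIVATEWARN[]
    if level == 2
        error("access to private name $name")
    elseif level == 1 && !(name in ALREADY_WARNED)
        push!(ALREADY_WARNED, name)          # warn only once per name
        @warn "access to private name $name"
    end
    return nothing                           # level 0: stay silent
end

check_private_access(:_buffer)   # warns
check_private_access(:_buffer)   # silent, already warned
```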

The road to C++/Java is paved with good intentions. -1e6 to any sort of public/private wankery.

@ihnorton The problem with that is, it is still just a convention, and is not easy to find out if people are breaking that convention. Also, having to use .. every time I just want to access the fields in my own types directly would be incredibly annoying, IMO.
@nolta The Julia code base and packages already have problems, because it has no mechanism to keep the abstraction and the implementation separate. This has nothing to do with C++/Java.
This is more about avoiding the "object orgy" that happens in a lot of dynamic languages. See https://en.wikipedia.org/wiki/Object_orgy
I also see this as allowing people to get a bit closer to the niceness that CLU had, _if they want to_.

getfield(::MyType, ::Any) = error("No!") would be something more than a convention. The existence of Core.getfield is an escape-hatch. If people use that, it's not my/your/our problem.
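A minimal sketch of that idea for Julia 1.x, under the assumption that the overloadable hook is `Base.getproperty` (what later shipped) and that `getfield` remains the non-overloadable escape hatch. The `Account` type, its fields, and the `balance` accessor are hypothetical.

```julia
# Hypothetical type; the leading underscore marks `_balance` as internal by convention.
struct Account
    owner::String
    _balance::Float64
end

# Refuse dot access to underscore-prefixed fields; everything else passes through.
function Base.getproperty(a::Account, name::Symbol)
    if startswith(string(name), "_")
        error("field $name of Account is private")
    end
    return getfield(a, name)
end

# Code implementing the type reads the field with getfield.
balance(a::Account) = getfield(a, :_balance)

acct = Account("alice", 10.0)
println(acct.owner)                  # public field
println(balance(acct))               # public accessor
println(getfield(acct, :_balance))   # escape hatch: works, but you own the breakage
```

Note the cost raised elsewhere in the thread: the type's own methods also have to go through `getfield`, because the overload cannot tell who is asking.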

> Also, having to use .. every time I just want to access the fields in my own types directly would be incredibly annoying, IMO.

so use setters and getters...

I tried to follow this already on the mailing list and asked myself: where did i follow this or the other convention in the last ~30 years of programming? I entered object orientation late and always found the private/public differentiation as something obscure. I understand where it comes from and why it's really, really needed, but in _writing_ code and especially in rapid prototyping it's defining a speed limit.

Two things come here to my mind:
1) A real programmer can write fortran programs in any language -> If you try to stop people from expressing their ideas with certain language constructs, they'll certainly find ways around.
2) In SW Engineering all problems are communication problems -> If you cannot transport the message, don't use this, well...

tl;dr: a convention for the name that can be checked by a lint should be enough.

@ihnorton Why would I want the extra complication of adding setters and getters, just to access my own internal structures? That seems like a waste of my time.

@lobingera How do you run lint on the programs that hundreds of thousands of people have written, that you have no access to?
What happens if you make what you think is a minor internal change, and you break software running all over the world? (which can have severe economic effects as well, if your company is selling software).

@ScottPJones

Well, i somehow believe in the superiority of Open Source. Therefore the situation that i have no access to the 'other' code doesn't happen.
But actually i meant, having a strong warning and structural checking system on my side, when i contribute code should be enough.

@ScottPJones I'm still in favor of the convention and linting approach because it seems like the path of least resistance. However, I suppose we could have a pub keyword that autogenerates the setters and getters for you. I don't think adding public/private is really going to solve your problem if you aren't using test cases, linters, etc. that should be finding most of these issues. Anyone who has used C++ knows that it can be extremely easy to break your software with a minor internal change ;) (e.g. a memory leak).

-1 to language-level enforcement of private/public. This is a language
primarily for rapid implementation of scientific algorithms; not Java.

Yeah, a huge -1 to mandatory access enforcement from me too. In C++, they only get in the way if you're trying out something new before you know what the right interface is. And if you do know what the right interface is, having to access private fields is a code smell that could be caught by Lint.

@Rory-Finnegan the convention and linting approach does nothing to help if you don't have access to the code _using_ the modules/packages that you write. Before we added the ability to _optionally_ use public/private tags for functions, properties, and methods, we used to waste a very large amount of time due to customers having abused some code that was supposed to be internal only (to say nothing of the problem of said customers being upset that their wonderful code stopped working).

@malmaud Why should julia be limited to scientific computing? That seems very narrow. Julia seems to me to be able to replace C, C++, Java, and Python for most of my programming (and I'm not doing scientific computing).

@keno What is the problem with an optional feature, that the author of a module can use or not as they see fit? Also, if it is controlled via a switch like depwarn, it can be turned completely off for "rapid application development" (that could even be the default setting). There is nothing "mandatory" about what I've been proposing for Julia.

> I understand where it comes from and why it's really, really needed, but in writing code and especially in rapid prototyping it's defining a speed limit.

@lobingera So, you do understand that encapsulation _is_ really, really needed.
I've never found that the _ability_ to mark things as internal (private) or externally visible ever slowed down my writing any code, nor did it slow down any of the customers (they were very happy about it, after it was introduced) (and the language I worked on was for precisely "rapid application development").
The customers were all writing code that had to be maintained reliably for decades, in mission critical applications (hospitals, banks, on-line trading, etc.).

To me, always having to run Lint while I'm developing something, is a much bigger speed limit than simply having the compiler warn me immediately.

I think, this could be done somewhat like this:

  1. Non-annotated fields are a grey area and are always accessible (Lint can have a switch to add warnings for non-annotated fields, as that warning should not be the default).
  2. A keyword private or some other keyword is created that marks it as a private field. This field can be accessed then by implementing either getfield, setfield! or both depending on your needs. This would then allow for the field to be validated as it is being set externally for example or creating a read-only field.

So this would be something extra that you only would add if you felt it was necessary for your work. I personally would never use it myself. However, if I wanted to use Julia in a field such as finance... I would be far more careful about encapsulation. This would simply be a way of enforcing encapsulation when necessary. I think it would be a useful feature but one that would not and should not be used very often. This way would also not break anything and not change anyone's workflow.
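The validated and read-only fields described above can be approximated in Julia 1.x without a new keyword. This is only a sketch under assumptions: it uses `Base.setproperty!`, which did not exist when this thread was written, and the `Thermostat` type and its allowed range are made up for illustration.

```julia
# Hypothetical mutable type used only for illustration.
mutable struct Thermostat
    id::Int
    target::Float64
end

function Base.setproperty!(t::Thermostat, name::Symbol, value)
    if name === :id
        error("field id is read-only")                  # read-only field
    elseif name === :target
        # Validate before storing; the 5..35 range is an arbitrary example.
        5 <= value <= 35 || throw(ArgumentError("target must be between 5 and 35"))
        return setfield!(t, :target, convert(Float64, value))
    end
    return setfield!(t, name, value)
end

t = Thermostat(1, 21.0)
t.target = 25          # accepted and converted to Float64
println(t.target)
```

Reads are untouched here, so `t.id` and `t.target` still work with plain dot syntax; only writes are checked.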

However, this isn't urgently needed. Right now I don't think very long running or critical systems are being written in Julia. I also feel that Julia is useful outside of scientific computing. That currently though is not its focus. This probably should be revisited at a later date once field overloading has been implemented as without it this can't be implemented as nicely.

> Yeah, a huge -1 to mandatory access enforcement from me too. In C++, they only get in the way if you're trying out something new before you know what the right interface is. And if you do know what the right interface is, having to access private fields is a code smell that could be caught by Lint.

it certainly doesn't stop LLVM from iterating their API continuously. additionally, in many cases, they enforce the public/private distinction via public / private headers – i'm not sure how well that transfers to a language based on dynamic reflection instead of headers, however.

Could this be done in a package? E.g. with a @private macro? Taking a keyword seems a mite premature.

As mentioned, a _ prefix is used in Python to denote "private", without enforcement. Python's done without.

@vtjnash No, this isn't a cure-all, but at least in my experience, having the _option_ of enforcing at least some level of encapsulation is critical to building reliable systems. Since I also stated that the default could be to simply not even warn or give error messages (a la depwarn), it wouldn't affect anybody who didn't need this functionality.
Just because the LLVM developers seem to like to change things every release (and there seems to be an endless stream of bugs that Julia has to deal with) doesn't mean that that is a _good_ thing!
How LLVM deals with public/private distinctions I think is totally orthogonal to how Julia might deal with it.

@hayd if it can be done with say @public and @private macros, that would be just fine, I really don't care so much about the syntax, but rather the functionality I need to be able to deliver reliable systems that will still be working in 30 years time. I still don't have my head wrapped around how to deal with Julia meta-programming yet, so I have no idea what is possible.

@Mike43110 From conversations I had at JuliaCon, I think there are actually a number of people who would welcome anything that can be done to help writing maintainable, reliable, large applications in Julia. We (myself and the Belgian company I'm working for) are definitely interested in using Julia already for critical systems (which is why I take these issues so seriously).

Although Python gets by fine without any enforced privacy, Python is also notorious for being poorly suited for writing large scale reliable systems - lack of privacy of internals is only one factor among many, but it contributes.

@hayd As @tkelman noted above (in the julia-users thread), Python's unenforced _ convention really is not enough.

You also neglected to quote my other point. Prototype implementation or bust. Not worth spilling bits talking about it.

I responded to that in the julia-users group.
It's not spilling bits, I think it's been quite useful.
I wouldn't have thought that it would be possible to do this as a macro, not knowing all their ins-and-outs yet, but if so, then that would be fine for a prototype (like traits now, I suppose).

@ScottPJones @public and @private please! I am sure those people aren't interested in this.

I don't know enough about macros to know if it would be possible. It would make for a good prototype if possible though.

@ScottPJones: Please stop quoting things out of context; this is very misleading. I have made pretty clear that for mutable types it is common practice in julia that fields are private and that for immutables this is not entirely clear.

+1 for documenting the common practice and working on better interface support

Julia is already an excellent language for writing maintainable large scale applications. The type system including subtyping of abstract types helps a lot in this.

For what little that my opinion matters, -1 to making information hiding a part of the core language. +1 for better documenting what we consider to be idiomatic Julia, perhaps even a manual page on "Writing Idiomatic Julia Code".

I would also like to propose "Access Equality" to replace "Consenting Adults", putting myself firmly in the pro-equality camp.

@tknopp I never quoted you; I just said to look at your comments in the julia-users threads about immutables. How can that be "out of context", when I said to look at the context? You did fairly clearly state there: "Note that for immutable the fields are(!) the interface."

@ninjin If the ability to hide the implementation, in order to maintain a separation between abstraction and implementation, were simply optional, why would that be a problem for you?
I haven't said that making anything private would be the default for julia, just that I think the capability should be _available_, for those of us who want encapsulation.

Do people see it as a problem that they can't go directly accessing the internals of julia's boxing/unboxing and type system? (at least, I hadn't seen that so far). To me, that's a good example of where having the implementation details protected is a good thing.

I do like your proposed "Access Equality" terminology - I just don't think it is as black and white as people seem to be thinking, i.e. of us vs. them, this camp or that, Team Hide-Things vs. Team Everything-Goes.

> If the ability to hide the implementation, in order to maintain a separation between abstraction and implementation, were simply optional, why would that be a problem for you?

Because more than once have I, and most likely all of us in the equality camp, encountered a well-meaning library/package designer that hid just the portion that we needed to tinker with. Sure, we could fork or submit a patch, but at the end of the day we just want to get work done and are prepared to take the potential breakage. Ultimately, giving the option to hide information will then just result in adding a way to unhide said information, thus just making things more complex, this is why I am against even an optional way to do it. @nolta put it best, although maybe a bit bluntly, "The road to C++/Java is paved with good intentions.".

I'd say, the road to constant breakage is paved with good intentions.

+a lot to documenting the convention that fields are private by default (unless documented otherwise). I definitely agree that it's important to be able to define the public interface to your code clearly, so that you are free to change the internals.

But when it comes to a mechanism for public/private fields, has anyone thought about how it could even be implemented in Julia? In a traditional object oriented language you can restrict access to private fields to the methods of the same class. But in Julia, it's not so obvious to the compiler which code is part of the implementation of which types (though it should hopefully be to the programmer). I'm not sure there's a practical definition of when an access is inside the implementation, and thus is fine even for private fields.

Could we first answer the fundamental question: Are fields considered to be part of the interface of a (mutable) type?

My vote is no. And the array/iteration interfaces are examples for this rule.

I agree, for mutables and immutables. Though it has to be possible for the author of a type to make an exception.

With field access overloading I would vote yes (just like in Python). The reason for this is that the dot syntax is so effin convenient and with overloading, changes to the implementation can be made in backwards compatible ways. This is similar to properties in Python, for a short summary see: http://blaag.haard.se/What-s-the-point-of-properties-in-Python/.

"Private" fields can then by convention start with an underscore.

@tknopp, @KristofferC. I'd vote Yes on field access i.e. putting no additional effort to restrict field access. But still a warning should be configurable that shows you are trying to access a field of a type within a module / outside the current scope or a field marked as _.

I think we need to distinguish between the question of whether we should put any effort in restricting field access, and whether fields should be fundamentally considered part of the public interface. As I understand it, @tknopp was going for the latter.

If fields are never a part of the interface, I am scared this leaves us with getter/setter hell for cases where you have a type and you actually want to manipulate/access the fields of the type. It also means we basically "waste" the dot-syntax, which is such a convenient syntax. Compare:

get_vertices(get_element(mesh, n), 2)

and

mesh.elements[n].vertices[2]

To note, I have mostly programmed in Python so that is the (limited) mindset I have when I write this.

@KristofferC Your counter-example is a strawman. That would more likely look like

vertices(elements(mesh, n), 2)

or

vertices(elements(mesh)[n])[2]

The latter being almost exactly identical to the dot syntax (with two more characters).
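To make the comparison concrete, here is a small self-contained sketch for Julia 1.x. The `Mesh` and `Element` types and the accessor definitions are hypothetical; only the names `mesh`, `elements`, `vertices`, and `n` come from the comments above.

```julia
# Hypothetical mesh types matching the names used in the comments above.
struct Element
    vertices::Vector{Int}
end

struct Mesh
    elements::Vector{Element}
end

# Accessor functions named after the fields they expose.
vertices(e::Element) = e.vertices
elements(m::Mesh) = m.elements

mesh = Mesh([Element([1, 2, 3]), Element([2, 3, 4])])
n = 2

a = mesh.elements[n].vertices[2]      # direct field access
b = vertices(elements(mesh)[n])[2]    # accessor functions
println(a == b)
```

Both forms index the same data; the difference is that the accessor version leaves the author free to change how `Mesh` stores its elements.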

Thanks @toivoh. That is exactly what I mean. These are independent things.

@KristofferC: This is not common Julia practice. It's under discussion in #1974. I think it is very important to drive this discussion from what is used today as a common practice. And until there is a majority of core maintainers that want the array length to be accessible by x.length (as an example) the status quo is that one writes functions to access fields of a type.

@nalimilan
The only thing you changed was renaming the getter? Naming the getter the same as the field has the problem of polluting the name space. You can no longer use the variable n and m in a function that uses sparse matrices because you then lose n(A) for the nr of cols. It could also be argued that the number of characters (and annoyance) of writing getters and setters for every "public" field should count against the function syntax. Anyway the point I tried to make had nothing to do with the number of characters.

I am a bit confused how you can say the function expression and the dot expression look the same. To me they look nothing alike. The problem I have with parsing the function expression is that the number n and 2 gets moved away from the object they are applied to. This means you always have to mentally parse the delimiters. With the dot syntax the item number and the collection are always next to each other and you just walk down the chain.

My personal opinion is that it is a bit sad to go the Java way here when this is a problem that (imho), for example, Python has already solved nicely with its "properties".

Far more effort has been put into this discussion than it would have taken to just prototype it; since I hate to see bits needlessly wasted, I went ahead and did that.

PrivateRyan.jl

/thread (except for the documentation bit, that's actually useful)

@tknopp Having length as a function makes sense because it has a meaning by itself. For many types length is not a field and you might have to traverse a list or something.

On the other hand we have n. Without the type SparseMatrixCSC, n has no meaning. This is also reflected in the functions taking a sparse matrix where A.n and A.colptr etc is used heavily.

I therefore think it is sensible to separate functions which a lot of types are likely to overload and fields which are unique for the type.

@ScottPJones (about lint and encapsulation)

Have you recently recognized that julia tries hard to hide the compiler from you? julia has the look and feel of a scripting language and encourages working at the REPL. I change and evaluate for a long time until the code somehow converges into something usable and reusable, which is then copy+pasted into a .jl file or turned into a module. Along the way i might run expressions through something like lint and also expect infrastructure in the editors i use to run code checks in the background. So much for the immediate output of the compiler.

As for 'only' marking something as non-publicly accessible: you should already have recognized (and if you actually have a background in this lexer/parser/AST/compiler business you should know) that adding an optional keyword per field changes the language syntax - this is different from marking via name change -> _fieldname - and has some drastic impact on the low-level implementation of the JIT compiler. It's doable but takes some effort to do correctly. Julia would allow doing things like this via a macro - and that would be my recommended way if you really, really need it for your work. But the style of the discussion leads me to the conclusion that this should be somehow made mandatory.

And i fail to see the gain. Encapsulation replaces communication about the usage of the interface with enforcement of the interface. All the effort i'd rather spend on communicating the use of the interface e.g. docstrings, interface documentation etc.

But i'm quite aware, that all this is from my background, but i somehow agree with @nolta here...

@lobingera Note that you can click the time and paste the link to a single comment. The time (20h ago) will change unit/precision in a few hours and people won't be able to tell what you are replying to.

Edit: like this https://github.com/JuliaLang/julia/issues/12064#issuecomment-119933926

I see.

@KristofferC I don't think there is much need to use the internals of sparse matrices. n and m you should access through size, for most all uses nzrange, rowvals and nonzeros should give you all you need.
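As an example of working through those accessors only, the sketch below computes a matrix-vector product by hand without touching `A.n`, `A.colptr`, or any other field. It is written for Julia 1.x, where these functions live in the `SparseArrays` standard library; the matrix and vector values are arbitrary.

```julia
using SparseArrays

# Small 3x3 example matrix built from (row, column, value) triplets.
A = sparse([1, 2, 3, 3], [1, 1, 2, 3], [10.0, 20.0, 30.0, 40.0])
x = [1.0, 2.0, 3.0]

m, n = size(A)       # dimensions via size, not A.m / A.n
rows = rowvals(A)    # row index of each stored entry
vals = nonzeros(A)   # value of each stored entry

# y = A * x, walking the stored entries column by column.
y = zeros(m)
for j in 1:n
    for k in nzrange(A, j)
        y[rows[k]] += vals[k] * x[j]
    end
end

println(y == A * x)
```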

@one-more-minute, you're a macro-ninja! Let's see how @ScottPJones likes it.

@KristofferC: While I understand this issue, in practice I find rarely an example where a concept is really unique to a type. m,n of SparseMatrixCSC are of course absolutely private because of size(A,1) and size(A,2). As Mauro answers right now the same holds for the other internals.

Nobody who is proposing enforced privacy seems to have made any suggestions
of which methods those fields are private from. Since methods don't belong
to types, which of a method's arguments is it allowed to access the private
fields of?

While I think some kind of enforced encapsulation could be good, I don't
think that we yet have a good enough sense of what kind of modularity model
the language needs. That may well become clear after the pre-compilation
and interface work settles in, but until then I think adding enforcement is
premature.

Documentation, on the other hand, is a good idea, of course.

OK, a lot of what I've said seems to have been twisted around quite a lot.

  1. I'm not wanting something that mandatorily does _anything_
  2. I want a system (like the deprecation system) that communicates to the user of a module/package
    that certain functions, types, or fields of types should not be used, they are not part of the API,
    they are considered private, and may change at any moment with no forewarning.
    They may also depend on internal state, etc. and should be considered unsafe as well.
  3. As I've said a number of times, this controls visibility at the _module_ level (I think @StefanKarpinski may have muddied the water a bit here, it has nothing to do with methods belonging to types), it is very simple, it is the method _definition_ in the same module as the type (or method) _definition_. If the module has 10 "private" types, any method defined in that module can access all 10 types. If the module has a public type, that has some private fields, then any method defined in that module can access those private fields, and can call any function or method that is "private" to that module.
  4. It would be OK if that system allowed one to bypass that explicitly, if people really feel that is necessary, but, like @one-more-minute 's prototype, it should be explicit so that they know they are breaking the contract. This would be analogous to being able to call foo(), or having to call YourModule.foo().
  5. Julia already has the notion of different levels of access, in that there are inner and outer constructors for types.
  6. Julia already has the notion of different levels of visibility, with things being exported or not from a module. This would just add another level of visibility, i.e. not accessible from outside the module (at least, without an explicit override a la @one-more-minute)
  7. @StefanKarpinski's comments about also thinking about read/write for fields are very good; that is similar to this. It would be kind of like adding traits to each field, :readonly, :private, etc.
  8. @one-more-minute's prototype is a good start, but I'd definitely not use anything that required explicitly decorating every access _within_ the module. Just like an unexported name can be accessed directly within a module, but needs Module.Name outside, "private" names should be freely accessible within their module. (OT, I sure wish I had his (and Steven Johnson's) mad metaprogramming skills in Julia!)
  9. I like the _concept_ of #1974, but really dislike the idea to use .. to access "real" fields.
    (it's too useful elsewhere, and means other things like ranges in other languages, which would be confusable, what about .!, for example? A ! grabs your attention, helps point out visually that it is different from .. (which I also think is prone to accidents, one vs. two little dots))

At least I feel that some of the core team have come around to the idea that some form of enforced (but not too rigidly, I'm not asking for that either!) encapsulation in Julia could be a good thing, and I definitely agree, documentation is always a wonderful thing!

My question doesn't muddy the waters at all. What you're proposing is a way of defining which methods are associated with types – your answer is that methods and types that are defined in the same module have a special relationship. Currently there is no such notion – modules are just "objects" where bindings to other objects live. Adding some notion of belonging is certainly possible, but not to be done lightly.

I think there already is such a notion, in that if you are in a module, you can reference unexported names directly, foo(), whereas outside the module, you need to use Module.foo().
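The existing export mechanism referred to here looks like this in practice. The `Shapes` module and its functions are hypothetical; the sketch is written for Julia 1.x.

```julia
module Shapes

export area

# Exported: part of the advertised interface.
area(r) = pi * square(r)

# Not exported: callable inside the module without qualification.
square(x) = x * x

end # module

using .Shapes

println(area(2.0))           # exported name, usable directly
println(Shapes.square(3.0))  # unexported name needs the module prefix
```

Nothing stops outside code from calling `Shapes.square`; the qualification only makes it visible that the caller is reaching past the exported interface.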

I'm not talking about adding any new notion of visibility, just extending what is already present in julia: instead of just "exported vs. public but not exported", it would be "exported, public but not exported, private to module".

I'm interested in the discussion of whether fields are considered private (also discussed in #7561, and at length in #1974). I think that discussion is separate from merit of new language features to allow enforced encapsulation. I know the horse has left the barn on this but would it be valuable to create a separate issue for the new language features and keep this more focused on what we want to put in the documentation?

Re: privacy of fields:
I just re-read #1974 and here are my thoughts:

  1. Direct field access should generally be limited to methods implementing the interface
  2. A module's exported methods serve as the interface to that module's types
  3. There's a limited subset where exceptions are made to these rules. I'd be interested in defining this subset.

One advantage (brought up by @JeffBezanson) is you can naturally use higher-order programming constructs like map, (e.g. map(length, items)) which is more awkward if your interface is defined as fields. (e.g. [i.length for i in items])
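A small sketch of that difference, using a hypothetical `Clip` type that stores its length in a field and also extends the generic `length` function:

```julia
struct Clip
    length::Int                  # length stored as a plain field
end

# Method-based interface: extend the generic function for this type.
Base.length(c::Clip) = c.length

items = [Clip(3), Clip(5)]

via_method = map(length, items)          # the function is passed as-is
via_field  = [c.length for c in items]   # the field has to be spelled out per element
```

Both forms give the same vector here; the point is only that the method composes with `map` and other higher-order functions without a wrapper closure.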

Long ago @StefanKarpinski wrote:

To play devil's advocate (with myself), should we write r.length or length(r)? Inconsistency between generic functions and methods are a problem that has afflicted Python, while Ruby committed fully to the r.length style.

I think now we've pretty firmly come down on length(r), but I wanted to note that in Python the syntax length(r) doesn't dispatch on the type of r, so it's much more tempting to do r.length. We don't have that problem. Dispatch in Julia is one of the most beautiful components of the language, and I think this is another place that it shines as a unifying principle.

@ssfrr Agree with everything you've said. Nicely put, and good point about the advantage of being able to use higher-order programming constructs.

If we would introduce a more explicit notion of public and private fields, I think that the module level would be the natural level. I really think there is a strong analogy to exported and unexported methods. Code in a module could have access to all fields of all types defined in the module. Then it could choose to export the ones it wanted to make public, and they would be public only to code importing from the module. If you really needed to access a private field without the system telling you not to, you could forcibly import that field from the appropriate module (or type?), or use getfield.

I think this setup would be a nice analogy with exported/non-exported getters and setters, and would allow to control the accessibility of a field in a more fine grained way than just public/private.

If you have a workaround to access "private" fields or methods, then it's really just punishing users when they need to use the workaround. When you use an underscored method outside the module you know you're doing "wrong"... but for whatever reason you need to do it. An additional hoop serves no purpose other than... ? spite?

If private is actually private, and it can't be accessed, period, that's different. But this has to be something that can't be tricked by temporarily overriding the module (or some other hack). Personally I'm with nolta.

But that assumes that you are going to put underscores in front of pretty much every field name, otherwise users will not know that they are doing something wrong. I personally would really like to avoid having to put underscores everywhere, I find it pretty ugly and it makes it seem like public fields are the default.

So you should rather take my suggestion as a suggestion for the only hoop that users would have to jump through, so that they know they were not supposed to access that field. But this is still speculative.

Now I'm lost. AFAICS, public fields are the default (and, by the way, the current situation), and we want to document non-public fields by adding an underscore?

@KristofferC I don't think there is much need to use the internals of sparse matrices. You should access n and m through size; for almost all uses, nzrange, rowvals and nonzeros should give you all you need.

Writing fast sparse algorithms pretty much always requires direct access and modification of the colptr and rowval fields.

With Julia's smart in-lining, is that really true, if you have methods to access those fields?
Also, a method can be dispatched on the type, as has been discussed previously.

It depends what you need to do. For many purposes nzrange is good enough, but there's no getter method directly for colptr at the moment.
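For reference, a sketch of what the accessor-based style looks like next to direct field access. It assumes a current Julia, where sparse matrices live in the `SparseArrays` standard library (at the time of this thread they were part of Base). The helper name `stored_entries` is made up for the example.

```julia
using SparseArrays

# 3x3 matrix with stored entries (1,1)=10, (3,1)=30 and (2,3)=20.
A = sparse([1, 3, 2], [1, 1, 3], [10.0, 30.0, 20.0], 3, 3)

# Walk the stored entries using only nzrange, rowvals and nonzeros.
function stored_entries(A::SparseMatrixCSC)
    rows = rowvals(A)            # row index of each stored entry
    vals = nonzeros(A)           # value of each stored entry
    out = Tuple{Int,Int,eltype(A)}[]
    for j in 1:size(A, 2)
        for k in nzrange(A, j)   # positions of column j's stored entries
            push!(out, (rows[k], j, vals[k]))
        end
    end
    return out
end

entries = stored_entries(A)

# Direct field access, as discussed above for colptr:
# column j's entries sit at positions colptr[j]:(colptr[j+1]-1).
ptr = A.colptr
```

Read-only traversal like this never needs the fields; it is structural modification (adding or removing stored entries in place) that pushes code toward `colptr`, `rowval` and `nzval` directly.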

Well, isn't there an easy solution to that?

I think the point is that for any type you might need to access the internals to do everything. However, for sparse matrices, I'd think 90% of the use-cases are covered with those functions.

I don't see the point in manually adding getter methods for every field of every type and changing all other code to use them for those cases where the data structure is an integral part of the API, and there's no way to make non-breaking changes anyway.

I imagine a grep of package code that uses sparse matrices would show that 90% guess to be overly optimistic.

I agree with @tkelman. When you have a data storage type, why would you want to write getters/setters which are just mydata(myobject) = myobject.data?

I do have a thought though. Is this necessary if name aliasing is allowed in #1974? If you wanted to mimic a private field you could just make a getter/setter return an error. Read-only fields would then be fields with only a get method added. You could still access the raw fields from .. or whatever operator is used for grabbing the raw value. A macro @private could be used to automatically generate the error methods.
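A rough sketch of that idea, written against the property overloading that later shipped as `Base.getproperty` (Julia 0.7 and newer), not against the `..` syntax discussed in #1974. Raw access goes through `getfield` here. The `Account` type, `PUBLIC_FIELDS` and `balance` are hypothetical names, and the error method is written by hand where the suggested `@private` macro would generate it.

```julia
struct Account
    owner::String
    balance::Float64             # meant to be private
end

# Fields that outside code may read with dot syntax (an illustrative choice).
const PUBLIC_FIELDS = (:owner,)

# Dot access is routed through here; anything not listed as public is refused.
function Base.getproperty(a::Account, name::Symbol)
    name in PUBLIC_FIELDS || error("field $name of Account is private")
    return getfield(a, name)     # getfield bypasses the overload
end

# Implementation code reaches the raw field with getfield.
balance(a::Account) = getfield(a, :balance)

acct = Account("ada", 12.5)
acct.owner                       # allowed
balance(acct)                    # allowed, goes through the accessor
# acct.balance                   # would throw an ErrorException
```

As noted elsewhere in the thread, this is a convention with a speed bump and not real enforcement: `getfield(acct, :balance)` still works from anywhere.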

For methods, modules already exist and using should be recommended as the default way to import methods. Public methods are then the ones exported and private are all the unexported methods.

@Mike43110 While I'm fine with the public API of a module being defined by what it exports, I disagree with using being the mechanism by which that is enforced. My reasoning is that I like the explicitness and control over my namespace that import provides over using, but then I can obviously import anything I like without any warnings.

I agree that there are definitely times where the fields really are integral to the API and it seems like extra boilerplate to add accessor methods. I've always liked the opening to python's PEP8 which makes it explicit that the style guidelines are for encouraging default behavior and consistency, but not to be followed at the expense of clarity and good design.

So far it seems like @tkelman's sparse matrix example is the most concrete counter-example. Can you point me/us to some of the code that exemplifies where accessing the fields is the better API? I think it would be great to have some more examples of when the "methods-over-fields" rule should be broken.

Maybe that 90% number was reasonably accurate after all, since most quick searches for uses of the colptr field out in packages do indeed look like they could get away with using nzrange if they were targeting 0.4-only. Though there are a lot of times in personal code where I want to do things like append or delete or modify a column in-place, that require pushing or popping to colptr and appending/deleting from rowval and nzval. Sometimes you just don't have all that much abstraction over the underlying data structure, if you want to write high-performance code.

Where exactly is the issue? Don't you get rowval and nzval from the respective getter functions? It seems to me that the only thing missing is a getter for colptr. This entire example is a little tricky, though, because these functions can be used to bring a sparse matrix into an inconsistent state.

The Image type from Images.jl is a similar example: it enhances an ordinary array with additional information. There is, however, never a need to directly access a member of the Image type. Instead one calls data(im) and thereby obtains the underlying array.

The issue is adding getter functions seems like a pointless exercise if they'll only ever be used for a single type.

You may call it pointless, but it's common practice to decouple the interface from the implementation. The Image implementation shows that there are people who care about this distinction.

But this is all a little subjective. One can also have public member variables in C++/Java. It's simply discouraged to do so.

One could follow the Python/C# property business in order to have clean public properties. I am just not sure if it's worth the complexity.

My argument is it's not always necessary and it doesn't always buy you anything except polluting your namespace. You aren't really decoupling the interface from the implementation if the getters are all trivial - you've just moved any potential renaming breakage from the field name to the getter name.

We don't have to argue about this point. It is very subjective and I am not a fan of enforcing anything in that direction.

The suggestion that fields are private is for people very used to OO. In Java/C# it is absolutely standard to program against interfaces and these don't expose fields.

(But again, the sparse matrix type is tricky because IMHO rowval and colptr are not really public, since you can render the object into an inconsistent state by direct access.)

If it were impossible to do anything risky, we'd end up in a situation like Matlab where you have to jump through hoops like writing a mex file to actually touch the data structures you sometimes need to work with.

I'm less convinced about safety and privacy on a language level as the discussion goes on. I just don't think the large-system, strong static safety guarantees design space is something that Julia will ever be perfectly suited for. To do that right it needs to be a design consideration from the very start, and it leads you to something that looks a lot like Ada or Rust.

We should probably recommend avoiding direct field access and using accessor functions in general, but still consider it OK for some types to define an official fields API and tell users to use them directly. Indeed, for most types you shouldn't play with the internal representation, but for some, like sparse matrices, the code sometimes needs to be tightly related with the representation to be efficient. There would be little benefit in making it use accessors, as it only makes sense when applied to this particular representation of sparse matrices.

But I still think it's worth recommending avoiding direct field access in most cases, i.e. "in absence of strong reasons against it, provide users with accessors; only require them to directly access fields for operations that are necessarily dependent on the internal representation of the object".

With getfield overloading, access to fields using the dot notation is no longer mixing interface and implementation. The implementation can change in backwards compatible ways by simply overloading the field. If you want to add some error checking later, just overload the field and add the error check before returning the field. The nice thing is that you _a priori_ do not need to write getters and setters. They are added only if they are later deemed necessary, and adding them is backwards compatible. Python seems to do perfectly fine using this paradigm. Not that we should strive to be Python, but having to add six getters and setters for something that is basically a storage type because "fields are private" is soul crushing.
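A sketch of such a backwards-compatible change, assuming the overloading mechanism that eventually landed as `Base.getproperty` (Julia 0.7 and newer). The `Disk` type is hypothetical: suppose an earlier version stored a `radius` field, and the implementation now stores `diameter` instead.

```julia
struct Disk
    diameter::Float64            # new internal representation
end

# Keep the old public name working: `d.radius` is now computed.
function Base.getproperty(d::Disk, name::Symbol)
    if name === :radius
        return getfield(d, :diameter) / 2
    end
    return getfield(d, name)     # every other name maps to the real field
end

d = Disk(4.0)
old_style = d.radius             # existing user code keeps working
new_style = d.diameter
```

No getter had to exist up front; the overload is added only at the point where the representation changes.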

@KristofferC: Julia does not allow field overloading, so this is just speculation. Properties are what seem to block #1974, because they would mix size(x) and x.size (see Jeff's comments in that thread).

I don't think anything is blocking #1974 other than implementation effort of properly finishing #5848. It will happen, though possibly not in time for 0.4, we'll have to see.

If this is so clear then this issue and #12083 are obsolete and can be closed.

I don't think that dot operator overloading resolves this entire issue. When I write code, I establish an interface (through documentation, and possibly aided by appropriate language features). That is my contract with the user. I want to make it as small as possible so that I am free to change the implementation afterwards.

With dot operator overloading, it will be a lot safer to establish an interface that includes field access. But for my implementation, I should still be able to use fields that are private (in the sense that I am free to change their meaning or remove them), without jumping through hoops such as creating dummy getfield overloads that throw an error when trying to access them.

@toivoh: This is all correct and fine. But until these things have landed, it does not make sense to document the privacy of fields, as it simply will not hold anymore after #1974. This issue (as the title indicates) is about privacy of fields.

In case it helps: the following Discourse comment provides an example of how to create private properties for a type.
