Rfcs: Crates should allow private impl of external traits for external structs

Created on 30 Nov 2014  ·  88Comments  ·  Source: rust-lang/rfcs

Issue by alanfalloon
_Thursday Apr 24, 2014 at 02:31 GMT_

_For earlier discussion, see https://github.com/rust-lang/rust/issues/13721_

_This issue was labelled with: in the Rust repository_


As you know, you can't provide an impl for a trait if neither the type nor the trait are defined in the current crate, you get:

error: cannot provide an extension implementation where both trait and type are not defined in this crate

For public implementations, this makes perfect sense, there are issues with collisions between implementations and sheer surprise. However, for a private implementation of the trait, this is a real hindrance.

Consider the case of std::path::Path not implementing std::fmt::Show. The rationale for closing #13009 is perfectly valid: we don't want people treating paths as strings. However, by refusing to add an implementation of Show, you have made that decision for all programs everywhere. In my case, I had a large struct with numerous Path elements that I wanted to print for debugging, but #[deriving(Show)] won't work because Path has no Show impl, and now I'm stuck either implementing it tediously from scratch, or switching to str for my path names.

The perfect compromise would have been to allow my crate to define a private Show impl for Path. There is precedent for this in other languages, go allows interfaces to implemented anywhere, for example.

T-lang

Most helpful comment

+10000
the promise of a more open-world environment is what attracted me to Rust, and when I discovered these restrictions I began to lose interest.. a future C++ with UFCS and modules will satisfy me more.

Changes like this would draw me back to the language.

All 88 comments

I'm very much in favor of this. It should be possible to extend existing types without having to wrap them in a unit struct, or define a new trait where it doesn't make sense to (e.g. adding a single helper method).

I agree that the current restrictions enforcing coherence (essentially global uniqueness for impl) are a hindrance. This is because coherence is fundamentally anti-modular.

Having globally unique instances is something of a devil's bargain: convenient at a small, local scale, but leads to a number of problems in the long run, seriously limiting how the language can evolve. This is the situation Haskell is in now, having embraced coherence: see here, and here. I'm a little concerned Rust is making a similarly unfortunate (and unnecessary) choice…

Distinguishing the _definition_ of an impl from _canonicalization_ of an impl would be a big step forward here. Explicit canonicalization would allow the programmer to specify in a more principled fashion which instances to use (from an external crate or local), without resorting to wrapper structs.

Aside from helping to restore some modularity now and retain flexibility going forward, explicit canonicalization opens to the door to more general and more powerful mechanisms for implementation resolution. For example, see the work on explicit unification hints here subsuming type classes, canonical structures, etc. Another well-known but somewhat less general approach using ML-style modules is modular type classes.

On the other side of this debate, of course, is Edward Kmett. He was going to give a talk about this, but it got cancelled - I hope it'll be rescheduled at some point!

@glaebhoerl I want to point out that I'm not advocating "implicits" or "ML-modules" per-se as an alternative to type-classes, so in that sense I don't think what I'm proposing is on the "other side" at all, but something else entirely. I would advocate an approach that does not require global uniqueness but would still allow the current behavior to more or less be recovered in a controlled fashion and as a special case without sacrificing modularity. I don't think Kmett addresses this possibility.

Having said that, I think Kmett has some great ideas and has done some really amazing things with Haskell but where coherence is concerned, I don't find his arguments convincing. To me, it basically boils down to "coherence is really convenient, and besides look at all we've managed to accomplish with Haskell type classes".

What I find lacking is a principled response to the alternatives, e.g., canonical structures, unification hints or other more general mechanisms, or the differences in the way type-classes are implemented in various theorem provers. Part of the reason for this is that it's difficult to compare the alternatives in a practical sense given differences in languages, libraries, and other factors. But that doesn't make the argument in favor of Haskell's approach any more convincing.

I also feel such arguments ignore most of the collateral complexity of embracing global uniqueness and trying to work around the consequences elsewhere: several random extensions relating to type-classes, issues with the proposal for the module system (backpack), generalized newtype deriving (with all its quirks), roles, etc. Global uniqueness only simplifies some things from a particular perspective. As a general language design principle, it's hard to justify (especially in retrospect!) and I believe definitely a case of "worse is better".

There are very real and well known pain points associated with embracing coherence, as the op points out. I guarantee these will only become more of a thorn in the side of Rust users as the language grows. It would be a shame for Rust to simply adopt the status quo on this rather than reconsider some of the decisions Haskell has made while it's still within the realm of possibility.

Furthermore, it seems to me (and I could be off base), that one of the principles behind Rust is to avoid making things implicit when having them explicit offers greater flexibility and control. The current approach ties two different concepts together, definition and canonicalization (globally), where the latter is implicit and doesn't need to be.

+10000
the promise of a more open-world environment is what attracted me to Rust, and when I discovered these restrictions I began to lose interest.. a future C++ with UFCS and modules will satisfy me more.

Changes like this would draw me back to the language.

@darinmorrison couldn't agree with you more. Need to look at Edward's comments more, but I am not sure where he could have gotten that opinion. By separating cannonicalization and definition, one can fine tune the coherence rules to exactly what they think they should be (which seems to very from person to person) without sacrificing the expressiveness of the language. Does anything more need to be said?

(As an aside, I started writing an RFC https://github.com/Ericson2314/rfcs/blob/master/active/0000-combine-mod-trait.md, which _would_ combine traits and modules, around the time of the associated items RFC. I was told that this was too much to do for 1.0 -- even more obvious at this point in time than then. To reiterate, I do see plenty of advantages to separating cannonicalization and definition, even if traits and modules are kept separate.)

@Ericson2314 interesting… I remember seeing comments about ML style modules somewhere but didn't realize anyone had started writing up an RFC. Even aside from this issue, it would be pretty awesome to have them in Rust if it were feasible. (Higher-rank polymorphism and existentials would also be great with HKTs but I agree it seems tricky with the monomorphization.) I would be willing to help with an implementation effort somehow. /cc @jonsterling

You can get a correct local version of the reasoning that Edward Kmett takes as the main advantage of type classes with the Modular Type Classes approach, albeit with some changes to the version described in the paper.

The original paper models type classes using ML modules with _generative_ functors (meaning that a functor applied to the same arguments gives fresh abstract type components in the resulting module) rather than _applicative_ functors (meaning that each distinct application gives identical abstract types in the resulting module). This implies that instance functors are required to be fully transparent, with no abstract type components. If they were modeled with applicative functors instead, you could use ordinary type abstraction to enforce Edward Kmett's desired guarantees from type classes locally.

An email to the TYPES mailing list and some slides explain this in a bit more detail. It's unfortunate that this has never appeared in a real programming language, since it really seems like the correct way to combine modules and type classes.

There is another issue that comes up with modeling type classes using ML modules that wouldn't really be a problem for Rust. Haskell type classes allow you to write programs that have an unbounded number of instances at runtime:

f :: Eq a => Int -> a -> Bool
f 0 x = x == x
f n x = f (n-1) [x]

main = do n <- getLine; print (f (read n) ())

Encoding this example with ML modules would require first-class modules, but with Rust you already encounter ad-hoc failure during monomorphization for examples like this.

[Edit: @jonsterling pointed out that I swapped generative and applicative in describing the original MTC paper, even though it was clear what I meant from later context.]

@zwarich Interesting! Furthermore, Since Rust doesn't allow impure consts / statics at all, I don't think we'd gain anything from generative functors.

My draft RFC proposes much of the functionality discussed in the slide show (arguments on functions being lifted onto the projection, etc), but I only hoped it would work out. Nice to learn from the slide show that it actually does (with pure applicative functors)!

@Ericson2314 If you want to see the semantics for an ML module system with both generative and applicative functors worked out in detail, the F-ing Modules paper by Rossberg, Russo, and Dreyer is interesting. They express all module system features by translation into System F_omega. If you want recursive modules with type abstraction, then you have to extend System F_omega to something like Dreyer's RTG.

Might be good to see if we can get @robertharper to weigh in on this, particularly on the question of whether or not generative functors make sense in a Rust-like language.

(Bob, if you're too busy and don't want to receive notifications on this thread, there's an unsubscribe button on the right side of the screen that you can hit.)

I haven't a clue about the context, but if someone wishes to summarize I could comment or make suggestions. I see that Modular Type Classes are being discussed, which I think are a good idea, of course, and cleaner and more general than the over-evolved mess in Haskell. In particular instance declarations in Haskell are wrong-headed; one should separate the instance itself, which is a functor, from its activation for use during type inference. And then one should be able to instantiate explicitly rather than always rely on the inference mechanism to trip over the thing you want, another problem area. In my view modeling type classes as run-time records is wrong-headed; instances are static, because they are inferred during elaboration. If you wan to pass actual run-time generated records or existentials, please do so, but that's not what type classes are for.

It may be helpful to look at our old paper on separate compilation for Standard ML. We thought long and hard about this, but no one paid any attention afaict, for largely "meta" reasons of the timing.

But maybe none of this is relevant to your discussion, so forgive me if not, and please fill me in if I can be of help.

@zwarich Can you elaborate on how type abstraction can be used to enforce Haskell-style “coherence” in certain cases? This is interesting.

@RobertHarper Hi Bob,

Thanks for popping in. I was wondering about the following:

In spite of the fact that Rust has more fine-grained structure than ML wrt mutability & references, I think that generative functors are probably crucial and shouldn't be ignored. It feels like applicative functors aren't actually that useful except when you consider them as enabling certain patterns of use with type classes as Derek's slides point out. Moreover, a module system with type classes can start off with only generative functors and be perfectly useful; then applicative functors can be added later to sort of patch over the areas that were slightly more difficult to use. Bob, do you have any comments or corrections to what I have said above?

hi john, not sure i understand "fine-grained srtucture" in rust, so it's hard for me to answer. i agree that applicative functors are a theoretical curiosity, and a bit of a hack in ocaml (they use an approximation of code equality, violating repind, to distinguish Set(X).t from Set(Y).t. There's a decent theory of it now, done by Russo, Dreyer, and Rossberg (in some order), but they are of marginal use in any language with state, certainly, and even without. The only argument for applicative functors was to have useful higher-order functors, but that never panned out. The only reasonable solution for higher-order functors is essentially what Tofte and MacQueen proposed a long time ago in the yucky Definition style; this may or may not be what was used by Dreyer, et al in their paper (I don't recall and don't have it to hand.)

BTW, I don't understand why generative functors interfere with type classes, can you explain? As long as you propagate that "type t = int", etc then the instance is going to be type-transparent, regardless of the functor being "generative".

BTW, I'm still not sure that I like the terminology "applicative vs generative functors", because it's not really about the functors, but about tracking the effects using type abstraction.

Bob

Re type classes and gen. functors, I think you are right, and I seem to recall that we had a similar conversation at one point where I raised a concern about generative functors and type classes, but I had forgotten it. Honestly, I'm not really sure of the point of having applicative functors in Rust...

I wouldn't know the designer's motives, of course, but I highly doubt there's any technical justification. Putting it the other way 'round, I'd like to hear the justification for it in Rust.

@RobertHarper Excusing the use of the terms 'generative' and 'applicative', doesn't it make more sense for instance functors in the Modular Type Classes approach to be applicative rather than generative? The paper uses generative functors, and this forces instance functors to be transparent, so that distinct applications of the functor that arise from elaboration produce compatible types. Using applicative functors instead would allow for transparent instance functors. That's what Dreyer argues in those slides I linked, and it makes sense to me. Am I missing something?

Type classes are by nature transparent; they cannot be opaque, because if they were, they'd be useless. (See PFPL for a discussion.) I don't see any advantage to applicative functors for anything, and not for type classes, unless I'm missing something. All that applicative functors do for you is to make abstract types more often equal than they would be without them. I don't see the relevance to type classes at all.

Bob, thank you for coming here.

First things first. Rust does not allow any effectful top-level (or impl-level, the closest thing to module-level) definitions. (Functions defined here can of course have effects, but the definition of function themselves is effect free.) Put differently, all static initialization is effect-free.

This leads me to believe that were functors added to the Rust of today, they would need to be effect-free, "generative" vs "applicative" semantics aside. Does anybody disagree?

No, that is irrelevant. Even in a totally pure functional language, you want only generative functors, because you do not want to confuse, say, posets ordered two different ways. The one-liner is that type abstraction is an effect. That's why existentials combine open with bind.

To reiterate, you never want applicative functors, and never ever these only.

Robert Harper
(from handheld)

On Dec 9, 2014, at 12:40, John Ericson [email protected] wrote:

Bob, thank you for coming here.

First things first. Rust does not allow any effectful top-level (or impl-level, the closest thing to module-level) definitions. (Functions defined here can of course have effects, but the definition of function themselves is effect free.) Put differently, all static initialization is effect free.

This leads me to believe that were functors added to the Rust of today, they would need to be effect-free, "generative" vs "applicative" semantics aside. Does anybody disagree?


Reply to this email directly or view it on GitHub.

Mmm sorry, by "they would need to be effect-free" I meant that generative functors in Rust would be able to do only what generative functors do in a "totally pure functional language", i.e. no incrementing of global counters or anything like that. My intention was not to squash all discussion of generative functors. I see that "effect-free" is really the wrong term for what I meant.

well, yes, there would be no storage effects, but there are nevertheless abstraction effects.

Robert Harper
(from handheld)

On Dec 9, 2014, at 13:06, John Ericson [email protected] wrote:

Mmm sorry, by "they would need to be effect-free" I meant that generative functors in Rust would be able to do only what generative functors do in a "totally pure functional language", i.e. no incrementing of global counters or anything like that. My intention was not to squash all discussion of generative functors. I see that "effect-free" is really the wrong term for what I meant.


Reply to this email directly or view it on GitHub.

Now that I have a keyboard .... Even if Rust is effect-free at initialization time (but then what does initialization consist of if not effects?), so-called generative functors nevertheless introduce a top-level effect, namely type abstraction. Actually, even structure declarations with a "sealing" effect introduce such effects, so it's not really about functors, it's about abstraction. The only reason the discussion becomes about functors is that it is of paramount importance to be able to distinguish statically between different instances of a functor, even if the type parts are the same, because the instances proxy for the difference of interpretation provided by the code attached to the type. So Posets of natural numbers ordered by LT should be distinguishable from those ordered by GT. This is not a matter of effects, notice, but in the presence of effects it becomes even more important, eg one wishes to distinguish statically between two different instances of a hash table, even if all the parameters about keys and values are the same. This example is about state, of course.

Historically, the invention of applicative functors was a dead-end attempt to have a useful notion of higher-order functor, which does not fall out easily from standard type theories of modules. (Nor is there a case for higher-order functors even being very useful.) Even with so-called applicative functors, you still don't get a very useful concept of higher-order modules.

Standard ML modules are a local optimum that cannot easily be improved upon without making very large changes and introducing considerable complexity. The main improvements that I can envision are modular type classes (mostly to keep up with the Peyton-Jones's, but they can be handy in limited circumstances), and my re-working of data types to better integrate with modules. The latter is definitely worth doing, for a host of good reasons, and is the most significant improvement that I know about to the SML module system.

Does Rust have data types in the sense of SML? If you're doing them like in SML or Haskell or O'Caml, then I claim you are doing them wrong, and I know how to do them right.

Bob, do you have a paper or any literature that summarizes your re-working of data types? If I recall correctly, the idea was to support something like this:

signature OPTION = data
  type 'a t
  con some : 'a -> 'a t
  con none : 'a t
end

As a result of the special declarations the signature also exposes pattern matching somehow. My questions: does this signature give rise to a default standard implementation structure Option : OPTION or would you write it yourself? Is the induction principle for 'a Option.t given in terms of the sum of all con declarations within Option's signature OPTION? How do data signatures/structures compose wrt. constructors and pattern matching?

the design is implicit in the harper-stone semantics given in the milner volume, and is sketched explicitly in my ml workshop talk, which is available on my web page. i haven’t taken it further, but between the two i feel confident that it can be made to work out. the main ideas are as follows:

  1. introduce a “data signature” of the kind you describe below.
  2. introduce “data structure” declarations of two forms
    2.a. data structure S : SIG, where SIG is a data signature, which gives the default implementation
    2.b. data structure S : SIG = struct … end, where SIG is a data signature, which allows you to implement it yourself in whatever way you like.

the tag “data” on “structure” makes ‘a S.t available for pattern matching. in 2a it also indicates what it means to have a default implementation (as a recursive type). when you implement it yourself, there is trouble if you use effects, because the pattern compiler re-orders things to optimize pattern matching. a modal distinction would be helpful here, so that pattern-matching implementations could be required to be pure.

there are details to work out that will no doubt complicate matters, such as mutually recursive data types, or gadt’s, which are a hack in my view anyway, so maybe i don’t care about that so much.

there are opportunities to combine this with modular type classes to get a general form of “deriving” in haskell (one that works more broadly), and for providing special syntax in a non-ad hoc manner.

bob

On Dec 9, 2014, at 16:55, Jonathan Sterling [email protected] wrote:

Bob, do you have a paper or any literature that summarizes your re-working of data types? If I recall correctly, the idea was to support something like this:

signature OPTION = data
type 'a t
con some : 'a -> 'a t
con none : 'a t
end
As a result of the special declarations the signature also exposes pattern matching somehow. My questions: does this signature give rise to a default standard implementation structure Option : OPTION or would you write it yourself? Is the induction principle for 'a Option.t given in terms of the sum of all con declarations within Option's signature OPTION? How do data signatures/structures compose wrt. constructors and pattern matching?


Reply to this email directly or view it on GitHub https://github.com/rust-lang/rfcs/issues/493#issuecomment-66365332.

(Sorry to butt in, via Twitter).
@RobertHarper wrote:

I don't see any advantage to applicative functors for anything, and not for type classes, unless I'm missing something.

To summarize @zwarich's point, Derek Dreyer agrees that the classical motivation for applicative functors are not interesting, and makes a different point in the links. That point is _not_ part of the paper on F-ing modules (by Rossberg, Russo, Dreyer) which you cited, and it doesn't seem to have been published.
Those links contain answers to that question, and to your other questions:

BTW, I don't understand why generative functors interfere with type classes, can you explain?
[...]
I wouldn't know the designer's motives, of course, but I highly doubt there's any technical justification. Putting it the other way 'round, I'd like to hear the justification for it in Rust.

A summary of Dreyer's point follows (as a separate, longer comment), in case it helps.
(EDIT: formatted quotes more accurately).

Dreyer's canonical example is the MkSet functor: it's a natural candidate for being an instance functor, but it's not transparent, so you need to apply it explicitly even to instances you make canonical — to which Dreyer comments:

This is quite cumbersome. Aren’t modular type classes supposed to apply your functors for you?

Here's MkSet from his slides:

signature ORD = sig type t; val cmp : t → t → bool end

signature SET = sig type t; type elem;
  val empty : t;
  val insert : elem → t → t …
end

functor MkSet (X : ORD)
  :> SET where type elem = X.t
  = struct ... end

To answer @RobertHarper's point:

Even in a totally pure functional language, you want only generative functors, because you do not want to confuse, say, posets ordered two different ways.

Dreyer agrees, but provides a different solution (which however does appear in the F-ing modules paper, at least in the journal version). If it weren't for decidability, you'd just want to compare the ordering functions with contextual equivalence. To approximate that, instead of adding stamps to the result of (generative) functor application (which IIRC is one way of understanding generative functors), you can attach those to value declarations (at least, "moving the stamp generation" is how I understand this, he doesn't say it), so that applicative functors will distinguish posets ordered in different ways. From the email:

In Chapter 2 of my thesis, I argued that the right answer to the
question [of comparing the code in functor arguments] is "contextual equivalence". [MkSet(IntLt).t] and [MkSet(IntGt).t]
are compatible types so long as IntLt and IntGt are contextually
equivalent. Of course, in general, contextual equivalence is
undecidable, so we must look for a conservative approximation. But
it's not hard to get a reasonable conservative approximation of it by
attaching "stamps" (i.e. hidden abstract types) to each value
declaration. If the language had real dependent types, that would be
even better, as Set(IntLt).t is really a dependent type. But let's
sidestep dependent types for now.

As he points out (in the slides), this only works for pure applicative functors.
And compared to Ocaml, these stamps are robust to eta-expansion. That is, if you define MyIntLt = IntLt, you still have MkSet(MyIntLt) = MkSet(IntLt) != MkSet(IntGt), unlike in Ocaml where MkSet(MyIntLt) != MkSet(IntLt) != MkSet(IntGt).

I think "semantically we'd want contextual equivalence, but can't have it" should be a compelling argument. I'm not sure I like this conservative approximation, but it's certainly less conservative than pure generative functors, and I don't see downsides — Dreyer claims it solves at least all the problems in Ocaml.

There's a decent theory of it now, done by Russo, Dreyer, and Rossberg (in some order), but they are of marginal use in any language with state, certainly, and even without.

By looking at Dreyer's page, that paper must be the already cited F-ing modules (journal version).

On terminology, they also write:

Hence, from here on, when talking about functors, we will use “applicative” interchangeably with “pure”, and “generative” interchangeably with “impure”. (In fact, the correspondence is so natural and intuitive that we are tempted to retire the “applicative” versus “generative” terminology altogether. For historic reasons, however, we will continue to use the traditional terms in the remainder of this article.)

To be extra-clear: I don't mean to be polemic, I'm sincerely interested in @RobertHarper and @jonsterling comments/arguments on Dreyer's design (and everybody else of course, but we seem to already buy Dreyer's points). I'm especially interested because I'd like to use similar designs in other contexts.

Response inserted below:

On Dec 14, 2014, at 09:20, Paolo G. Giarrusso [email protected] wrote:

Dreyer's canonical example is the MkSet functor: it's a natural candidate for being an instance functor, but it's not transparent, so you need to apply it explicitly even to instances you make canonical — to which Dreyer comments:

This is quite cumbersome. Aren’t modular type classes supposed to apply your functors for you?

Here's MkSet from his slides:

signature ORD = sig type t; val cmp : t → t → bool end

signature SET = sig type t; type elem;
val empty : t;
val insert : elem → t → t …
end

functor MkSet (X : ORD)
:> SET where type elem = X.t
= struct … end

Each instance of MkSet has to be given a name, after which the type of sets is fully transparent. I don’t see what’s wrong with that, given that one wishes to use type abstraction to track the interpretation of the argument type.

To answer @RobertHarper https://github.com/RobertHarper's point:

Even in a totally pure functional language, you want only generative functors, because you do not want to confuse, say, posets ordered two different ways.

Dreyer agrees, but provides a different solution (which however does appear in the F-ing modules paper, at least in the journal version). If it weren't for decidability, you'd just want to compare the ordering functions with contextual equivalence. To approximate that, instead of adding stamps to the result of (generative) functor application (which IIRC is one way of understanding generative functors), you can attach those to value declarations (at least, "moving the stamp generation" is how I understand this, he doesn't say it), so that applicative functors will distinguish posets ordered in different ways. From the email:

In Chapter 2 of my thesis, I argued that the right answer to the
question [of comparing the code in functor arguments] is "contextual equivalence". [MkSet(IntLt).t] and [MkSet(IntGt).t]
are compatible types so long as IntLt and IntGt are contextually
equivalent. Of course, in general, contextual equivalence is
undecidable, so we must look for a conservative approximation. But
it's not hard to get a reasonable conservative approximation of it by
attaching "stamps" (i.e. hidden abstract types) to each value
declaration. If the language had real dependent types, that would be
even better, as Set(IntLt).t is really a dependent type. But let's
sidestep dependent types for now.

It's not a matter of decidability, it’s a matter of comparing code, and of even having the code to compare. When you use type abstraction to track interpretations, you are choosing to impose pro-forma distinctions _in anticipation of future alterations_. Things can be equal now, but not later; the pro forma distinction ensures that future opportunities remain open.
As he points out (in the slides), this only works for pure applicative functors.
And compared to Ocaml, these stamps are robust to eta-expansion. That is, if you define MyIntLt = IntLt, you still have MkSet(MyIntLt) = MkSet(IntLt) != MkSet(IntGt), unlike in Ocaml where MkSet(MyIntLt) != MkSet(IntLt) != MkSet(IntGt).

I think "semantically we'd want contextual equivalence, but can't have it" should be a compelling argument. I'm not sure I like this conservative approximation, but it's certainly less conservative than pure generative functors, and I don't see downsides — Dreyer claims it solves at least all the problems in Ocaml.

There's a decent theory of it now, done by Russo, Dreyer, and Rossberg (in some order), but they are of marginal use in any language with state, certainly, and even without.

By looking at Dreyer's page https://mpi-sws.org/%7Edreyer/research.html#publications, that paper must be the already cited F-ing modules (journal version) https://mpi-sws.org/%7Edreyer/papers/f-ing/journal.pdf.

As mentioned, to me it’s not compelling; it’s not even right.
On terminology, they also write:

Hence, from here on, when talking about functors, we will use “applicative” interchangeably with “pure”, and “generative” interchangeably with “impure”. (In fact, the correspondence is so natural and intuitive that we are tempted to retire the “applicative” versus “generative” terminology altogether. For historic reasons, however, we will continue to use the traditional terms in the remainder of this article.)

It’s true, the terminology does not fit the problem; I’ve previously encouraged avoiding it.


Reply to this email directly or view it on GitHub https://github.com/rust-lang/rfcs/issues/493#issuecomment-66914616.

I think this means that your mind is made up, so there’s no further point in discussing? If so, that’s fine, but I’d like to be clear.

On Dec 14, 2014, at 09:28, Paolo G. Giarrusso [email protected] wrote:

To be extra-clear: I don't mean to be polemic, I'm sincerely interested in @RobertHarper https://github.com/RobertHarper and @jonsterling https://github.com/jonsterling comments/arguments on Dreyer's design (and everybody else of course, but we seem to already buy Dreyer's points). I'm especially interested because I'd like to use similar designs in other contexts.


Reply to this email directly or view it on GitHub https://github.com/rust-lang/rfcs/issues/493#issuecomment-66914850.

Thanks for your answer!

I think this means that your mind is made up, so there’s no further point in discussing?

Not at all, I meant the opposite: I was curious to hear about your comments! I hadn't seen a direct response to Dreyer's approach in _previous_ comments.

It's not a matter of decidability, it’s a matter of comparing code, and of even having the code to compare.

"Comparing code" would be an obvious technique, but too intensional for Dreyer's stated goal IIUC.

When you use type abstraction to track interpretations, you are choosing to impose pro-forma distinctions _in anticipation of future alterations_. Things can be equal now, but not later; the pro forma distinction ensures that future opportunities remain open.

That's an interesting point I had not thought about. Thanks for bringing it up!
But this sounds like a decision a programmer might want to make (similarly to sharing constraints). I imagine one might do that by some form of sealing — and I know how to do that in Scala via singleton types.

val IntLt: Ord { type T = Int }  = ...
val MyIntLt: Ord { type T = Int } = IntLt //b is equal to a but this is hidden in the interface; here there is an implicit upcast
val MyIntLt2: IntLt.type = IntLt //b is equal to a, and this is visible in the interface, because this is the type with only one member. This is useful to encode sharing constraints
val MyIntLt3 = IntLt //here, it was decided type inference avoids the singleton type `MyIntLt3: Ord { type T = Int }`

There, type inference decided to automatically seal different values to make them look distinct, though you can be explicit and force the sharing.

As usual with subtyping, there are cases where upcasting is good for information hiding and cases where it throws useful information. You seem to be arguing that copying a value should choose information hiding — but always?

I'm not sure what'd be a design for this in an ML module system (and in Rust). AFAICS from skimming F-ing modules, the stamps are turned into type-level components, and this might enable standard sealing to just work. But I don't see a discussion of this issue in that paper. I should look at Dreyer's thesis to see if there's an answer, but a first look suggests this will take some time.

Each instance of MkSet has to be given a name, after which the type of sets is fully transparent. I don’t see what’s wrong with that, ...

It can't be wrong in a strict sense — one can program in SML. It's just about saving programmer work if possible — and that's what the work on "modular type classes" seems about IIUC, though it seems you and Dreyer have different opinion here. This convenience should still allow different bindings to count as distincts (as you point out), at least where appropriate, and shouldn't compromise too much other desirable properties or introduce excessive extra complexities. Right now, it's not fully clear whether one can reuse standard sealing (which should be already there) or one needs another variant of it. I asked Derek Dreyer himself, but let's see if he has time.

... given that one wishes to use type abstraction to track the interpretation of the argument type.

I'm not sure I follow. I guess "track the interpretation of the argument type" refers to tracking the value components of the argument structure, but I'm confused on "type abstraction" — aren't we abstracting on modules? Maybe that shouldn't make a difference. And why does that justify having to instantiate MkSet explicitly rather than implicitly?

I would also add that Set is not suitable for being a type class transformer precisely because it generates an abstract type, not a type class. Type classes are a very weak mode of use of modules; one cannot expect much from them. See PFPL for a discussion of type classes vs type abstraction.

Bob

Robert Harper
(from handheld)

On Dec 15, 2014, at 00:28, Paolo G. Giarrusso [email protected] wrote:

Thanks for your answer!

I think this means that your mind is made up, so there’s no further point in discussing?

Not at all, I meant the opposite: I was curious to hear about your comments! I hadn't seen a direct response to Dreyer's approach in previous comments.

It's not a matter of decidability, it’s a matter of comparing code, and of even having the code to compare.

"Comparing code" would be an obvious technique, but too intensional for Dreyer's stated goal IIUC.

When you use type abstraction to track interpretations, you are choosing to impose pro-forma distinctions in anticipation of future alterations. Things can be equal now, but not later; the pro forma distinction ensures that future opportunities remain open.

That's an interesting point I had not thought about. Thanks for bringing it up!
But this sounds like a decision a programmer might want to make (similarly to sharing constraints). I imagine one might do that by some form of sealing — and I know how to do that in Scala via singleton types.

val IntLt: Ord { type T = Int } = ...
val MyIntLt: Ord { type T = Int } = IntLt //b is equal to a but this is hidden in the interface; here there is an implicit upcast
val MyIntLt2: IntLt.type = IntLt //b is equal to a, and this is visible in the interface, because this is the type with only one member. This is useful to encode sharing constraints
val MyIntLt3 = IntLt //here, it was decided type inference avoids the singleton type MyIntLt3: Ord { type T = Int }
There, type inference decided to automatically seal different values to make them look distinct, though you can be explicit and force the sharing.

As usual with subtyping, there are cases where upcasting is good for information hiding and cases where it throws useful information. You seem to be arguing that copying a value should choose information hiding — but always?

I'm not sure what'd be a design for this in an ML module system (and in Rust). AFAICS from skimming F-ing modules, the stamps are turned into type-level components, and this might enable standard sealing to just work. But I don't see a discussion of this issue in that paper. I should look at Dreyer's thesis to see if there's an answer, but a first look suggests this will take some time.

Each instance of MkSet has to be given a name, after which the type of sets is fully transparent. I don’t see what’s wrong with that, ...

It can't be wrong in a strict sense — one can program in SML. It's just about saving programmer work if possible — and that's what the work on "modular type classes" seems about IIUC, though it seems you and Dreyer have different opinion here. This convenience should still allow different bindings to count as distincts (as you point out), at least where appropriate, and shouldn't compromise too much other desirable properties or introduce excessive extra complexities. Right now, it's not fully clear whether one can reuse standard sealing (which should be already there) or one needs another variant of it. I asked Derek Dreyer himself, but let's see if he has time.

... given that one wishes to use type abstraction to track the interpretation of the argument type.

I'm not sure I follow. I guess "track the interpretation of the argument type" refers to tracking the value components of the argument structure, but I'm confused on "type abstraction" — aren't we abstracting on modules? Maybe that shouldn't make a difference. And why does that justify having to instantiate MkSet explicitly rather than implicitly?


Reply to this email directly or view it on GitHub.

@darinmorrison Just to be clear, I didn't mean to imply that I think @ekmett is right, and everyone else is not - only to note that when these discussions come around he is often there, considering it important to argue the other side. (Whether he _is_ on "the other side", whether there is one, or how many there even are, I'll leave for others to determine.) My perspective is only that I regularly see much smarter and more knowledgeable people than I debating this topic, and one point of view seemed to be going under-represented here.

That said, what I _do_ think would be important, whichever way we go, is that we don't just toss out the benefits of coherence without at least getting something similarly valuable (an ML-like module system?) in exchange. ("The restrictions are annoying, let's just get rid of them", without thinking more deeply about why they exist, as seemed to be going on in the other thread - _that_ would be a case of worse is better.) How we can have our cake and eat it too is an interesting question (the papers you mention have been sitting on my drive for some time now, (still) waiting for me to read them), but at the very least we should try to avoid the "neither" case.

That said, what I do think would be important, whichever way we go, is that we don't just toss out the benefits of coherence without at least getting something similarly valuable (an ML-like module system?) in exchange. ("The restrictions are annoying, let's just get rid of them", without thinking more deeply about why they exist, as seemed to be going on in the other thread - that would be a case of worse is better.) How we can have our cake and eat it too is an interesting question (the papers you mention have been sitting on my drive for some time now, (still) waiting for me to read them), but at the very least we should try to avoid the "neither" case.

Oh, I absolutely agree with you on this point in case it wasn't clear. Just lifting the restrictions without changing anything else would be a big mistake I think. I only wanted to draw attention to why they are in place to begin with and spur discussion of alternative approaches.

@glaebhoerl, @darinmorrison : I hope we're not intruding — if this should become OT (or did already), I'm happy to refocus or stop.

@RobertHarper :

I would also add that Set is not suitable for being a type class transformer precisely because it generates an abstract type, not a type class.

I see — even with associated types, you lack translucency — the type of Set wouldn't be abstract. Right? That extends indeed to "modular type classes", where instance functors are restricted to be generative and transparent, but Dreyer's example seems to escape those limits.
(Disclaimer: I did not read all of PFPL; I hope I've found the relevant explanations).

Yes, that’s correct. It’s wrong-headed to expect too much from the puny type class mechanism, which has been abused to within an inch of its life in Haskell to save a bad idea.

Bob

On Dec 15, 2014, at 16:16, Paolo G. Giarrusso [email protected] wrote:

@glaebhoerl https://github.com/glaebhoerl, @darinmorrison https://github.com/darinmorrison : I hope we're not intruding — if this should become OT (or did already), I'm happy to refocus or stop.

@RobertHarper https://github.com/RobertHarper :

I would also add that Set is not suitable for being a type class transformer precisely because it generates an abstract type, not a type class.

I see — even with associated types, you lack translucency — the type of Set wouldn't be abstract. Right? That extends indeed to "modular type classes", where instance functors are restricted to be generative and transparent, but Dreyer's example seems to escape those limits.
(Disclaimer: I did not read all of PFPL; I hope I've found the relevant explanations).


Reply to this email directly or view it on GitHub https://github.com/rust-lang/rfcs/issues/493#issuecomment-67067304.

@glaebhoerl My guess (based on followup comment), is that Edward Kmett would be perfectly happy if Agda instance arguments had global, not local canonization. The point being, I don't see anybody who is against separating canonization and definition.

Wouldn't global (canonization, or well anything) bring back the anti-modularity problems, which as far as I can tell is the thing people are primarily objecting to? (Sorry - I really should read those papers.)

@glaebhoerl It definitely introduces an ergonomics--modularity tradeoff, but not an ergonomics--expressiveness tradeoff. This is because one can still route the instances by hand without canonization.

I think Edward makes a good case about the benefits of global coherence. So why not support _both_ local and global canonization? Then everybody gets what they want! Like I think I've mentioned before (not in this thread) I view the canonization scheme as a tactic that can be, and should be, developed separately from the core language.

Let's assume we'd want to have something like modular type classes or explicit unification hints in Rust. This at least appears to be a wide consensus among the participants of this discussion. I have to admit that I can't completely follow the technical details here, but I begin to see the merits of adopting this.

However, I think it's fairly safe to say that it is way too late to add this for 1.0. On the other hand it looks like it can be added backwards-compatibly as it strictly adds functionality. Please correct me if I'm wrong about that. That is at least as long as one sticks to a syntax consistent with the current one and global canonization is still included.

So perhaps it would be good to steer this discussion towards making a decision about what exactly the proposed system should be: as far as I understand, it's mostly about modular type classes vs. explicit unification hints and local vs. global coherence. And then one should probably work towards a concrete proposal. It seems that unification hints certainly require some thought about the syntax.

@aepsil0n Absolutely nothing is happening before 1.0 for sure. Adding non-canonical instances should be backwards compatible, and hopefully doable during 1.x. I'd argue combining modules and traits entirely is more elegant, so I'd hope maybe 2.0 could do that.

So perhaps it would be good to steer this discussion towards making a decision about what exactly the proposed system should be: as far as I understand, it's mostly about modular type classes vs. explicit unification hints and local vs. global coherence.

OK.

I think Edward makes a good case about the benefits of global coherence. So why not support both local and global canonization? Then everybody gets what they want!

I suspect things are more complicated. I can't fully comment on Edward's points, I'm still digesting them, but I suspect he'd not like this combination.
With global canonization you lose the guarantee of modularity, so not everybody gets what they want — unless you pick the restrictions of Haskell against orphan instances and so on. Of course, technically it's possible to have two different mechanisms for similar things and let people pick, but often the different mechanisms interact badly, so that you need to create two versions of the same code using the different mechanisms. For this general phenomenon, many programmers have technical/aesthetic objections to such designs, calling them "design by committee".
Here, for instance, what happens when you want to combine a globally canonical class and a locally canonical one? Do you end up creating a global version of the local class? Or is local/global decided per instance, so that one library might prevent me from writing my own local instance? All of this needs careful weighting.
As another example, Scala allows something like what you describe: an instance can be declared in some specific locations (which make it available globally) or imported in a specific scope (that can override the global one). This preserves modularity but gives up coherence.
(BTW, I hope I did not violate the CoC, which I appreciate — I tried to minimize judgements and stick to facts, since it's not my opinion that counts).
I guess after this I'll switch to lurking — feel free to ping me if you like!

Forgive me if this is just because of my sheer ignorance (I haven't been able to follow this discussion at all, or understand how any of it is relevant), but I'm having trouble understanding what's so complex about this issue. If the problem is naming conflicts, wouldn't the simplest solution be to disallow naming conflicts entirely? e.g.:

use std::collections::HashSet;

impl HashSet<String> {
    fn foo(&self) {} // allowed
    fn insert(&mut self, val : String) -> bool {} // disallowed - name/signature conflict
}

// disallowed, since Show is implemented for HashSet in another place
impl Show for HashSet<String> {
    // ...
}

fn main() {
    let wl1 = HashSet::new();
    wl.insert("hello".to_string());
    wl.foo(); // allowed

    let wl2 = HashSet::new();
    wl.insert("goodbye");
    wl.foo(); // disallowed, wrong type (&str instead of String)
}

Or do what Go does and make the type keyword actually define a new type which inherits its definitions, and allows you to define new impls on them? For example, I just see no reason that the following shouldn't work:

use std::collections::HashSet;

type MySet = HashSet<String>;

impl MySet {
    fn foo(&self) {}
}

impl Show for MySet {
    // ...
}

fn main() {
    let wl = MySet::new();
    wl.insert("hello".to_string());
    wl.foo();

    // use MySet's implementation of Show, instead of HashSet's
    println!("{}", wl);
}

What makes this issue so complex?

@dead10ck At the basic level, it's about being able to guarantee that all code interacting with the HashSet will use the same hash (and equality) function, or conversely, to rule out the possibility that different code will use different hash functions with the same HashSet and thus violate its structural invariants. Global coherence of trait impls (elsewhere called "type class instances") provides this. The restriction against orphan instances is also necessary to prevent trait impl conflicts between different crates at link time, unless you also add some kind of link-time check; and the open-world assumption also tends to get in the way of other fun things (e.g. negative trait bounds, to name one, and possibly also impl resolution being able to take constraints on the impls into account, which we've already added). Even in Haskell which does allow them, the cultural norm against orphan instances is intense, and recommended best practice is to essentially never use them - so banning them outright (in exchange for benefits) is at least a step toward a local optimum. And one of the reasons why it's thorny is that most of the clever ways you might think of to be able to have your cake and eat it too (such as "private impls", or manual impl import/export) turn out to not actually work, and to allow the coherence guarantees to be violated.

With respect to the more advanced levels, such as alternative approaches like modular type classes or unification hints, and why Edward Kmett isn't satisfied with them, I only have an extremely vague sense, and it's probably better to look to the papers (as I haven't yet) and to pay attention to other people in this thread. (And perhaps to hope along with me that Edward's once-planned talk will get rescheduled at some near point.)

As regards your second example, having a new type be both "the same type" ("inherits its definitions") and also "not the same type" as another at the same time is not as trivial as it may appear. GHC's solution to this use case is a newtype deriving feature, for which we've had a couple of proposals here already. (Module-scoped existential types which I think ML has would also be an answer to the "both the same and not the same" part, in some ways, but I suspect not to the "being able to define a different trait impl" aspect.)

@Blaisorblade
First of all, "Then everybody gets what they want!" was a bit tongue in cheek. Based on that comment of his, I'd assume Edward would approve of separate definition + global canonization. I make no claims to that he'd like definition + global canonization + local canonization. Sorry I was not clear.

Yes, technically nobody gets what you want, because you loose Edward's safe refactoring and modularity by having both kinds of canonization. I opine the practical situation is not so bad at all, but it is a matter of fact that the two canonization tactics don't (only their ramifications do.) Impls are canonized completely independently, and when impls are to be inferred, it must be specified which tactic is used. (One optional exception would be to make a global canonization imply a local canonization, or equivalently make local inference fall back on a global canonical impl if there is no local canonical impl. But this intertwining is just that -- optional.)

I should mention that Go does _not_ allow you to implement methods on types that you did not own. You get the error "cannot define new methods on non-local type int" (for example). What the Go documentation calls "defining a method on an integer" really means "defining a method on a _newtyped_ integer", which Rust readily lets you do. The Go type keyword is really "newtype with a bunch of implicit conversions".

Rust is in fact much _less_ restrictive than Go, as it has named typeclasses. So in Rust you can _actually_ define methods on integers as long as your crate defined the typeclass.

So perhaps it would be good to steer this discussion towards making a decision about what exactly the proposed system should be: as far as I understand, it's mostly about modular type classes vs. explicit unification hints and local vs. global coherence. And then one should probably work towards a concrete proposal. It seems that unification hints certainly require some thought about the syntax.

My vote would be to explore unification hints. They are a lower level mechanism and may be easier to implement than the alternatives. If they were supported in the compiler, we should be able to experiment with a variety of approaches to canonicalization even if the hinting mechanism itself is not directly exposed to the user. They would likely even be useful in other areas of the language.

Like @Ericson2314, I would very much like to see ML-style modules too but I think that may be a more ambitious project since it would affect more parts of the compiler and could probably be done separately.

FWIW it's a classic mistake to think you can bolt on modules later. The design should start with modules, then build out. I don't know what would be "unification hints" but they sound algorithm-dependent, which is guaranteed to wind you up in a colossal mess, far worse than Haskell. Global consistency sounds anti-modular, which to me is a non-starter. Learn from the failures of oop and other anti-modular designs. I'd suggest to stop thinking of type classes as any more than a mode of use of modules; they're a cute hack that are of some marginal convenience when placed in proper context. I would suggest it's a big mistake to expect too much of them. Distorting beautiful things such as equationally constrained type inference in a desperate bid to somehow make type classes work would be, imo, very ill-advised.

Robert Harper
(from handheld)

On Dec 17, 2014, at 15:21, Darin Morrison [email protected] wrote:

So perhaps it would be good to steer this discussion towards making a decision about what exactly the proposed system should be: as far as I understand, it's mostly about modular type classes vs. explicit unification hints and local vs. global coherence. And then one should probably work towards a concrete proposal. It seems that unification hints certainly require some thought about the syntax.

I think this is a good idea.

My vote would be to explore unification hints. They are a lower level mechanism and may be easier to implement than the alternatives. If they were supported in the compiler, we should be able to experiment with a variety of approaches to canonicalization even if the hinting mechanism itself is not directly exposed to the user. They would likely even be useful in other areas of the language.

Like @Ericson2314, I would very much like to see ML-style modules too but I think that would be a more ambitious project since it would affect more parts of the compiler and could probably be done separately.


Reply to this email directly or view it on GitHub.

@pcwalton

What the Go documentation calls "defining a method on an integer" really means "defining a method on a newtyped integer", which Rust readily lets you do.

It doesn't really, though. To continue with the example I posted earlier, if I define a newtype like:

struct MySet(HashSet<String>);

I can't call any of HashSet's methods on a MySet object; I'd have to destructure the HashSet out of it first, or manually implement Deref and DerefMut for Myset, which is tedious at best.

@RobertHarper

FWIW it's a classic mistake to think you can bolt on modules later. The design should start with modules, then build out.

First I just want to point out that I'm not a core developer (or even a contributor at this point), just a rust user hoping to see some improvements. So I don't speak for them or anything like that.

I do agree with you that starting with ML modules as a foundation is a better way but the situation (as I understand it) is that Rust is basically in feature freeze now since getting the 1.0 release out in the coming weeks/months is top priority.

Basically, it's too late for ML modules or any other significant change until that lands. Backward compatibility is assured (as far as I understand) until the next major release which would be 2.0. Correction, see @steveklabnik's comment below.

Currently, rust implements traits which are like Haskell's coherent type-classes. Like many of the folks in this thread, I feel this is not optimal and I think you and I and many other commenters broadly agree on the reasons why. The question is, given the situation now, what is the best way forward?

I don't know what would be "unification hints" but they sound algorithm-dependent, which is guaranteed to wind you up in a colossal mess, far worse than Haskell.

The unification hints I'm referring to are described in the paper Hints in Unification. They are implemented in Matita and I think there have been experiments in Coq. I don't know if anyone else has used them.

I think they might make sense for a number of reasons. First, they would provide a structured mechanism for separating definition from canonicalization. Current behavior, for those who want or need it, could be retained through appropriate hinting (which could be automated).

What I find more compelling is that they can be used to implement canonical structures and other approaches beyond "coherent" type-classes—essentially paving the way for something better without forcing a significant change all in one step which would be a harder sell at this stage. I believe they could facilitate an implementation of modular type-classes if the module system was developed.

Global consistency sounds anti-modular, which to me is a non-starter.

Well, this is already what we have with the current traits. Some controlled mechanism for providing separate canonicalization (unification hints would be one) would at least give us a means to move away from global consistency requirements without total chaos. I don't expect it would be perfect but hopefully it would be an improvement.

Basically, it's too late for ML modules or any other significant change until that lands. Backward compatibility is assured (as far as I understand) until the next major release which would be 2.0.

This is not true, there is and will continue to be backwards incompatible changes, see http://blog.rust-lang.org/2014/12/12/1.0-Timeline.html for more details.

At the actual release of 1.0, you'd be correct.

(This is not suggesting anything regarding changing the module system or not, just pointing out an inaccuracy)

@steveklabnik Ah, thanks. Good to know. I corrected the statement.

@dead10ck We could implement "generalized newtype deriving" or even "deriving(Deref)" to fix that. I am totally opposed to going the Go route where you don't have type aliases at all, but instead only newtypes and then a random set of ad-hoc coercion rules.

@dead10ck We could implement "generalized newtype deriving" or even "deriving(Deref)" to fix that. I am totally opposed to going the Go route where you don't have type aliases at all, but instead only newtypes and then a random set of ad-hoc coercion rules.

@pcwalton Definitely agree with you on the second part but implementing "generalized newtype deriving" may not be so simple. I haven't looked at the proposals for Rust, but that feature has been and continues to be a significant source of complexity in recent GHC developments.

@pcwalton I would second @darinmorrison's comment. Useful as "generalized newtype deriving" is in a setting which has already been afflicted with Haskell-style type classes, it has been the source of no end of complexity. Consider the unsoundness that occurred a while back, and then how it was fixed by adding the exceedingly complex "roles" feature. I would say that a lot of good engineering went into making this stuff work in Haskell, but we should be careful to distinguish between those features which had to be added to make previous design problems (IMO) either safe or usable—we should distinguish between those features, and those which Rust might benefit from emulating.

Just my 2c.

I just realized reading a blog post by @nikomatsakis that modular type classes have been considered 3 years ago for inclusion in Rust but have been discontinued.

However, I do not quite understand, how the problems mentioned inevitably lead to the global uniqueness of an impl we have today. This also does not appear to be the conclusion of the discussion there. Niko, would you care to join in on this discussion?

This proposal, to allow private impls but not public impls of traits/types you didn't create, seems completely sensible to me. In addition to allowing local convenience impls, it also simplifies local experimentation with trait impls before upstreaming those impls.

Does this proposal have any downsides? Does anyone have interest in moving it forward, with a concrete RFC? I'd be happy to help anyone interested in writing one.

I've only skimmed some of the above, so maybe I've missed something, but..

These local impls make adding a generic impl a violently breaking change, as local impls can later fail irreparably. You can avoid that breakage if local impls override imported impls, but doing so creates another breakage where bug fixes in crates do not always propagate to dependents. You could compromise by allowing use to exclude some impls, but that's complicated, and still allows both breakage scenarios.

A priori, I'd think wrapper structs sounds like the right approach, especially coupled with delegation of implementation.

These local impls make adding a generic impl a violently breaking change, as local impls can later fail irreparably. [...]

Thanks for pointing that out; semantic versioning guidelines would need updating. Could you elaborate on why that's actually a showstopper rather than a tradeoff to think about? Dropping a local instance when upgrading a dependency doesn't sound hard and it happens already in Haskell (if you resort to orphans). If instances are different then you need wrapper structs.
Or am I missing some Rust-specific reason?

On the other hand, disadvantages of wrapper structs and delegation of implementation have been discussed above in quite a few posts. In particular, if you go for wrapper structs, and ignore the needed boilerplate, you risk needing something like https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/coercible.pdf.

Moreover, delegation of implementation sounds related to "generalized newtype deriving", which in Haskell needed further fixes for a bunch of soundness issues in combination with other advanced features (IIRC, mostly GADTs and type families, which Rust might or might not have). I'm not sure any of those apply (yet) to your proposal though, and wrapper structs/delegation are useful beyond a hack for coherence.

I'd imagine you'll get a better answer from someone more experienced, but..

Anything that makes reading or reasoning about the code harder sounds dangerous. In particular, a security audit retains its value much longer if the impl used gets determined explicitly by the call site and cannot be changed by code elsewhere in the program.

There was considerable opposition to adding an impl becoming a breaking change in past discussions of negative trait bounds https://github.com/rust-lang/rfcs/pull/1658 although more careful forms made progress https://github.com/rust-lang/rfcs/pull/1672#issuecomment-260059962. Also, impl specialization only changes default methods https://github.com/rust-lang/rfcs/pull/1210 while the final impl must visibly exclude any methods it wants pulled from defaults https://github.com/rust-lang/rfcs/pull/1268.

As an aside, there is a .0 member to unwrap a Rust wrapper like struct Mine<T>(pub T), while afaik you need a uniquely named accessor in Haskell, ala data Mine T = Mine { getMine :: T }, so maybe you're happier just unwrapping explicitly when you want impls that apply only to T.

Thanks for pointing that out; semantic versioning guidelines would need updating. Could you elaborate on why that's actually a showstopper rather than a tradeoff to think about? Dropping a local instance when upgrading a dependency doesn't sound hard and it happens already in Haskell (if you resort to orphans). If instances are different then you need wrapper structs.
Or am I missing some Rust-specific reason?

You're assuming that the impl defined in the dependency you're upgrading isn't materially different from the impl you've defined locally. If it does produce a materially different output that breaks the behavior of your code. For example, maybe you've locally defined Add on Vec<T> as a concat operator, because you think its so silly that Rust doesn't have that sugar, and then in std after years of debate we decide on some mathy operation to be performed instead. If you delete your impl and upgrade, your code will be totally broken.

But of course we could just accept that adding impls of existing traits for existing types is a breaking change. This works, except that right now there are a bunch of impls we don't have yet in std for various reasons (eg. any impl on an array of length greater than 32). So maybe someday, when we do think we have all the impls we want, you'd think we could do this.

The "philosophical" reason against this, even at that point in the future, is that we're trying (with some frustrating exceptions involving Send and Sync) to provide this guarantee: __adding new items to your system is never a breaking change__. Whether that's through adding a dependency or upgrading one, as you keep introducing new well-formed items the program keeps compiling. "Information monotonically increases." It keeps the upgrade story very smooth and upholds Rust's promise of "stability without stagnation."

People often lump this property into coherence, but coherence is about what happens when you change the compiler underneath the code, not changing the code itself. I've want to name this property "concordance" but I can't bring myself to finish a blog post.

Thanks for your comment. When I said "if instances are different you need
wrapper structs" I should have said "if impls are different", because of
the kind of scenario you describe. So agreed there.

Regarding monotonicity: that's an interesting guarantee, and one that's
often broken for somewhat boring reasons (unqualified imports). That would
mean that local impls have to override global ones, among the last options,
though this raises other questions on where the overriding applies. I
haven't looked

Regarding bug fixes not propagating: that's a concern, but wrapper structs
don't help at all with this problem. If a global impl is added and you want
to replace a local impl with the global one, just drop the local impl. With
wrapper structs you also need to delegate the impl (and now your wrapper is
pointless).
In general, if local implementations are forbidden, clients will still
implement what they need in some other way, and it will be more work to

turn that into the client of a global impl.

Paolo Giarrusso
From smartphone, sorry for typos or excessive brevity

Il 13 dic 2016 11:42, "withoutboats" notifications@github.com ha scritto:

Thanks for pointing that out; semantic versioning guidelines would need
updating. Could you elaborate on why that's actually a showstopper rather
than a tradeoff to think about? Dropping a local instance when upgrading a
dependency doesn't sound hard and it happens already in Haskell (if you
resort to orphans). If instances are different then you need wrapper
structs.
Or am I missing some Rust-specific reason?

You're assuming that the impl defined in the dependency you're upgrading
isn't materially different from the impl you've defined locally. If it does
produce a materially different output that breaks the behavior of your
code. For example, maybe you've locally defined Add on Vec as a concat
operator, because you think its so silly that Rust doesn't have that sugar,
and then in std after years of debate we decide on some mathy operation to
be performed instead. If you delete your impl and upgrade, your code will
be totally broken.

But of course we could just accept that adding impls of existing traits
for existing types is a breaking change. This works, except that right now
there are a bunch of impls we don't have yet in std for various reasons
(eg. any impl on an array of length greater than 32). So maybe someday,
when we do think we have all the impls we want, you'd think we could do
this.

The "philosophical" reason against this, even at that point in the future,
is that we're trying (with some frustrating exceptions involving Send and
Sync) to provide this guarantee: adding new items to your system is
never a breaking change
. Whether that's through adding a dependency or
upgrading one, as you keep introducing new well-formed items the program
keeps compiling. "Information monotonically increases."

People often lump this property into coherence, but coherence is about
what happens when you change the compiler underneath the code, not changing
the code itself. I've want to name this property "concordance" but I can't
bring myself to finish a blog post.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/rust-lang/rfcs/issues/493#issuecomment-266705533, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AARsqI3BTUVYJFwYgAdM6IiXl-spWtYxks5rHnaCgaJpZM4DCEfZ
.

At least a wrapper structs makes the propagation of impls transparent. If you delegate then any relevant changes propagate for those traits. Any trait commonly needing local wrappers with tweaked impls can provide smartish default methods via #1210.

A person reading code carefully needs to identify the impl referenced by every call site. Anything that makes this less local makes (a) reading the code significantly harder, and (b) the value of that reading far more transient. If one can make a private impl for say From<T>, then it can override even the reflexive impl, so every call to into() becomes suspect, even in library code. As does every imported macro.

We worry not just about bugs here but hostile developers too. An attacker can find a change in an impl in a library, a call site in an application they wish to backdoor, and send a "refactoring" pull request that "accidentally" replaces the new impl with the old impl so as to create a vulnerability, but their pull can reference the old code from the library. And they can embed the hostile impl into a macro in yet another create.

Thanks for your comment. When I said "if instances are different you need wrapper structs" I should have said "if impls are different", because of the kind of scenario you describe. So agreed there.

But one doesn't know in advance if the impl you write will be different from the impl that doesn't exist yet. We're not going to require people to rewrite their code to use newtype structs in order to upgrade to a new version of Rust, that would definitely meet our definition of a breaking change.

That would mean that local impls have to override global ones, among the last options, though this raises other questions on where the overriding applies. I haven't looked

There are a few cases where name resolution is an issue (for example, adding a default method to a trait is this sort of breaking change). Since we have a canonical, fully qualified, unambiguous way to name any item, we could consider a breaking change of this sort (all you have to do is change the syntax you use to perform a function call).

In the abstract, would it make logical sense to say that:

  • if you depend on a specific version of an upstream library, then you are free to define impls whose legality depends on the fact that the upstream doesn't currently define any impl that would conflict with them;

  • on the other hand, if you depend on a minimum version of that library, then any impls you define must be compatible with any impls the upstream could potentially add? (i.e., @withoutboats's argument, and current Rust policy)

It feels like these things are connected, even if Rust doesn't currently let you make the choice. (Of course, it would probably violate separation of concerns between Cargo and Rust, which is why I'm asking "in the abstract".)

@glaebhoerl You mean orphan impls here?

Yes I think? Isn't that what the discussion's about?

Yes, I just woke up and I'm quite groggy.

If you want to depend on a specific version, then you can just "vender the dependency" to use Go speak. In that vein, you could maybe avoid violating separation between cargo and rustc by making the dependency live in your own source tree, and enforce the dependency on a specific version using git submodules.

As an aside, I do think Go's policy of expecting people to vender dependencies is kinda antithetical to free software, but this usage sounds reasonable. And git submodules stand out like a sore thumb too.

I probably missed something in the discussion, but if concern is that upstream might add own impl in a minor version, then:

1) This concern applies to any breaking changes in minor versions - e.g. you had local trait MyTrait in scope and implemented my_method for upstream type, but then upstream type defined own my_method in a minor version and your program doesn't compile anymore because of ambiguity. There is this fine line where we can only expect crate developers to be careful about semver, and not block new features on such concerns.

2) Even if we want to avoid that issue with impl specifically, why not just always prefer impl that is in local scope to any imported? Then calling trait methods would follow normal scoping rules and wouldn't come much as a surprise for end developer, and wouldn't break on upstream addition of impl.

@RReverser

On your first point, the key difference is that there is a fully explicit syntax you just need to transform to to fix the error - <_ as MyTrait>::my_method. However, even though its an 'allowed' change, I know this fact is considered when evaluating new APIs and is a reason people aren't quick to move methods from itertools to Iterator for example.

On your second point, that would violate the coherence property we're trying to maintain. It would be very bad to violate this property because of something called 'the HashTable problem'. For example:

mod foo {
    impl Hash for i32 { ... }

    fn f(mut table: HashMap<i32, &'static str>) {
        table.insert(0, "hello");
        ::bar::f(&table);
    }
}

mod bar {
    impl Hash for i32 { ... }

    fn f(table: &HashMap<i32, &'static str>) {
        assert_eq!(table.get(0), Some("hello"));
    }
}

You might get a different value out for the same key if you move the HashMap between modules/crates, because it will Hash with a different impl. This would be quite bad. And of course it applies to all kinds of traits, not only Hash. In general, when people write code they assume that if you call a method on the same value, you keep dispatching to the same place. Basically, methods ought to be type scoped, not module scoped.

In fact, there is a safer version of this that you can do in Rust @RReverser so long as all you want is convenience in your own crate. Just impl the trait for a wrapper struct like in this example. You can even derive traits for structs whose members do not impl the traits by wrapping those members too.

@burdges That's exactly the entire point we're trying to avoid with local derives (which is this issue about) - not having Wrap(x) and x.0.original_prop everywhere around the code just for the sake of passing it to compatible trait consumers.

For example, if upstream defines own implementation for u8, I don't really want to wrap my usages of u8 just to substitute that specific behavior - I just want to use my locally defined trait as I would with local functions vs imports.

You might get a different value out for the same key if you move the HashMap between modules/crates, because it will Hash with a different impl. This would be quite bad. And of course it applies to all kinds of traits, not only Hash. In general, when people write code they assume that if you call a method on the same value, you keep dispatching to the same place. Basically, methods ought to be type scoped, not module scoped.

@withoutboats I'm not sure this is a strong argument, this applies to any languages and to anything that can be named, not just traits - e.g. if I have same-named structure called IntNewType in two different types or same-named function write or using commonly named variable like i, it's obvious that moving it across files or even scopes might break the snippet.

Not really a trait issue in any way, just normal scope semantics.

You might get a different value out for the same key if you move the HashMap between modules/crates, because it will Hash with a different impl. This would be quite bad. And of course it applies to all kinds of traits, not only Hash. In general, when people write code they assume that if you call a method on the same value, you keep dispatching to the same place. Basically, methods ought to be type scoped, not module scoped.

@withoutboats I'm not sure this is a strong argument, this applies to any languages and to anything that can be named, not just traits - e.g. if I have same-named structure called IntNewType in two different types or same-named function write or using commonly named variable like i, it's obvious that moving it across files or even scopes might break the snippet.

Not really a trait issue in any way, just normal scope semantics.

@RReverser There's a misunderstanding about "move".
In the example above, in fact, a hashtable misbehaves just because it is created in one module and used in another one. Not because the code manipulating it is moved (though that would also cause problems).
Also, even if you're impl-using code across modules, the code you're moving never refers explicitly to the impl it uses, so it's harder to check if you're breaking things. Worse, if you write a polymorphic method baz that uses an impl for a type parameter, the impls can't be resolved right away—the client site that picks the type argument also specifies the impl to use, which might not be the one that was in scope in baz's definition.

In general, impls are just not lexically scoped.

In addition, you don't get the hashtable problem if names refer to different items in different scopes, you get a type error. e.g.

mod foo {
    struct Foo;
    fn foo() -> u32 {
        ::bar::bar(Foo)
    }
}

mod bar {
    pub struct Foo(u32);
    pub fn bar(foo: Foo) -> u32 {
        foo.0
    }
}

You don't get some undefined behavior hear, you just get a compiler error. And you can always call ::bar::bar in foo by using the fully qualified name to construct a ::bar::Foo.

But impls don't have names and aren't scoped. In order to scope them, you'd have to give them names, and then pass them around as well ass the types. In order for this to behave well it wouldn't be HashMap<K, V> but HashMap<K, V, H, E> where H is a Hash for K impl and E is an Eq for K impl, and whenever you take a hashmap, you have to specify which impls for those traits you're taking. It would be unmanageable.

In the example above, in fact, a hashtable misbehaves just because it is created in one module and used in another one. Not because the code manipulating it is moved (though that would also cause problems).

Oh right, thanks for explanation. Although not sure if that's a big problem? Exactly same applies to default fn implementations proposal after all where you can have one "default" implementation in module A, and own implementation in module B.

In general, impls are just not lexically scoped.

Yes, that's exactly what I'm suggesting to change.

In addition, you don't get the hashtable problem if names refer to different items in different scopes, you get a type error. e.g.

Not if it's a function or variable name or ...

E.g. when you have fn calc_hash in two modules with same types but different behavior, same concerns apply - that's why I'm saying this is not specific to traits and rather a normal scoping behavior, which we could apply to impls too.

@RReverser Specialization (default fn) is not lexically scoped, the impls form a module-independent ordering based on their types, as in Vec<u32> is more specific than Vec<T> is more specific than T.

Namespacing is not the same as what we're talking about. The important thing is that you can't pass something with one shape from one module to another just because they have the same name.

Specialization (default fn) is not lexically scoped, the impls form a module-independent ordering based on their types, as in Vec is more specific than Vec is more specific than T.

Ah fair enough.

The important thing is that you can't pass something with one shape from one module to another just because they have the same name.

@withoutboats Here you couldn't either, in both places object needs to implement same trait, even if specific implementations differ (just like with functions where you still must have same arguments&return types, but implementations don't matter).

That's not correct, the implementation is a part of the shape. HashMap only works if the Hash implementation for the key is the same every time its called.

What you're describing is not lexical scope but a very implicit kind of dynamic scope, because hashing calls inside of HashMap::insert or HashMap::get, which are not defined in these modules but inside the standard library, will call a different implementation of Hash depending on which module they themselves are called from.

HashMap can't call free functions from the module its called in unless you pass them in as arguments the same way it can with methods. Its totally different! Methods must be dispatched based on type and not scope, because they are passed around with the type.

Well I'm open to hear other suggestions.

Example: I want all [u8] in my specific file to implement Debug not as everywhere else, but to be displayed as strings. Obviously, I don't want to change all my definitions of [u8] to &str or some Newtype<u8>, as that will have avalanche effect on all the usages inside my module and by other dependencies.

Neither I want to implement custom Debug for each separate structure that uses u8 inside of my file, as that will require manually writing formatters for all the other fields too.

Note that this isn't specific to Debug - for example, serde has to implement custom attributes like serialize_with / deserialize_with on fields to workaround this problem too, and there are many other traits where it would be useful to override behavior for limited scope.

If speaking not about lexical scope, would it be possible, for example, to have custom trait implementations bound to owner struct?

Amazing this issue is still open after all these years. After searching around for a solution, this has led me here and I'd like to share my experience as a new Rust user. This problem keeps coming up again and again for me and it's so frustrating. The design choice seems backward in thinking to me. It made me want to give up on Rust because I'd get stuck for days trying to find a solution of which a lot are discussions of why this can't be done and the end is some sort of workaround. Rust as it is already has a high learning curve, why make people's lives more difficult? When learning a new language or anything for that matter, the experience should be smooth and the user should be able to keep progressing forward. This just stops dead the process. I hope there's a good solution in the pipelines to address this. Thanks.

@kintarowins as you can see, it's a difficult problem, and solving it the way being suggested (allowing private impls) is simply not possible. IMO, the most promising solution is to introduce a way to create a "strong type alias" - ie. a convenience for defining a local new-type wrapper around the foreign type and automatically inheriting some or all trait implementations from that type.

So far, nobody has submitted an RFC with a workable solution to "delegating" trait implementations in this way, but the language team have said on previous RFCs that this is something they would be interested in a general solution for.

Could someone concisely explain what the cost is for allowing private impls of traits?

My understanding is this:

  • If we allow private impls, then crates will be able to export conflicting definitions for the same pair of struct & trait, which can cause complex dependency issues.
  • If we allow private impls, then developers will be able accidentally couple themselves to a behaviour that isn't forwards compatible with the pairs of external structs & traits they are using.

Did I miss something else? Could someone help me understand why allowing private impls is "simply not possible"? (Said by @Diggsey - thanks for sharing this view btw). I don't disagree, I just want to make sure that I've really understood correctly.

Thanks,
Rob

I'm not exactly sure which suggestion Diggsey is responding to, as there are many ways to interpret "allow private impls" which are _possible_ (but still non-starters due to other massive downsides). What is true is that you can't "just" allow private impls without precisely specifying what happens in all the other crates. I'm guessing what Diggsey meant is that because of all the other constraints, the only plausible answer is "to other crates these impls effectively don't even exist", and then due to "the hashtable problem" discussed above (i.e. the fact that your trait methods may get invoked in someone else's crate) this wouldn't actually solve most of the reasons people think they want private impls or other orphan rule exceptions, because your private impl wouldn't even get used. See https://github.com/Ixrec/rust-orphan-rules for my best attempt to summarize all the constraints involved.

Semi-related, but we do have an RFC for delegation now, and it turned out a large chunk of delegation can be done entirely in proc macros (e.g. ambassador).

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Ericson2314 picture Ericson2314  ·  69Comments

steveklabnik picture steveklabnik  ·  69Comments

mqudsi picture mqudsi  ·  73Comments

rust-highfive picture rust-highfive  ·  75Comments

steveklabnik picture steveklabnik  ·  183Comments