Go: sort: make sorting easier, add Slice, SliceStable, SliceIsSorted, reflect.Swapper

Created on 16 Aug 2016  ·  158Comments  ·  Source: golang/go

EDIT: the proposal has changed later in this thread. See https://github.com/golang/go/issues/16721#issuecomment-241085266 and https://github.com/golang/go/issues/16721#issuecomment-241221703

I propose we make sorting slices a first class concept in Go 1.8.

I think everybody appreciates the cute sort.Interface type at this point, but the reality is that:

  • nobody really sorts anything except slices (the handful of exceptions can keep doing what they're doing)
  • the Len and Swap functions are redundant. They're always the same for slices.
  • naming a new type (e.g. "widgetsByName") is tedious to many (including me; I still loathe it after 6+ years)
  • making a new type and three methods at the top-level (because it has to have methods, so it can't be a local type) is more difficult to read, since I'd prefer my Less function inline with my code, in context.

I think we should add to package sort at least:

package sort

// Slice sorts the provided slice using the function less.
// The sort is not stable.
// If slice is not a slice, Slice panics.
func Slice(slice interface{}, less func(i, j int) bool)

And maybe at the same time, or in the future:

// SliceSorter returns an sorter for the provided slice, suitable
// for with Reverse or Stable.
// If slice is not a slice, SliceSorter panics.
func SliceSorter(slice interface{}, less func(i, j int) bool) Interface

The assumption is that the user's less function would close over the data to be sorted.

For speed, this would not be purely reflect-based (except for one call to to get the slice length). The implementation would generate a custom swapper as needed for the type of the slice elements such that it's compatible with the GC & write barriers. I did a prototype of this which shows that it's as fast as the typical way.

I think there's plenty of evidence in the Github BigQuery dataset that we're making sort way too hard for users.

/cc @griesemer @robpike @ianlancetaylor @adg @broady

FrozenDueToAge NeedsFix Proposal Proposal-Accepted

Most helpful comment

In my experience, sort doesn't show up all that often, and it's easy to use as is. So I am not fussed about it.

As far as the proposal goes, it bothers me to add another function signature that takes interface{} to the standard library. It allows people to argue that we like them, and force people to use them and create them in their own work.

I do agree that the proposed function will be easy to use, although I'd like to see the implementation and its performance before deciding whether its benefits outweigh its costs.

All 158 comments

I'd be ok with something like this. It's pragmatic, small, and probably would reduce a lot of sort-related boilerplate.

I think there's plenty of evidence in the Github BigQuery dataset that we're making sort way too hard to users.

I don't have access to the mentioned above dataset, but let me guess: The need-to-be-sorted type declarations and its associated methods declarations constitutes less than 1‰ of code in that corpus. Perhaps far less. If that is true then I don't think this proposal is justified. Also, using reflection in sorting, however small it is, feels to me like encouraging bad practices.

@cznic, you do have access: https://github.com/blog/2201-making-open-source-data-more-available

We'll have to disagree on the metrics at which this is subjectively worthwhile.

No opinion on this.

Only thing I would say is that, sorting a slice is not exactly to be seen as simply sorting an array.
Reason being that if a slice has multiple subslices, sorting one slice will somehow change the others (in a deep value kind of way).

Or should we decide that sorting a slice returns a completely new slice ? (and thus allocate ?)

Just a data point. (I guess you discussed a similar issue when adding the append function but, just to make sure).

As the primary force behind golang at our organization, this and no generic max/min function are the two things I find embarrassingly hard to explain. I'm not sure what it's like at Shoreline Blvd but without fail when an engineer joins our team their mouth drops through the floor at their first sort. Subjectively, I would love this in 1.8.

I'm fine with adding something to the sort package, although of course func (i, j int) bool is not the function signature that most people expect.

@ianlancetaylor, the benefit of i, j int is that it lets you get at your []T's T or *T, if the values are too large to efficiently copy. And it fits with the existing Interface.Less signature. Plus it's not like we can do better, can we?

Oh, sure, I understand why you suggested it. I'm just commenting that it will still surprise people.

As much as this is complained about and as many slice types as I have sorted this has never bothered me, so I don't see the point. Boilerplate is never fun, but this pittance is far below the threshold of annoyance.

A go generate command to spit out Len and Swap, or even just an editor macro, seems more inline with the Tao of Go here.

I'm with @jimmyfrasche here. Why can't this be a go generateable tool?

If it's basically as fast as sort.Inferface, and is just an addition to the stdlib, not a language change.... why would anyone ever say no to this? Yes, it's annoying that it has to use interface{} that we try not to encourage people to use... but what other choice do we have? No one has come up with a better suggestion so far.

Put me down for a +1.

How would SliceSorter work? Wouldn't it have to define a new type, and return it? That's not possible in Go afaik.

I'd prefer usage of a []interface{} parameter (which would make the reflection unnecessary), but since that doesn't intuitively work on all slice types I know we're stuck with the proposed signature. +1 for saving the headaches anyway though.

To the go generate comments: It's still possible to use that method, this is only adding to the stdlib. But for newbies and rapid development use, go generate can be just as much of a pain...

How would SliceSorter work?

It would be a private type implementing sort.Interface. It would return Interface, per the example above.

@pbnjay I don't think []interface{} would be very convenient; it's not possible to go from e.g. []*MyThing -> []interface{} without allocating a new slice and copying every element (which would be way more annoying than writing out the Less/Len/Swap).

It's a bit unfortunate that it has to be func(i, j int) bool though, because frequently you'd have MyThing.Less(other *MyThing) bool, which means that you'd end up doing:

sort.Slice(mySlice, func(i, j int) bool { return mySlice[i].Less(mySlice[j]) })

But oh well :). I'd still take that over declaring a new type+3 methods just to be able to sort stuff.

@riannucci yup - that's what i meant by "doesn't intuitively work on all slice types" - maybe in Go 2 we can get a generic slice-interface type!

@riannucci, in my experience, the Less method doesn't already exist somewhere. This is for ad-hoc sorts where you want to write the Less function inline, in context with your existing code:

   sort.Slice(people, func(i, j int) bool { return people[i].Name < people[j].Name })

How about func Slice(slice interface{}, l int, less func(i, j int) bool), where l is the length up to which the slice is sorted. This means the common usage would just be calling Slice with len([]T), but saves the reflect. I'd wager it's possibly not worth it?

Edit: Nevermind, my bad, there will still be reflection.

@freeformz, you already have to do the reflect a bit anyway, so it doesn't save you anything to pass the length explicitly.

Following on from @jimmyfrasche's comment (but my follow up may indeed be the subject of another issue/a separate go-nuts thread)

A go generate command to spit out Len and Swap, or even just an editor macro, seems more inline with the Tao of Go here.

On the go generate suggestion.

This raises the question of whether to distribute core, oft-used programs that are go generate-ers.

As things stand, every go generate-er needs to be retrieved via go get (or some other means). If instead some core go generate-ers were distributed as part of a release then:

  • no further go get is required to do core things (like slice sorting)
  • we encourage people towards go generate
  • we avoid any reflection (per @cznic)
  • ...

There are of course some downsides here:

  • how to name these core-generators (to be relatively sure of avoiding PATH clashes with existing executable programs, we might need rather verbose names like goGenSort)
  • another step in the process of learning Go
  • ...

@mackross +1 for the missing max/min but Go also lacks an absolute value function for integers, which is also hard to explain.

Shouldn’t there also be a stable counterpart? Maybe even with a variadic signature like

func SliceStable(slice interface{}, less ...func(i, j int) bool)

so that one could do

SliceStable(people, byName, byAge)

to sort people first by name and else (when they have the same name/are equivalent) by age?

This only simplifies a trivial unimportant amount of code so I don't see it as more than, at best, a minor annoyance, regardless of frequency.

I'm sure Brad's implementation is a negligible hit on performance compared to the current way, but it would have to be used in function bodies to close over the slice like:

func something() {
  //...
  sort.Slice(things, func(i, j int) bool) {
     //...
   })
  //...
}

so you have to pay the cost of allocating the closure and building the generic sorter on each invocation of something, when you don't have to pay anything defining the methods yourself, other than the usual indirections interfaces involve. That's going to be negligible to the cost of sorting anything for all but the smallest of n, and, if it shows up in profiling, it's simple enough to rewrite it the old way, but this sort of code is generally discouraged by the stdlib.

If this is going for pure convenience, I'd prefer the version @ianlancetaylor hinted at where the comparator is an interface containing a func(a, b T) bool (though the runtime hit from that would certainly be less negligible, even modulo potential copying overhead, and it's even further afield of encouraged practice).

Maybe I'm just uncomfortable that this is at a weird level between convenient and practical that results in this awkward situation where you have to close over a slice that's being permuted under the covers.

All that said, if this goes in, I'll use it—I'm lazy and hypocritical!

I don't have any problem with this existing—it's the inclusion in the stdlib that gives me pause. I have absolutely zero reservations about this being a go gettable library: people could even start using it today. The excellent go4.org or even golang.org/x seem like fine places to put this, and if it gets used a lot without issues moving it to the stdlib would be fine with me.

Taking a key func that's allowed to return one of a list of types ((u)int, []byte, string) makes it look a bit more like sort functions in other languages that take a key func, and would let you later drop in faster sorts specific to the key types without adding APIs. Lots of downsides make that unpalatable (keyFunc'd be an interface{} in the signature, trickier internally, overhead of Less calling keyFunc twice, and can't write a custom Less to sort by two fields or whatever). Maybe someone sees a better variation of that idea that I don't :) so throwing it out there anyhow.

@jimmyfrasche,

so you have to pay the cost of allocating the closure

Even today with sort.Sort you have to pay the cost of putting a slice into an interface value, since a slice (1 pointer & two ints) is not 0 or 1 pointers (which is all that goes into an interface allocation-free). So it's not significantly different.

I'd prefer to keep this issue focused on the proposal and not the implementation, though.

When I read the title, I thought you were going to propose something like a built-in:

var s []t
sort(s, func(a, b t) bool {
  return a < b
})

But this is certainly more pragmatic. Saying that 1% of programs touch sort isn't a very good argument against including this. This is scoped to the sort package, so programs that aren't sorting won't care about it. Even more reason to include it.

I'd prefer to tolerate something more essential and less ugly bloating the standard packages.

Can we teach compiler to handle both cases:

var s []int
sort(s, func(a, b int) bool { return a < b; })
var s []GiantType
sort(s, func(a, b *GiantType) bool { return a.prio < b.prio; })

Also not sure if it's worth worrying about copy mechanics. Perhaps compiler could elide copies somehow? Inline the closure? If it's a pure function, why not. And another point is that if you worry about copying, sort does a lot of swaps anyway, so maybe change data structure.

But anyways, I don't like the idea of having indices if we can do generics in the compiler itself. Most built-in functions act as generic functions.

Sorry for semicolons, I do a lot of javascript lately ("semicolon free" language).

Another thought: recognize simple known types (strings, ints, floats) and use default comparison operator (less than):

var s []int
sort(s)

That's kind of obvious though.

@bradfitz I mentioned that, but you're right that I should have edited that portion out. I was trying to convey that the performance hit, while negligible, would have to be paid more often by design, but I ended up hemming and hawing to the point of incoherence. I apologize. It's not a terribly important concern anyway.

Biggest issue:

Why should this be in the stdlib? You could throw it up on go4 right now and move it to the stdlib later. People who want to use it could start using it today.

@jimmyfrasche Something like it has been at https://github.com/bradfitz/slice for a while.

I really like this idea. I'm also one of those that have almost never written anything different for 2 of the 3 sort interface methods for slices. And also have found it a bit embarrassing having to explain to new Go devs how much boilerplate has to be written to sort a slice. Usually they then ask why they can't just call a sort function with a single key function, like in other languages.

I agree that because it's a localized addition to the sort package that only makes doing a thing more straightforward, that there is 100% benefit here. People can still use the existing approach or use the new shorter approach when it makes sense. Let's fix warts.

@twotwotwo, I don't recommend using github.com/bradfitz/slice. I never updated it to be compatible (safely) with Go 1.5's GC. But yes, essentially something like that. I should update it and move it to go4.org. Still, I believe this is common enough to warrant being in the standard library and would improve the readability of code.

@nsf, let's stay on topic. Nobody (except perhaps you) is proposing language changes. This bug is about a library change proposal only. If you want to discuss a hypothetical language change, the best place is probably golang-nuts@, since it's out of scope for Go 1.

@bradfitz How is it offtopic? I was implying that this is a bad idea and you guys should add a built-in function instead. It's a pretty good place where custom code generation is justified. And saying that a language change is out of scope for Go 1 is not serious, there were a number of language changes in 1.x releases (1.2 added three-index slices for example). Adding a built-in function won't break anyone's code.

@nsf, we can discuss on golang-nuts if you'd like.

Shouldn't the values in the slice implement a comparison method?
The signature looks a bit esoteric in the initial example.

If you need another data point I second what @nsf said.

If you want to add real first-class support to Go for sorting, create a builtin how append works so that we don't have to close over our slice that we already passed, to the same function, and access it again by index.

This proposal is better than the current sort interface but not compelling enough to make me rewrite all of our sorters to use this new func.

I'm in favor of this proposal -- it's a focused, pragmatic improvement to a common use case.

The way that the "less" function closes over the slice and takes indexes as arguments is similar to how an existing function in sort already works: Search.

In my experience, sort doesn't show up all that often, and it's easy to use as is. So I am not fussed about it.

As far as the proposal goes, it bothers me to add another function signature that takes interface{} to the standard library. It allows people to argue that we like them, and force people to use them and create them in their own work.

I do agree that the proposed function will be easy to use, although I'd like to see the implementation and its performance before deciding whether its benefits outweigh its costs.

@bradfitz,thank you for putting this together and thoughtfully answering our questions. If you don't mind, I have a few more to ask:

  • Would you be willing to share your prototype reference implementation? I promise to be respectful in looking at it with the lens of experimentation.

For speed, this would not be purely reflect-based (except for one call to to get the slice length). The implementation would generate a custom swapper as needed for the type of the slice elements such that it's compatible with the GC & write barriers. I did a prototype of this which shows that it's as fast as the typical way.

  • You propose mutating a user-provided interface{} function parameter that represents some slice. Unless I am mistaking, this would become the first and only place the standard library mutates a bare interface{} value. Is that correct, and does that matter? Reflecting on a data interface{} for reading/describing is well-defined in the standard library, however.

I am on the fence about the proposal for concerns of hand-wavy standard library purity but not doctrinally inflexible — FWIW. What extra burdens of proof (e.g., documented behavioral contract) would mutating data interface{} require?

  • What is the intended behavior if a user passes in a type backed by a non-slice reflect.Kind?

You propose mutating a user-provided interface{} function parameter that represents some slice. Unless I am mistaking, this would become the first and only place the standard library mutates a bare interface{} value.

False: https://golang.org/pkg/encoding/json/#Unmarshal

What is the intended behavior if a user passes in a type backed by a non-slice reflect.Kind?

It's documented in the initial comment in this post.

Thank you. json.Unmarshal was so blatantly obvious it was under my nose. :-)

A beauty of the existing implementation is the sort method doesn't have any understanding or care as to what 'thing' it is sorting as long as it implements the sort Interface, so I hope you keep it :). I also think that the proposal here, will be easier to implement for many, so I hope we get it, too.

I love that someone on the go team is giving the sorting boiler plate some attention, but I do not look forward to more code depending on the empty interface. One of the main selling points of go for me is type safety, please don't go full python.

@rajder, there could at least be a perfectly accurate go vet check for this. It's something.

@bradfitz Oh, that would be nice. A build error is still better imho :)

I'm apprehensive towards. Because the typical use case for the SliceSorter requires a closure, I think how to use the function will not be readily obvious to those unfamiliar with the pattern. I would be comfortable with the change if it case with a strong and simply example in the GoDoc. Were it to be without an associated example, I would not favor the change.

Go was my first language, and the language which taught me closures. I'm imagining other students coming in to the community from a Java or C mindset and being confused. I realize catering to these people is a non-goal, but most Gophers agree one of Go's strength is it's simplicity of design patterns.

I agree with all of @bradfitz's reasons for it's inclusion, however.

@RobbieMcKinstry I think closures are an important enough feature of the language that Gophers can be expected to know it. See net/http's HandleFunc for example. I do agree that this function should have an example, though. Ideally, many more things in the Go standard library would have examples.

Not sure this is the place for this discussion, but would more examples for standard library code be accepted in the general case? Or is the documentation considered "complete"?

@RobbieMcKinstry #16360 is the place.

@bradfitz So long as we're using reflection, why not go all the way and make the comparison function an interface as well? So you'd have a Go type of:
func Slice(slice, less interface{})
and an "effective" type of:
func Slice(slice []T, less func(T, T) bool)

You're doing reflection already, so it doesn't seem like much of an addition to verify everything at runtime. I suppose the one concern might be whether you could make it as performant? I'd have to think more about that one...

As for the issue of copying large values, I agree with previous comments that simply having less take pointers (i.e., func(*T, *T) bool) would solve the problem and would have two benefits: a) you'd avoid the clunky closure and, b) you could reuse a comparison function over and over again (including comparison functions exported for consumers of your package).

I have a sample implementation that I made a while ago that does this, albeit without the performance optimization of doing up-front per-type function creation (it just does the expensive reflection every time and then forgets the results): https://gist.github.com/joshlf/61e711195709427058ae0295b21ffba8

Go doesn't support dynamic type or generics yet. T doesn't provide any type information for current go. So reflect.TypeOf(sort.Slice) can't return consistent type. If go doesn't provide generics, T must be consistent type. BTW, I implemented similar package few month ago.

https://github.com/mattn/sorter

package main

import (
    "fmt"
    "sort"

    "github.com/mattn/sorter"
)

func main() {
    s := []string{
        "2",
        "3",
        "1",
    }

    sort.Sort(sorter.NewWrapperWith(
        s,
        func(i, j int) bool {
            return s[i] < s[j]
        },
    ))

    fmt.Println(s)
}

Having the func takes T as argument will incur a lot of copying
for large Ts. I think that's one of the reason it's using indices.

@mattn - What do you mean by "consistent"? I'm not sure I understand the problem you're explaining.

@minux - I agree, but I think having a *T argument is better - it involves as little copying as an index, and is much cleaner and means that the comparison function doesn't have to be aware of the data ahead of time.

@anacrolix - Yep, although you'd need to verify that the types matched up first (e.g., that the first argument is a slice type, that the second is a function, that it takes two arguments, etc).

To flesh out the go generate idea a bit more I put together sortGen (I'm sure others have done similar things before me, nothing immediately turned up in my searches):

https://github.com/myitcv/sorter/tree/master/cmd/sortGen

Simply define an order function:

func orderByName(persons []person, i, j int) sorter.Order {
    return persons[i].name < persons[j].name
}

then run go generate which will generate a function with the signature:

func sortByName(persons []person)

The example is more complete and gives details on the rules etc.

sortGen would fall into the category of core, oft-used generators that I was trying to define earlier that could be distributed with Go, thereby obviating the need for an additional go get

I also dislike that we need top-level types to use sorting. I like the suggestion to eliminate this. Like @robpike I don't think that it happens often enough that I'm terribly concerned about the pure typing effort.

I dislike this particular proposal, because a) it suggests that using interface{} to circumvent the type system is a good way to implement generic functions, which will lead to people doing it much more than before and b) it uses "magic" that is approximately only available to the stdlib to do that, so it excludes people wanting to imitate this methodology. I'd consider sort to be a "normal" stdlib package (as opposed to, e.g. syscall, runtime or os/exec), one that shouldn't use tricks to work with the runtime and should be possible to write as a stand-alone package. I don't thinks that's feasible with this proposal, because as you pointed out to get similar performance you need to cooperate with the GC, which is only really feasible for stdlib packages that are released in lockstep with go versions.

In short: People will complain about the go authors wanting generics themselves and cheating the type system, but not allowing them to do the same.

As an alternative, it could be considered to provide this in sort:

type LocalSorter struct {
    Len  int
    Swap func(i, j int)
    Less func(i, j int) bool
}

func (s LocalSorter) Len() int { return s.Len }

func (s LocalSorter) Swap(i, j int) { l.Swap(i, j) }

func (s LocalSorter) Less(i, j int) bool { return l.Less(i, j) }

(the type name is… arbitrary. Couldn't come up with a good one in 15s). It's trivial, doesn't eliminate the same amount of duplication and it can be easily provided by an external library too, but it allows using sort for custom slices without needing package-scoped types at least, thus partially addressing the motivation for this proposal (and the part I, personally at least, care about).

If go goes the way of this proposal, I'd like to see the kind of magic it employs to speed up the interface{} workaround so that third parties can use it too (as I don't know how that magic looks, I don't know if there'd be a sensible API for that).

@Merovius I think I like this idea as at least a stop-gap. Trivial implementation is at https://play.golang.org/p/4_Z2G1Wd0d . It could be extended so that a sorter wrapper is generated once and reused with different Less implementations.

@Merovius I agree with @jonbodner, and I'd add that we already do this pretty consistently in the stdlib in encoding/decoding situations (json/gob/etc) - we do some expensive up-front work the first time a given type or certain set of arguments is used, and that generates data structures or functions that allow us to perform the same operation much more quickly for every subsequent call. We would presumably do something similar here.

Also, you mentioned "magic" - do you just mean using the reflect package, or do you think this would require explicit runtime support or something else truly "magical"?

@bradfitz Oh right, there will still be some reflection going on anyway since interface{}.

@joshlf see the original comment by @bradfitz:

For speed, this would not be purely reflect-based (except for one call to to get the slice length). The implementation would generate a custom swapper as needed for the type of the slice elements such that it's compatible with the GC & write barriers.

That last part is not easily accessible to outside people; it requires go-version specific code to stay "compatible with the GC & write barriers", I assume.

@Merovius I'll admit that I don't understand that statement well. I had assumed he simply meant that it would be a normal, well-behaved function that would play nicely with the compiler's emitted GC and write barriers? I take it I'm misunderstanding something?

I'd prefer to not get into an implementation discussion in this bug, but I sympathize with @Merovius's excellent comment. I do agree that sort should be a normal package that anybody can implement.

In my latest prototype last night, I added a method to reflect:

package reflect

// Swapper returns a function which swaps the slice elements in v.
// scratch must be a non-nil pointer to a scratch element.
// Use reflect.New(v.Type().Elem()).
// Swapper panics if v is not a slice, or if scratch is the wrong type.
func (v Value) Swapper(scratch Value) func(i, j int) {
   ....
}

Using that, no special code or knowledge is required in the sort package. (It has to return a swapper func instead of being Swap(i, j int) to amortize all the expensive reflect safety checks. And using the reflect API as-is is like 100x slower.)

It's currently just a bit slower, but my implementation of Swapper isn't as fast as it can be yet.

@bradfitz Any chance you could paste the code? I'm trying to come up with something myself, and I'd be curious to see what you had to do under the hood to get that to work.

Why do you need the scratch argument? Seems like Value.Swapper can allocate it itself and return a closure that refers to it.

@bradfitz would there be many uses for Value.Swapper aside from implementing this proposal? I can't think of any that aren't contrived. It feels a little weird to add something so specialized to reflect to achieve a specific goal elsewhere.

Is it possible to add more general mechanisms to the runtime and/or unsafe and/or reflect packages that would allow creating efficient, safe swappers that stay "compatible with the GC & write barriers"?

@ianlancetaylor, true, that's residual from an earlier version that didn't return a closure.

OK, so I made a prototype implementation that accepts a comparison function whose arguments are pointers to the slice elements, not indices. Here's the implementation:

The signature is:

// Sort sorts the slice using less as the comparison
// function. slice must be a slice type, and less must
// be a function which takes two arguments - both poiners
// to slice's element type - and returns a bool.
// Effectively, the type signature is:
//  Sort(slice []T, less func(*T, *T) bool)
func Sort(slice, less interface{})

The performance is actually decent - using this Sort function takes ~170% as long as sort.Sort in the stdlib (see sort_test.go linked above). Here are the exact measurements:

BenchmarkSort-4                   100000         13367 ns/op
BenchmarkNativeSort-4             200000          7612 ns/op
BenchmarkSortPerOp-4             5000000           374 ns/op
BenchmarkNativeSortPerOp-4      10000000           250 ns/op

Under the hood, the implementation does two key things: First, it generates and caches sort implementations by type (similar to, for example, encoding/json). Second, it uses unsafe to avoid expensive reflection logic. I'm not too proficient with the details of how to do this reliably, so there are probably issues, but I'm sure someone who's better at this could modify the implementation so that it is guaranteed to work properly (e.g., guarantee that the GC won't be confused by what we're doing).

Thoughts?

I really wanted to keep this thread focused on the merits and API signature and not implementation details.

But I'll say that func Sort(interface{}, interface{}) is a pretty terrible function signature (the grossness of interface{} grows super-linearly with the number of interface{} arguments) and ~170% as long seems too slow. I was already unhappy with ~115%.

Ah sorry, I didn't realize that.

Re: API - what's the thinking behind saying that two interface{} arguments are more than twice as bad as one (or, for that matter, worse at all)? We already have to do dynamic checking, so I don't see more dynamic checking as much worse than less - we've already thrown static compile-time guarantees out the window.

As for the performance, I was kind of assuming that people more skilled than I am and with better understanding of Go internals would be able to make it much faster :)

@joshlf

What do you mean by "consistent"? I'm not sure I understand the problem you're explaining.

func Slice(slice []T, less func(T, T) bool)

Go compiler doesn't recognize whether the first T is same as second or thrid T.

@mattn I mean with respect to the implementation, which does some really unsafe stuff under the hood (check out the source) such as treating the slice as a byte slice, treating the given function as a func(a, b *unsafe.Pointer) bool (which it isn't), etc.

Summary of problem

  1. Vast majority of sort.Interface uses a slice
  2. Have to define a new top level type
  3. Len and Swap methods are always the same
  4. Want to make common case simpler with least hit to performance

Summary of proposals
1: original proposal (@bradfitz)

  • pro:

    • simple well-thought out api

    • easy to use

    • not much slower than defining own type

  • con:

    • no static type checking on first argument

    • required to be in stdlib to take advantage of internals to work efficiently in addition to using reflect and unsafe

    • adds magic to sort package (obviated by (7), these are in order)

    • requires passing the slice and capturing it in the comparator by type or closure

2: have something generate the boilerplate, go generate or an editor macro (me)

  • pro:

    • simple

    • can do it now if you care and ignore it if you don't

    • zero runtime performance hit

    • no interface{}, no use of reflect or unsafe packages

  • con: awkward to use and you have to use/remember how to use an additional tool

3: Take two interface arguments []T and func(a, b T) bool (T could be a pointer) (@ianlancetaylor indirectly/me)

  • pro: easier to use than original for writing the comparator
  • con:

    • more interface{}s = even less compile time type checking

    • copying could be slow for large (non-pointer) T

    • overhead of calling the comparator through reflect is _very_ slow

4: Take two interface arguments []T and func(a, b S) bool (If T is a pointer S=T, otherwise S=*T)

  • Same as (3) but more complicated, though avoids copying large T.

5: add a new builtin (@nsf)

  • pro: could use magic for optimizations and type checking
  • con:

    • violates conservation of builtins

    • would have to implicitly import sort package

    • very specialized use case for a builtin: the bar is higher than that.

6: LocalSorter struct (@Merovius)

  • pro:

    • simple to use

    • simple to implement

    • could work with non-slices

    • could be used to implement (1)

    • no interface{}, no use of reflect or unsafe packages

  • con:

    • not as simple to use as (1), have to define swapper and call len yourself

    • not much shorter than just defining a type

    • the names of the struct fields can't be the obvious Len, Less, Swap since those also have to be methods

    • requires capturing the slice in the swapper and comparator by type or closure

7: Add a reflect method that creates an efficient swapper for reflect.Value containing a slice in order to implement (1) (@bradfitz)

  • pro:

    • provides the interesting bit of the implementation as something anybody could use

    • keeps the sort package free from magic

  • con:

    • how many uses are there for Swapper other than (1)?

    • adds a new kind of power to reflect

5 is too magic.

1(+7), 2, and 6 solve the problem and check all the constraint boxes, all usable separately. 6+7 could implement 1 trivially.

3 and 4 fail on performance, but could easily be a go gettable package. Both would be enhanced by (7).

(let me know if I got anything wrong and I'll correct it)
edits:

  • removed question mark on (1) after reading CL
  • added pro to 2 and 6 "no interface{}, no use of reflect or unsafe packages", as per @myitcv's suggestion
  • added con about having to capture slice to 1 and 6, as per @joshlf's suggestion
  • noted that struct fields couldn't be the obvious choices in 6

@joshlf, see https://golang.org/cl/27321 for the prototype with reflect.Value.Swapper. One of the three typedmemmove calls could be eliminated with not much work, so it should get a bit faster yet. I'll look into that tomorrow.

Re: @jimmyfrasche's question of whether there are uses for reflect.Swapper other than this proposal, I would use it (here) to slap similar interfaces on in-place radix sorts (instead of less you pass key func (i int) returning string or whatever key type the particular sort wants). You could also use Swapper to provide a random shuffle/sample or a sort.SliceStable. None of those are world-changing uses, maybe, but they're uses. ¯_(ツ)_/¯

CL https://golang.org/cl/27321 mentions this issue.

Do we have any sort of proposal for generic-but-homogenous slices?
That would solve the problem of having to copy a slice of (non-interface) type T to make a generic slice-of-type-interface, because the generic array would simply be referenced by a fat pointer, the same way that generic scalars are. (Or maybe we have an invisible proxy object to carry those.)

Making up a type notation for the purpose, where []... is the type of a homogenous generic slice, I would imagine then sort could be defined with a function signature something like:

lang=go func sort(sslice []..., less func (interface{},interface{}) bool) (rslice []...)

@praetergeek, as discussed above, language changes are out of scope for this bug and Go 1. This bug is about a library addition only.

@bradfitz Should it be reflect.NewSwapper(v Value) func(i, j int)? It's taking a reflect.Value, but it's creating a swapper. Even if it's not changed to a function, the name should change, at least. /bikeshedding (Should I make that comment on the CL/learn how to comment on gerrit?)

@twotwotwo I don't know if _another_ sort package counts as a separate use case :þ.

A generic permuter was the only other thing I could think of as well. If (New)Swapper goes in, it would be nice to add that to math/rand: save allocating an []int for shuffle.

There are a lot of algorithms that involve swaps, certainly, but I can't really think of any that _just_ use swaps, so there's going to be a lot of func doThing(slice interface{}, f func(various indicies) retType) if they do take advantage of (New)Swapper, so this API will proliferate.

@jimmyfrasche - thanks for summarising the various pros/cons above.

Might I suggest we subdivide point 2, because the pros/cons for go generate are quite different to an editor macro (for example)

As far as a go generate approach is concerned (in light of sortGen):

2.1: go generate

  • pro:

    • simple

    • can do it now if you care and ignore it if you don't

    • zero runtime performance hit

    • compile time optimisation possible via inlining, SSA

    • no magic, no use of interface{}, no need to go near reflection etc

  • con:

    • go generate programs must be installed separately (could distribute core generators with Go binaries)

    • calling go generate prior to go test etc is an additional part of development workflow

On the second con, I would argue it's a good thing to be encouraging people towards go generate (hence it's not a con to my mind), but I've separated that opinion from the point itself because I would guess most people don't (currently) use go generate.

To add to @nsf initial idea and after some discussion on r/golang ... if we could add an orderable refinement in go/types on primitive types, that would allow to define multiple relations of order on composite types.

The example that was given there that I found interesting is :

sort.Slice(people, func(p Person) string { return p.Name })
sort.Slice(people, func(p Person) (int,string) { return p.Age, p.Name })

It could probably be simplified further but that would allow to sort composite types by field.
Here, for instance, the second example would sort twice, by age first and then by name.

I haven't thought about what difficulties one may encounter in implementing this but I find the idea quite appealing.

You'll note that: people is of type []Person

Just another idea that could be easier to use and explain than an indice based solution, if implementable without risks.

@atdiar, this bug isn't about changing the whole feel of the sort package. I just want to make it easier to write Less functions, not change how sorting is done entirely.

@bradfitz I understand but I'm still proposing an alternative that would end up being similar to what you are proposing. I understand the difficulty about language changes, but if the current sort is really a problem, perhaps it would make sense to solve it completely.

Added "no interface{}, no use of reflect or unsafe packages" to 2 and 6.

Optimizations is an interesting point, but I don't want to add that unless it actually happens, because you can say theoretically the compiler could optimize this or that for a lot of things that will never be true in any language. If someone can confirm that it in fact does, I'll add it to 2 and/or 6.

Separating go generate from an editor macro is just editorializing for go generate, imo: everything that applies to one applies to the other. I mean yes it is great and good, but I am trying to keep that as neutral as it is possible for me in the interest of discussion.

@atdiar, do you want to open a new bug so we can track your alternative proposal separately?

Then we can have a meta bug titled "things about sort" with references to each.

@jimmyfrasche A suggested addition to your summary:

  • 1 - con: requires the comparator to be aware of (for example, by closing over) the slice to be sorted

@bradfitz it's not really a bug. It's just an alternate suggestion to the current proposal. The rest is up to the people.

I don't know if another sort package counts as a separate use case :þ.

@jimmyfrasche Eh, separate or not it's still a reason to expose Swapper. People have reasons to write stuff besides the stdlib qSort. (In my case, the reason's an up to 5x speedup.) It's nice if those packages can do what sort does, rather than sort having magic that might break with next Go release if I try copying it. There is non-sort stuff you can do that permutes arrays: maintain a heap priority queue more nicely than container/heap. Merge arrays. Get median or percentiles a bit faster than if you sort the whole thing.

I don't think you have to prove NewSwapper would get a ton of use, though; it seems like the right way to do SortSlice anyway. It keeps runtime magic in reflect, which already has a lot of it, rather than scattering it around the stdlib. It keeps to a minimum how much of the stdlib is available-to-Go-team-but-not-to-you magic. It isn't an especially nasty API, isn't unsafe, doesn't expose a version-specific runtime knob. I know there's a high bar for adding code to std, but if this'll be written anyhow, in this particular case it doesn't seem bad to expose.

@joshlf was on the fence about adding that since it's kind of just how it would have to work but decided to go for it

Also, added it to @Merovius's proposal as well since it applies equally—and noted that that proposal couldn't use Len/Less/Swap as the field names and have the methods to implement Interface.

FWIW, I whipped my "proposal" up as a tiny package for my own use and my own personal needs are 100% satisfied with this. This basically invalidates the last two motivations given by @bradfitz for my personal taste, as IMHO every project/organization/repo/whatever can just do this themselves if they need or someone can just provide one for general use (perhaps under github.com/pkg). All the usual arguments for living in the stdlib or the x/ repos (wide availability, stability, maintenance…) just don't apply for something this trivial, in my opinion.

This leaves the repetitiveness of Len and Swap. To that, I personally say ¯_(ツ)_/¯ and it doesn't justify the downsides of interface{} in this instance. And I also believe that the alternatives proposed here (that don't require a language change) just increase the downsides, as they require even more reflection and ignoring the type system and will thus make it even harder to explain to people why go hasn't generics but they also shouldn't roll their own.

With that I'll just trust the Powers That Be to do the Right Thing™ and will unsubscribe from this issue to not be tempted to add even more noise :)

@Merovius I like that method! I packaged it up neatly: https://github.com/davelondon/sorter

type person struct{ name string }

s := []person{{name: "foo"}, {name: "bar"}, {name: "baz"}, {name: "qux"}}

sort.Sort(sorter.New(
    len(s),
    func(i, j int) { s[i], s[j] = s[j], s[i] },
    func(i, j int) bool { return s[i].name < s[j].name },
))

fmt.Println(s)
// Output: [{bar} {baz} {foo} {qux}]

@Merovius, @gri and I both wrote that same thing. I thought it was still too tedious.

I even referenced it in my CL above (see the first and currently only comment at https://go-review.googlesource.com/#/c/27321/2/src/sort/sort.go@28)

... but if that and maybe reflect.Value.Swapper is all that comes out of this, I suppose that's better than nothing.

@twotwotwo those are all excellent points, and thank you for the additional examples.

My major concern is that all these mountains are being moved to avoid

type sorter []T
func (s sorter) Len() int { return len(s) }
func (s sorter) Swap(i, j int) { s[j], s[i] = s[i], s[j] }

which would allow everything you listed without having to use the reflect package at all.

Sure, it's inaesthetic and a bit of chore but it's not the worst thing. The proposal is nicer, but, while nicer is nicer, it seems like an awfully complicated solution for such a small problem.

If we're going to do this, then I'm all for exposing the magic so others can use it and the sort package itself remains unmagical.

My concerns with NewSwapper are/were:

  1. are we exposing the right magic? (Could something else even more general be exposed that would allow the definition of NewSwapper by anyone?)
  2. NewSwapper would be odd one out in reflect. It's the only thing that takes a reflect.Value and builds a machine for efficient performing a task.

Looking at the CL, the alternative would be exposing runtime memmove operations (presumably to unsafe) which is probably more evil than it's worth.

Implementation-wise, at least, NewSwapper has to be in reflect and certainly in the stdlib (barring exposing memmove or adding a new package just for this under reflect and factoring some private helpers in reflect to a shared internal package).

So, still a little uneasy about it on principle, but it seems to be the best solution so far, if we're going to do this.

@jimmyfrasche @twotwotwo To clarify something, I think that part of the pain of having to implement custom types is having to clutter up the package-global namespace with them. This probably isn't the place for this discussion, but in my personal experience, I've found myself really wanting to be able to define scope-local types _with methods_ to solve this. It makes the difference between getting to use a throwaway, local name (e.g., sortable) and having to clutter the package-global namespace with things like sortableWidgets or whatever. Also, being able to define types inside of functions _and_ have methods on those types would be awesome. It's very possible this has been discussed elsewhere and I'm just not aware of it (@bradfitz might know).

@joshlf the sort.Funcs type and/or @davelondon's implementation of the idea at https://github.com/davelondon/sorter avoid having to create a type. Not as simple to use as the proposed sort.Slice, but simpler and not limited to slices. If that's the only box you want checked, it checks that box.

Language changes should get their own issues. I remember it being discussed a long time ago, at least on the mailing list, but couldn't find anything with a cursory search. At any rate that would be Go2 territory.

@jimmyfrasche My point was simply that if what I mentioned is a significant part of the concern, just adding that to the language would fix it and make the other problems that people have complained about not as bad (since they'd be the only issues left). In my experience, the headache that Brad referred to in the first comment (which I have definitely felt) stems largely from the namespacing issue. Also, I disagree that this should be Go 2 territory. Supporting methods on local types feels more similar in size to something like being able to use v.method as a closure - it would only expand the set of valid Go programs (thus maintaining the Go 1 compatibility guarantee), wouldn't be a particularly big change to the compiler, and wouldn't fundamentally change how any normal tasks are done in Go.

Anyway, my point is simply that I do think this is relevant here because it's a possible alternative solution (although probably not the best solution proposed so far, I admit).

just adding that to the language

"just"

@bradfitz I know that statement sounds grand in the abstract, but in this case it's a relatively small addition, no? As I said, it feels similar (at least in terms of the experience of using the language) to adding the ability to use v.method as a closure. I can't speak for how much it would take to implement, of course.

I'm not discussing language changes in this bug. Let's chat on golang-nuts.

@bradfitz OK, sorry to clutter up the discussion. I'll move it there.

@jimmyfrasche

Optimizations is an interesting point, but I don't want to add that unless it actually happens

Consider https://github.com/myitcv/sort_inlining_example

main.go is largely a copy and paste of un-exported parts of sort/sort.go. Notice I've replaced use of sort.Interface with []string directly.

main_test.go is a copy of BenchmarkSortString1K from sort/sort_test.go

run.sh shows the setup.

README.md shows the output from run.sh (on my machine)

Notice the inlining that can happen when the use sort.Interface is dropped.

This leads to a ~33% speedup in this benchmark.

Separating go generate from an editor macro is just editorializing for go generate, imo: everything that applies to one applies to the other. I mean yes it is great and good, but I am trying to keep that as neutral as it is possible for me in the interest of discussion.

go generate exists and is universally available to everyone via the go tool. The usage of the command is uniform and well documented.

Whilst some editors may have support for macros of this sort, these macros are not universally available (i.e. in all editors) and almost by definition usage of these macros is not uniform (i.e. it's different in each editor)

Hence why these should be considered very separately in my opinion.

To start off, this builds on @Merovius' solution and point of view and tries to justify it, so if this clutters up the discussion for you, stop reading. Also, this is another rant against using reflect so if that clutters up the discussion, stop reading. This is also a rant against interface{} and generics in general, so if that clutters up the discussion, stop reading.

There is a precedent for a possible solution to this in my opinion, its sort.Search

read the godoc for details, but the general idea is that we use length and closure instead of doing generic stuff

a sort shorthand would probably look like

func Slice(n int, less func(i, j int) bool, swap func(i, j int))

LocalSorter just becomes an internal, private type.

an example usage

a := []int{5, 3, 4, 1, 2}

sort.Slice(5, func(i, j int) bool { return a[i] < a[j] }, func(i, j int) { a[i], a[j] = a[j], a[i] })

the fundamental reason I think this works is because this builds upon already existing semantics with sort.Search, it works for me because sort.Search works for me, while the proposed signature doesn't because it is basically C++/Java/C# STL sort

std::sort(std::begin(a), std::end(a), [](T a, T b) {
    ...
})

which only works there because of templates (half assed generics). In the method proposed so far, we circumvent generics using reflect, which, and this may only be because I have only been doing golang since january this year, seems like a serious antipattern to me, because for me interfaces in go have no business with type, only functionality.

By using reflect to make the round trip of getting a subtype, I think we are propogating the idea that this is how generics work in go, you send a super type and then get back the sub type using reflect.

Another advantage I can think of is that at least we are giving a choice to the user to generate a Swap using reflect or write one line of code.

However, I think generating methods using reflect opens up a wide array of problems of its own, because today its Sort, tomorrow it will be Search, and instead of generating Swap, we would be generating Less functions, then Lower Bound, Upper Bound, Min Element, Max Element and so on....

TL;DR: I don't think we can define functions in go with semantics of languages that have generics, because we don't. I think that sort.Search is one of the best implemented functions in the go standard library, and we should do things in the semantics introduced by that function.

I also think that if and only if we want to introduce a shorthand, we need another function definition, but if we want to only reduce boilerplate code, we should definitely look at go generate and @myitcv's sortgen, because irrespective of pros and cons, generating boilerplate code that you don't want to maintain is the fundamental reason go generate exists (in my opinion).

Also, more shorthands can be implemented, we should also implement a shorthand for sort.IsSorted and I propose that we add Greater, LessOrEqual and GreaterOrEqual to sort.IntSlice, sort.StringSlice and sort.FloatSlice. With them, lower bound in go simply becomes

sort.Search(n, sort.IntSlice(a).GreaterOrEqual)

and upper bound is

sort.Search(n, sort.IntSlice(a).Greater)

@suyash the only thing you said about your sort.Slice function that's wrong is naming the function Slice: since you're writing the swap function yourself it's not limited to slices. It's _more_ generic.

I like the proposal (although Merovius@ proposal is very good, too---I'm definitely going to use it myself), however I've one objection has kind of already been touched on: how do defend this proposal if future similar proposals are rejected?

For better or worse, a fair amount of gophers want min/max/abs/etc functions and I could see future proposals using this as their defense for adding a math.Min(interface{}, interface{}) function.

@myitcv well, yes, if the code generator writes a custom sorter instead of filling in the boilerplate and using the sort package the compiler can optimize it better. I didn't think anyone was going that far. I meant if the compiler was actually doing type specialization to desugar interfaces or something that it couldn't do with the closure or reflect versions. But, again, what you're saying applies equally to an editor macro (albeit few would want to maintain such a large macro even an editor with a nice macro language).

My point is that it doesn't really matter if the code is being generated by go generate or an editor macro or a robot from the future that appears in a shimmering flash, tells you not to worry, touches your monitor, and the code is just there on your drive and then the robot explodes into a fractal of rainbows and joy.

They're all equivalent to "something happens and then the code is just written for you and you don't have to deal with it". The specific mechanism isn't the relevant part.

@suyash I really like this, although we could make it more usable by having the function return a sort.Interface which could be used with any of the various functions in the sort package:

func MakeInterface(len int, less func(i, j int) bool, swap func(i, j int)) Interface

This would obviously be _super_ simple to implement:

type interfaceWrapper struct {
    len int
    less func(i, j int) bool
    swap func(i, j int)
}
func (iw interfaceWrapper) Len() int { return i.len }
func (iw interfaceWrapper) Less(i, j int) bool { return iw.less(i, j) }
func (iw interfaceWrapper) Swap(i, j int) { iw.swap(i, j) }

@joshlf FWIW: https://github.com/sermodigital/sorter :information_desk_person:

edit: helps if it's public now doesn't it?

@joshlf, yeah, that's what @gri and I were thinking recently. It seemed too tedious before the idea of a "reflect.Swapper". I think I'd put the swap before less, so the less literal could span multiple lines, and you could write:

     sort.Stable(sort.With(len(s), reflect.Swapper(s), func(i, j int) bool {
             if s[i].Foo != s[j].Foo {
                    return s[i].Foo < s[j].Foo
             }
             return s[i].Bar < s[j].Bar
     }))

(MakeInterface might be a bit long. I like With or Funcs, but len(s) is not a func)

@bradfitz I really like where this proposal is going, maybe allow nil for the swapper and call reflect.Swapper in sort.With if nil?

@rajder, it's tempting, but then sort would need to depend on reflect for only that one feature, and it is not supposed to. See comments at https://go-review.googlesource.com/#/c/27321/2/src/go/build/deps_test.go

Also, I'd prefer the type-unsafety (use of reflect) and panic at runtime if the argument isn't a slice to be explicitly opt-in if we go this route. Then the sort package remains pure.

@bradfitz Ah, I did not think about the added dependency, I completely agree.

Edit: Making reflect and the added risk of panic opt-in is also a very nice idea. I would personally always implement the swapper but it's nice to have options.

A minor thing I think will look funny to someone seeing it for the first time is:
sort.Sort(sort.With(.....

Would not the version of the proposal by @suyash look better, but with a different name? (like @jimmyfrasche pointed out)

sort.Sort2
sort.Stable2

or

sort.Sortfn
sort.Stablefn

or somehting completely different.

@jimmyfrasche

if the code generator writes a custom sorter instead of filling in the boilerplate and using the sort package the compiler can optimize it better

Indeed. But that's the whole point of go generate, you're not constrained. It doesn't make it any less of a "pro" in fact it reinforces it as a powerful approach.

They're all equivalent to "something happens and then the code is just written for you and you don't have to deal with it".

Agreed; because at the end of the day code is generated. In this respect they are equivalent.

The specific mechanism isn't the relevant part.

In the context of the problem we are trying to solve the mechanism is extremely relevant.

To my mind we are trying to solve:

_"how best to remove/alleviate the tedium associated with sorting slices for the Go community"_

I have emphasised what I consider to be the key part of this statement.

Unlike an approach that, for example, adds some magic to the reflect package where a simple code change is required to use it, using either go generate or an editor macro requires the user to do something extra (this is incidentally a "con" shared by both).

Comparing each:

  • go generate (already available as part of go tool)

    • need to go get/install the generator

    • need to add go generate directives to files

    • need to call go generate as part of my workflow

  • editor macro

    • need to find out whether my editor X supports this in the first place

    • need to install the macro/plugin that provides the macro....

    • need to understand how to use the macro etc..

    • ...

As I said before, whilst some editors may have support for macros of this sort, these macros are not universally available (i.e. in all editors) and almost by definition usage of these macros is not uniform (i.e. it's different in each editor). Which I think rules out editor macros as a viable means of solving the problem _for the Go community_, because documenting how to use the approach becomes impossible.

Does that make sense?

@rajder sort.With returning a sort.Interface is the most useful, general thing.

Since 90+% of the time it's just being passed directly to Sort a super simple helper like func SortWith(...) { Sort(With(...)) } (where ... is just me being lazy) is unobjectionable and you could do it yourself in the less common cases.

@jimmyfrasche Yes, I agree that sort.With returning a sort.Interface is more versatile. I was just thinking about the somewhat funny looking sort.Sort(sort.With..., really a minor thing imho, but still looks a bit awkward, but on the other hand quite easy to remember.

@myitcv I understand your point completely. I agree with you that go generate is a great way to solve the problem and that it should be the preferred mechanism of the community. I'm not changing the summary. The pros and cons I listed are for solving this particular problem with code generation in general and cover more than just go generate and editor macros, which are listed as examples of mechanism. Separating the points in the summary adds nothing to the discussion of _this_ topic. It just adds additional pros and cons specific to various methods of code generation, which can be discussed elsewhere.

@bradfitz I find your modified proposal very agreeable.

@bradfitz looks good, although I'd still like a sort.SortWith to go along with sort.With, as it falls more in line with sort.Search and the original proposal does define two methods, one that generates a sort.Interface and other that just sorts, but I also get @rajder's point, its a really minor thing and these days you can generate snippets for this kind of stuff in editors. Also, my opinionated implementation

@suyash, @rajder, I'm fine with a SortWith wrapper, but I don't feel strongly either way. My objection earlier was to naming anything ending in 2 or Ex or Fn (an abbreviation Go doesn't use).

Okay, check out https://golang.org/cl/27321 for an implementation of the modified proposal (https://github.com/golang/go/issues/16721#issuecomment-241085266).

It's actually faster than a normal sort, for reasons explained in the CL description.

The CL is not implementing the sort.Slice() function proposed in this issue. Are you now proposing that one should write sort.Sort(sort.With(len(s), reflect.Swapper(s), func(i, j int) bool { [...] }) instead, or you will be adding a sort.Slice() wrapper later?

@bradfitz I have not used Gerrit before but it looks to me like you forgot to commit quickSort_func and stable_func in sort.go?

Improved performance sounds awesome considering this issue had other primary goals :)

@rajder quickSort_func and stable_func are in the file src/sort/zfuncversion.go

@rasky, the comment immediately before yours does say "for an implementation of the modified proposal".

I think only the modified proposal would be palatable for everybody in this thread and the Go team, once I realized sort isn't supposed to depend on reflect, per the package layering policy described in go/build/deps_test.go.

I actually no longer like the name sort.With for the Interface constructor because it would be too easy to write & for reviewers to miss that this statement by itself:

    sort.With(len(s), reflect.Swapper(s), func(i, j int) bool { .... })

... is a no-op, even though it looks like it does something.

So I'm coming around in favor of different names and perhaps a wrapper:

  • sort.MakeInterface as proposed above to make the Interface
  • sort.With as an alias for sort.Sort(sort.MakeInterface(...))
  • no special alias for the relatively-rare sort.Stable. they can write `sort.Stable(sort.MakeInterface(...))``

This comment is off topic.

Would it not be neat to have the compiler do the optimization you did manually for quickSort_func and stable_func when the interface is not actually required to be an interface at runtime?

@rajder, let's stay on topic.

Dumb question (can't comment on CL): why does your CL
have a separate implementation instead of using funcs everywhere and
creating a new funcs iff the sort.Interface isn't already type funcs?
On Sat, Aug 20, 2016 at 1:35 PM Brad Fitzpatrick [email protected]
wrote:

@rajder https://github.com/rajder, let's stay on topic.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/golang/go/issues/16721#issuecomment-241222210, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AFnwZ5xLTp0ZKZbw5l_0T_XCLSzOcvQvks5qh2UFgaJpZM4JlCQ5
.

I would have posted these on the CL, but I gave up on trying to figure out how gerrit works. It's like an angry spaceship.

  • Shouldn't sort.Reverse have a special case if data is a *funcs, too?
  • The first panic in reflect.Swapper (line 14) refers to itself as reflect.Value.Swapper, from a previous iteration

Following up on my last comment, here's what I meant: https://go-review.googlesource.com/#/c/27456/

@jimmyfrasche, fixed the panic message. I'm not very concerned about Reverse at the moment. We can do that later easily.

@EricLagergren, commented on your CL.

Patchset 11 of https://golang.org/cl/27321 now has updated names per https://github.com/golang/go/issues/16721#issuecomment-241221703

@griesemer, @robpike, @ianlancetaylor, please review.

Could we avoid the panic in:

// If slice is not a slice, Slice panics.
func Slice(slice interface{}, less func(i, j int) bool)

by introducing a private slice.Interface interface that would be implemented by all slice types thanks to some dummy private function. The signature would become:

// s is statically typechecked to be a slice.
func Slice(s slice.Interface, less func(i, j int) bool)

@guichaz Please see the first line in the first post: "EDIT: the proposal has changed later in this thread. See #16721 (comment) and #16721 (comment)" :)

If you want typesafety, then it looks like:

    sort.With(len(s), func(i, j int) { s[i], s[j] = s[j], s[i] }, func(i, j int) bool {
        if s[i].Foo != s[j].Foo {
            return s[i].Foo < s[j].Foo
        }
        return s[i].Bar < s[j].Bar
    })

That's a strict improvement over what we have now, but still quite a mouthful, so that wouldn't stop all the complaints.

If a slice.Interface would support Len() and Swap(i, j int), then sort.Slice could be implemented without reflection or any other magic. "Just" an internal hack to slap slice.Interface over all slices.

@guichaz, language changes are out of scope and off topic for this thread, and a language change would be required to have something like your proposed slice.Interface. But your example of sort.With is indeed how you'd write a sort without any use of reflect, and without writing a new type. I agree that it's an improvement, since it keeps the sort logic in context with the code.

This might be a little off-topic, but one readability improvement might be to add one more type and method to the sort package to make it easier to compose a sort hierarchy:

package sort

type By func(i, j int) bool

// assumes !(a[i] < a[j]) && !(a[j] < a[i]) is equivalent to (a[i] == a[j])
func (a By) ThenBy(b By) By {
    return func(i, j int) bool {
        return a(i,j) || !a(j,i) && b(i,j)
    }
}

It could be used like this:

sort.With(len(s), reflect.Swapper(s), 
    sort.By(func(i, j int) bool { return s[i].Foo < s[j].Foo })
    .ThenBy(func(i, j int) bool { return s[i].Bar < s[j].Bar })
)

@infogulch I really like this idea. I think, though, that it would be a natural extension if/when the existing proposal is adopted (since it assumes that we use func(i, j int) bool). I'd suggest proposing it then. But I do really like it. It's really clean.

I'm not sure I see the "simplicity" benefit of "ThenBy"; why not simply
write the cascading comparisons into the one callback function?

BTW, in a scalar field, !(a[i] < a[j]) && !(a[j] < a[i]) is equivalent
to (a[i] == a[j]) or !(a[i] != a[j]).

On 26 August 2016 at 02:04, Joe [email protected] wrote:

This might be a little off-topic, but one readability improvement might be
to add one more type and method to the sort package to make it easier to
compose a sort hierarchy:

package sort
type By func(i, j int) bool
// assumes !(a[i] < a[j]) && !(a[j] < a[i]) is equivalent to (a[i] != a[j])func (a By) ThenBy(b By) By {
return func(i, j int) bool {
return a(i,j) || !a(j,i) && b(i,j)
}
}

It could be used like this:

sort.With(len(s), reflect.Swapper(s),
sort.By(func(i, j int) bool { return s[i].Foo < s[j].Foo })
.ThenBy(func(i, j int) bool { return s[i].Bar < s[j].Bar })
)

@praetergeek I'm not sure what your comment about boolean algebra in scalar fields is trying to convey. I understand that it's not exactly equivalent; that's why I made a comment about it. It should be true almost always, but if not for some case then it's easy to fall back to the other method.

Why? It's shorter (1 line per field vs 3n-2 lines), easy to understand (it says right there: sort by x then by y), _very_ easy to visually parse (they line up until the field name), harder to get wrong (only 2 references to each field vs 4; all the logic is encapsulated in ThenBy), and individual field comparers can be reused if needed.

But as @joshlf recommended I think I'll save it for another proposal later if this one is accepted.

I think @infogulch's idea boils down to introducing a DSL for sorting into the sort-package. I dislike the idea, because it can get out of hand quite quickly and I consider DSLs of this form a bad habit. They hide simple, readable code behind layers of abstractions that need to be unwound.

Also, it will, if used extensively, make code slower. While performance shouldn't be the ultimate goal, I think encouraging fast code by default is a good thing (and as I said, I believe it hurts readability too, though that's subjective).

@Merovius I agree with the DSL point; I hadn't thought of it that way.

Status update on this proposal:

@ianlancetaylor, @griesemer, and @robpike are all neutral. They wanted to wait for @rsc to get back from vacation and chime in.

Because the default rejection answer to standard library addition proposals is "put it elsewhere", I've preemptively put it elsewhere:

https://godoc.org/go4.org/sort

The go4.org/sort package is now a pure superset of the standard library sort package.

I figure people can get some experience with the proposal there in the meantime. The reflect-based Swapper func creator is at https://godoc.org/go4.org/reflectutil and is used by go4.org/sort.

I still think the standard library should make sorting slices easier and permit writing sort functions inline without declaring a type, though, similar to http.HandlerFunc.

Ping @rsc.

TL;DR: Can we do just sort.Slice and not all the general stuff below it?

I looked at the go4 code and the pending CL 27321. I agree that it would be good to make sort easier to use. I see a few different parts here:

  • sort.Slice, as in the go4 code.
  • sort.MakeInterface and reflect.Swapper.
  • sort.With.
  • sort.By from @infogulch above.

I would be happy to see sort.Slice in Go 1.8. It's a nice API and will be a significant improvement for the vast majority of sort.Sort uses.

The rest seem unfortunate. Specifically:

  • sort.MakeInterface can be written today, although it would execute a little less efficiently when used. It seems to exist only to implement sort.With, except that sort.With doesn't even use it.
  • Exposing reflect.Swapper may be a necessary evil. That's OK. Better to require other Go implementations to implement/port reflect.Swapper than to port unsafe code tucked into package sort.
  • sort.With can be written today (like sort.MakeInterface). As an API, it's not great because basically every call is going to use reflect.Swapper as the second argument, and it would be nice if reflect.Swapper were never mentioned again, especially not by Go users.
  • sort.By is a helper that others can provide easily enough. Unlike the others here, there's no obvious speedup behind the scenes if implemented inside package sort. Better to leave this one out for now.

Stepping back a bit, I guess the approach here is a few steps:

  1. Define a function-based sort API (sort.With)
  2. Define a converter from function API to interface API (sort.MakeInterface).
  3. Define a helper (reflect.Swapper) for sorting slices.

The alternative is one step: "add sort.Slice". Unless there is evidence that the intermediate steps have compelling uses to clean up existing code other than getting to slices, I'd rather not commit to the details of those steps in public API. sort.Slice this way ends up being a helper like sort.Ints, not a gateway to a whole parallel API.

I would also like to see sort.SliceStable added at the same time. It needs to be just as easy to do a stable sort as an unstable sort. I don't care much one way or the other about sort.SliceIsSorted but I would lean toward including it by analogy with the Ints functions.

I considered suggesting that sort.Slice return a sort.Interface, so that you'd write:

sort.Sort(sort.Slice(x, func(i, j int) bool { return x[i] < x[j] }))
sort.Stable(sort.Slice(x, func(i, j int) bool { return x[i] < x[j] }))
sort.IsSorted(sort.Slice(x, func(i, j int) bool { return x[i] < x[j] }))

but I think that's overkill. sort.Slice is a helper, so it should be helpful. The fact that the comparison takes indexes instead of values means that in basically every call there's going to be a closure, so we should avoid piling on too much other stutter. These are clearer:

sort.Slice(x, func(i, j int) bool { return x[i] < x[j] })
sort.SliceStable(x, func(i, j int) bool { return x[i] < x[j] })
sort.SliceIsSorted(x, func(i, j int) bool { return x[i] < x[j] }))

Just a quick note. Isn't sort.IsSliceSorted better? Or is it more valuable to have the same prefix for all these related methods?

@mibk From a clean slate, maybe, but we already have IntsAreSorted, Float64sAreSorted, StringsAreSorted, not AreIntsSorted etc.

Exposing reflect.Swapper may be a necessary evil. That's OK. Better to require other Go implementations to implement/port reflect.Swapper than to port unsafe code tucked into package sort.

So you're cool with 4 API additions: reflect.Swapper, sort.Slice{,Stable,IsSorted}?

So you're cool with 4 API additions: reflect.Swapper, sort.Slice{,Stable,IsSorted}?

SGTM

CL https://golang.org/cl/30088 mentions this issue.

@rsc considering that external packages may accept sort.Interface values, such as container/heap and presumably a subset of https://godoc.org/sort?importers (including a few of my own packages), I'd like to counter-argue that having no mechanism for creating a sort.Interface will be surprising and disappointing.

Even in the absence of such concerns, I'd like to argue that the proliferation of parallel functions (SliceStable) to be unfortunate -- I appreciate sort.StringSlice and friends, but see no justifiable benefit to sort.Strings and friends. Such functions bloat the package, saving users only a handful of keystrokes.

I feel that this may just boil down to a naming problem -- with a clearer name than MakeInterface, the usage of an interface builder may well become obvious.

sorry for bringing this up if this was resolved somewhere else, but any particular reason the sort.Search like syntax was not preferred?

I think that the fundamental argument of making reflect.Swapper function public is to allow users the option of using it or not, otherwise it can just be implemented inside the sort package itself.

another point is that getting the length of slice directly would _probably_ be faster than getting it through reflection, though not sure by how much.

also, all the previous discussion about not encouraging generic programming through interface{}

@extemporalgenome It's trivial to write a converter from function triple to interface if someone needs that. That's true already today.

type funcs struct { n int; swap func(i, j int); less func(i, j int) bool }
func (x funcs) Len() int { return x.n }
func (x funcs) Swap(i, j int) { x.swap(i, j) }
func (x funcs) Less(i, j int) bool { return x.less(i, j) }
func FuncInterface(n int, swap func(i, j int), less func(i, j int) bool) Interface {
    return &funcs{n, swap, less}
}

If there is evidence that this comes up often in real programs, we can easily add it later. But we'd need to see that evidence first.

@suyash I assume that by sort.Search-like syntax you mean code like

sort.With(len(s), func(i, j int) { s[i], s[j] = s[j], s[i] }, func(i, j int) bool {
    if s[i].Foo != s[j].Foo {
        return s[i].Foo < s[j].Foo
    }
    return s[i].Bar < s[j].Bar
})

as in the comment above (Aug 21). The reason not to prefer this is that this entire proposal is about making sort easier to use. The above is not substantially less code than you have to write today. It does avoid the new type, but it is still full of boilerplate.

Compare to:

sort.Slice(s, func(i, j int) bool {
    if s[i].Foo != s[j].Foo {
        return s[i].Foo < s[j].Foo
    }
    return s[i].Bar < s[j].Bar
})

The new API is meant to be a helper, so it should be as helpful as possible - with as little boilerplate as possible - without restricting the scope of its help. My guess is that >99% of sorting happens on slices, in which case the generality added by the boilerplate serves no useful purpose. If someone has evidence otherwise, I'd be happy to see it.

CL https://golang.org/cl/30250 mentions this issue.

@rsc lets take an example that comes up fairly often (for me), lets say you want to sort a set of numbers but you also want to track the original positions of the numbers.

So you had a slice like []int{5, 2, 3, 1, 4} and you sorted it to get []int{1, 2, 3, 4, 5} but at the same time, you also wanted to track positions, so in a separate slice you would want []int{3, 1, 2, 4, 0}.

With this addition, you would separately declare a pair type and a new pair array and copy original array's contents into the new array and initialize positions and then sort them, while defining the appropriate less closure.Something like

a := []int{...}

type pair struct {
    first, second int
}

p := make([]pair, len(a))
for i := 0 ; i < len(a) ; i++ {
    p[i] = pair{a[i], i}
}

sort.Slice(p, func (i, j int) bool {
    return p[i].first < p[j].first || (p[i].first == p[j].first && p[i].second < p[j].second)
})

while if we had the option to define a swapper, we can just define another slice, initialize that to positions (0, 1, 2...) and in the swap, swap in both slices.

something like

a := []int{...}

pos := make([]int, len(a))
for i := 0 ; i < len(a) ; i++ {
    pos[i] = i
}

sort.Slice(len(a), func(i, j int) {
    a[i], a[j] = a[j], a[i]
    pos[i], pos[j] = pos[j], pos[i]
}, func(i, j int) bool {
    return a[i] < a[j] || (a[i] == a[j] && pos[i] < pos[j])
})
  1. The first reason I am advocating the use of sort.Search like semantics is because sort.Search itself is defined only for slices, but the best part about it is that it can do 6 different operations simply by passing the appropriate closure (https://go-review.googlesource.com/#/c/29131/6/src/sort/example_search_test.go@58). One issue that comes up with doing things differently with sort.Slice is the difference from sort.Search in terms of semantics.
  2. More convenience (for me) means more flexibility, then less code.
  3. Again, the reason reflect.Swapper came up in the first place was to support the general case, so you could do

    sort.Slice(len(a), reflect.Swapper(a), func(i, j int) { ... })
    

    otherwise I don't think there was any need to have it as a public library function.

  4. another thing that came up in this discussion was the facilities provided by modern text editors
    w.r.t boilerplate code. editors these days have macros and snippets, so you can configure them to
    expand things, like you can define a swapper snippet that on a tabstop expands to func(i, j int) { a[i], a[j] = a[j], a[i] }
  5. @bradfitz raised a point about purity of the function irrespective of type errors (https://github.com/golang/go/issues/16721#issuecomment-241088463)
  6. Again, and I don't know how much impact it will have, but getting the slice length directly will probably be faster than getting it through reflection
  7. You might think that something like sort.MakeInterface solves this, but @bradfitz himself did genzfunc.go for performance, the results of which are in the commit message at https://go-review.googlesource.com/#/c/27321/11. Also in case you didn't see this: https://github.com/nieksand/sortgenerics https://news.ycombinator.com/item?id=12602456

@suyash For your example you could use rsc's FuncInterface function above. It's marginally more typing for you, and no additional API in the standard library. To get this into the standard library there needs to be clear and common use cases.

@suyash What Ian said, but also "text editors can macro-generate your code for you" is a particularly bad rationale for an API decision. Code has to be read and maintained, not just written. Also, if you are OK with text editor macros, one could generate the old, more flexible sort.Interface type+methods for you if you insist.

The time it takes to compute the slice length is not going to show up.

Was this page helpful?
0 / 5 - 0 ratings