Just did a little string-based challenge to train my Rust and stumbled, when I wanted to chain several calls, but could not do so, because std::vec::Vec's sort_by() returns () instead of Self. Imho, this is a very small change which would make life easier and code cleaner. Example:
Instead of this
let blob = "cwrxwzbgickpjbp";
let mut result = blob
.chars()
.fold(vec![(None, 0); 26], |mut counter, c| {
// ...
counter
});
// cannot be chained, because it returns ()
result.sort_by(|a,b| b.1.cmp(&a.1));
let result = result
.iter()
. // and so on
I would love to write this:
let blob = "cwrxwzbgickpjbp";
let result = blob
.chars()
.fold(vec![(None, 0); 26], |mut counter, c| {
// ...
counter
})
.sort_by(|a,b| b.1.cmp(&a.1))
.iter()
. // and so on
All sort methods have the same problem, so I would propose to change them so they return &mut Self and can be chained.
That would prevent in-place sorting as it requires allocation of a new Vec<T>
sort methods should return Self
Did you mean &mut Self?
@sinkuu ah, yes, thank you!
This is a breaking change.
I think sort is different enough from iterators, that is should not be chainable. We don't allow chaining on push, extend, ... either.
Your initial example makes it very clear that there's an intermediate allocation, and not a zero cost iteration.
In Python the functions that work in-place, like sort() return None, this helps remember their semantics. In Python you also have sorted() that clones the input sequence into a new list, sorts it, and returns it. A handy sorted() is also present in the Rust itertools crate (but it currently lacks the key, and unstable versions):
https://docs.rs/itertools/0.7.2/itertools/fn.sorted.html
D language standard library solves this ergonomy/correctness problem of sort in a different way:
https://dlang.org/phobos/std_algorithm_sorting.html#.sort
In D sort() doesn't return void, nor it returns the sorted array. It returns a SortedRange, and you can get the sorted array calling a release() method on it. This makes sort() usable in range chains, where it doesn't break the chain (unlike Rust sort functions), but at the same time the ".release()" after the sort makes it clear that it not a regular function and works in-place. A SortedRange is also useful for many other purposes (it supports binary searches and other similar searches, a filter() on a SortedRange could return another SortedRange, etc).
This is not actionable due to the breaking change so I'll close this.
What about a method sorted which returns the sorted vector?
What about a method
sortedwhich returns the sorted vector?
+1, this is not breaking and would be useful
Is this sorted method still under consideration? does anyone has a good generic implementation?
An extension trait can easily be written:
trait VecExt {
fn sorted(self) -> Self;
}
impl<T> VecExt for Vec<T>
where
T: std::cmp::Ord,
{
fn sorted(mut self) -> Self {
self.sort();
self
}
}
fn main() {
let blob = "cwrxwzbgickpjbp";
let result = blob
.chars()
.collect::<Vec<_>>()
.sorted()
.into_iter()
.count();
}
Rinse and repeat for each variation of sorting.
Or make it extremely generic:
trait Tap {
fn tap(self, f: impl FnMut(&mut Self)) -> Self;
}
impl<T> Tap for T {
fn tap(mut self, mut f: impl FnMut(&mut Self)) -> Self {
f(&mut self);
self
}
}
fn main() {
let blob = "cwrxwzbgickpjbp";
let result = blob
.chars()
.collect::<Vec<_>>()
.tap(|v| v.sort())
.into_iter()
.count();
}
For what it's worth, Swift has a dedicated verbiage for this throughout the language: methods ending in -ed() _return_ a new object with that property (e.g., sorted()) while ones without that modify self (e.g., sort()).
I think having this in Rust would add a nice consistency if followed in other parts of the language! (I'm new to Rust so I'm not sure if other parts of the language already have this sort of convention.)
Edit: #2731 appears to address this
Most helpful comment
What about a method
sortedwhich returns the sorted vector?