This is a tracking issue for the RFC "Add is_sorted to the standard library" (rust-lang/rfcs#2351).
Steps:
Unresolved questions:
Iterator::is_sorted_by_key be added as well?](https://github.com/rust-lang/rfcs/blob/master/text/2351-is-sorted.md#should-iteratoris_sorted_by_key-be-added-as-well)std::cmp::is_sorted instead?](https://github.com/rust-lang/rfcs/blob/master/text/2351-is-sorted.md#add-stdcmpis_sorted-instead)Ord instead of only PartialOrd?](https://github.com/rust-lang/rfcs/blob/master/text/2351-is-sorted.md#require-ord-instead-of-only-partialord)@gnzlbg Would you be interested in implementing this? You basically already wrote the code. I think it would be fancy to have this SIMD code in the standard library. (Or should we start with only the fully generic algorithm?). PS: Sorry for not having reviewed https://github.com/rust-lang/rust/pull/53386 yet -- I hope I'll find the time and mental capacity to do that in the near future!
@Centril The links for the unresolved questions are dead (still pointing to the 0000 file). Could you or anyone update them? :)
Regarding the "Should Iterator::is_sorted_by_key be added as well?" question: this last minute comment in the RFC thread argues that this method should be added:
I feel like it is not a question about how easy it is to do it yourself using map but more about completeness of the api.
And people seem to agree. So I'd also say that this method should be added.
@LukasKalbertodt done :)
cc @SimonSapin re. the unresolved questions (particularly around is_sorted_by_key).
I don't know when I can work on this, but I can mentor anyone wanting to implement this or wanting to port the implementation of the is_sorted crate.
Oh, I didn’t realized that Iterator::is_sorted_by_key was not part of the RFC when I proposed FCP. I thought there was consensus to include it for consistence, both internal and with the existing sort_by_key/min_by_key/max_by_key methods which are also redundant with their respective method like sort_by.
We do have precedent of adding some APIs that can be expressed as a terse combination of existing APIs: https://github.com/rust-lang/rust/pull/49098
@clarcharr, you expressed the opposite in https://github.com/rust-lang/rfcs/pull/2351#issuecomment-372083545. Do you still think it should not be included?
At the time I made that comment, I didn't realise that max_by_key and min_by_key were already stable (as of 1.6, which is a while ago). Basically, there are two options:
max_by_key and min_by_key and don't add is_sorted_by_key.is_sorted_by_key.I feel like the usefulness of these should be re-evaluated, as it forces the key to outlive any reference to self rather than just the current reference to self. Basically, we'd need something similar to GAT for it to be useful, and map already exists.
Personally, I'd prefer option 1. But option 2 is also acceptable and there may be a way to make these functions work with GAT in the future.
@SimonSapin would you like to be the reviewer of a PR implementing this ? If so, I'd like to discuss with somebody first how the is_sorted crate is structure, and how to best implement this functionality in the std library.
So I don't know how to properly implement this in core.
The is_sorted functionality should use intrinsics when available at run-time. The iter and slice modules in core do not have access to run-time detection (that requires std), but e.g. the Iterator trait exported by core and std has to be the same, so we can't override the core implementation in std to add run-time detection on top.
After talking with @SimonSapin on IRC, I am just going to add the naive versions to core which get picked up by std and call it a day. Those who need more performance can just use is_sorted from crates.io. Adding vectorized versions of the is_sorted functionality isn't worth it either, because one would need to re-compile core to profit from it (using crates.io is just easier). Maybe if we get newer targets in the future with more features enabled this might become worth it.
I have interest in working on this RFC. This would be a significant learning experience for me, but I don't see any reason why I should not be able to complete it.
@gnzlbg If your offer for being available as a mentor is still open, I'd like to take you up on that!
If there is any concern in timeline (I should be able to begin implementing this week) or preference for someone who may have more experience, I have no problem with that; just let me know!
An efficient implementation of this is probably blocked on https://github.com/rust-lang/rfcs/pull/2492 (e.g. see https://github.com/rust-lang/rfcs/pull/2492#issuecomment-422783212) , since until then we cannot really use SIMD in libcore effectively.
My recommendation is that if you want to implement this now, you should just adapt https://github.com/bluss/rust-itertools/pull/261 which is a straight-forward plain Rust implementation. Maybe with some tweaks one can get it to auto-vectorize in some cases. This would allow us to get experience with the API until the other problems are solved.
Thanks for the links; it makes sense that a more efficient implementation is blocked on that right now.
To clarify on my initial comment (and what you picked up on in your last sentence), is the "learning" part of this would be more related to familiarizing myself with the API and codebase.
I think an initial implementation based off https://github.com/bluss/rust-itertools/pull/261 could be a good first step and we'll keep an eye on the progress of https://github.com/rust-lang/rfcs/pull/2492?
@gnzlbg You'll find my initial implementation in the PR above (https://github.com/rust-lang/rust/pull/55045). I'm not sure who should be officially reviewing this, but I can update the PR if need be!
@SimonSapin It was asked a few comments up if you'd like to be the reviewer of this PR? I still need to assign someone to review it, but I'm just not sure who.
This has broken the is_sorted crate on nightly. Is the only workaround to turn the feature flag on or use an older nightly?
--> ~/.cargo/registry/src/github.com-1ecc6299db9ec823/is_sorted-0.1.0/src/lib.rs:119:31
|
119 | self.map(|v| key(&v)).is_sorted()
| ^^^^^^^^^
|
= help: add #![feature(is_sorted)] to the crate attributes to enable
Call IsSorted::is_sorted directly.
@clarcharr The above error happens when adding is_sorted as a dependency (without using it). The is_sorted crate itself no longer builds.
You can send a PR to the is_sorted crate and use the patch section in your Cargo.toml in the meantime until it is merged and published.
@mtak- thank you for the PR, i've merged it already, will fix the doc tests and do a release so that the world keeps going.
The collision between the is_sorted crate and this PR was expected, I guess I could have fixed that a priori. Sorry about that. Keep in mind that if you are using the is_sorted crate, if you don't disambiguate your code the implementations of libcore will be picked after a re-compile.
@gnzlbg No worries. Thanks for responding so quickly!
As I accidentally commented on the implementation PR thread instead of here:
Iterator::is_sorted_by_key in terms of map and is_sorted rather than delegating it to is_sorted_by? It seems to me that f is currently called more often than necessary.Iterator::is_sorted_by_key's F the bound FnMut(Self::Item) -> K instead of FnMut(&Self::Item) -> K? Unlike max_by_key/min_by_key, is_sorted_by_key doesn't need to retain any of the iterator's values, so we probably don't need to pass them by reference.Something else to ponder: the closure parameter of is_sorted_by has the return type Option<Ordering>, which is a little awkward – it seems only useful for when you want to call PartialOrd::partial_cmp from it, but it's doubtful how common that will be given that we also have is_sorted and is_sorted_by_key.
Let's look at how the return value of the closure affects the outcome of is_sorted_by:
Some(Ordering::Less) or Some(Ordering::Equal), the iterator is still sorted up to that point and iteration continues.Some(Ordering::Greater) or None, the iterator is not sorted and the method returns false.So either the two given items are in sorted order, or they're not. Shouldn't it just return a bool? It seems like it would make some common uses much simpler: checking that an iterator is in, say, strictly descending order would just be a matter of calling .is_sorted_by(PartialOrd::gt). In general, any previous instances of a.cmp(b) return values in that closure could be replaced with a <= b.
The only downside I see is that it might make this method _too_ useful – it could be used to check conditions that don't have anything to do with sorting like checking that all items are equal, or that no two consecutive items are equal, or that the difference between any two consecutive items doesn't exceed some limit, etc.
The only downside I see is that it might make this method _too_ useful – it could be used to check conditions that don't have anything to do with sorting like [...]
You can already kludge things like that with Ordering if you really want. I think that's mostly OK, but perhaps there would be room for a more generic name like compare_by, or maybe all/any analogues on &Item pairs. See also is_partitioned() that I've proposed in #62278.
You can already kludge things like that with Ordering if you really want.
Good point!
perhaps there would be room for a more generic name
I think the more generic way to do this would be to create an iterator over pairs of consecutive elements and then call all on that. We would still probably want an is_sorted_by method, though, and I think it would still benefit from having its closure parameter return a bool.
Require Ord instead of only PartialOrd?
the closure parameter of is_sorted_by has the return type Option
, which is a little awkward
Personally, I was expecting an Ord bound on these methods rather than PartialOrd and for is_sorted_by to return Ordering rather than Option<Ordering>. That doesn't help folks wanting to sort floats much, but leaves them in at least a consistently incompatible place with respect to the existing sorting methods.
I recently needed to check a slice for sortedness and found this function pretty useless because saying "not sorted" is usually not enough, you need to return a position at which the collection is not sorted to do anything useful.
In other words, if this function needs to exist in the standard library, it should probably exist as a convenience on top of some other function, like contains is a convenience over get.
@petrochenkov I guess to align with .binary_search() in that case those underlying methods could return a Result<(), usize>, with is_sorted effectively becoming a call to .sorting().is_ok()?
I’m wondering whether there’s a return for the sorting methods returning an index that’s intuitive enough to make the ones returning a bool unnecessary.
With #72568 I think an Ord bound is quite compelling.
With #72568 I think an
Ordbound is quite compelling.
I agree, we should just stabilize it with Ord (as soon as all other open questions are resolved). Then we can also change the Option<Ordering> to just Ordering
I recently needed to check a slice for sortedness and found this function pretty useless because saying "not sorted" is usually not enough, you need to return a position at which the collection is not sorted to do anything useful.
Can you share more details about your use case? The RFC lists unit tests and "asserts" as the main reason why is_sorted would be useful. If you have another use case, that would be interesting to know. However, I would tend to keep these methods simple instead of making them general enough for every possible use case.
Most helpful comment
As I accidentally commented on the implementation PR thread instead of here:
Iterator::is_sorted_by_keyin terms ofmapandis_sortedrather than delegating it tois_sorted_by? It seems to me thatfis currently called more often than necessary.Iterator::is_sorted_by_key'sFthe boundFnMut(Self::Item) -> Kinstead ofFnMut(&Self::Item) -> K? Unlikemax_by_key/min_by_key,is_sorted_by_keydoesn't need to retain any of the iterator's values, so we probably don't need to pass them by reference.