Update (@SimonSapin): this is now the tracking issue for an iterator adaptor based on Iterator::nth
:
pub trait Iterator {
fn step_by(self, step: usize) -> StepBy<Self>
}
The Step
trait used for making ranges iterators is now tracked at https://github.com/rust-lang/rust/issues/42168.
Original issue:
The step_by
method makes it possible to step through ranges with arbitrary-sized steps. There are several issues that need to be addressed before we can stabilize it:
Step
trait, which is currently a bit of a messZero
/One
, tracked hereNominating for P-High.
triage: P-high
We should probably start moving on this in the next cycle or drop the P-high.
Would definitely like some help getting this over the finish line. I'm happy to mentor!
The library subteam decided to not move this to FCP this cycle. The API is pretty likely to need revision in one form or another and that should be taken care of before a FCP happens. I believe discussion is still going on in the internals post though!
Also moving to P-medium
It seems odd to me that the by
type is Self
. It seems similar to the Add
trait where I can choose the RHS and the sum type separately. It seems like the type of the steps should similarly be potentially different from the elements themselves (steps are effectively differences).
Where do we stand on this at the moment? This is a feature I'd very much like to use in stable, and am willing to provide help to get this finalized (with some mentoring).
Based on the documentation, I would expect (0u8..).step_by(1).count()
to be 256, but currently it diverges. Is this a bug in the documentation or in the implementation?
Why do you expect that? Overflowing the range of an integer type is an error (it panics in debug mode), it’s not the expected end of a RangeFrom
iterator (which doesn’t have an end).
@SimonSapin The example for RangeFrom::step_by
is
for i in (0u8..).step_by(2) {
println!("{}", i);
}
I think that this example is misleading: its explanation says "This prints all even u8 values.", but instead it overflows on panic in debug and loops forever in release.
Indeed, that example is wrong. How does this look? https://github.com/rust-lang/rust/pull/31172
Ditto @dschatzberg's comment. Imagine something like stepping through a range of two Instant
s by a Duration
.
A sequence of floats like 0.1, 0.2, ..., 0.5 is so common in numerical codes that Matlab has a built-in operator for it:
But Step
is only implemented for integer types, perhaps in the fear of a numerical error around the endpoint.
I understand your rigorous approach but it would be handy to have step_by()
for floats (perhaps as an optional trait?)
Also, I think we should consider inclusion of Itertools::step(usize)
into the stdlib as well (thanks u/levansfg on Reddit for letting me know)
which, to my eyes, is more generic (applicable to all iterators)
IMHO, step_by should be a general abstraction for all iterators as mentioned by @nodakai and the current implementation that uses addition must be at most a specialization for efficiency sake for integral types (now that Rust has specialization implemented).
yigal100's approach seems sensible, in which case step_by
should take a usize
regardless of the iterator type and just call next
a configurable number of times (>= 0).
Another option would be to drop step_by
and instead allow a Range
to be multiplied by something; e.g. (1..3) * -2
would return -2, -4. In this case it would be nice to be able to write (0..10) * 0.1
, however, where would the (int->float) conversions come in? Also, it might be nice to be able to use other operators: (0..101) / 100.0 + 1.0
for 101 values from 1 to 2.
Neither of these options requires a One
trait really; it depends how Rust converts from the a..b
syntax to an iterator (if this is restricted to integer types, as seems to be the case, then 1 as TYPE
is sufficient).
The questions which I think need answering are:
In my mind these are:
a. producing a sequential list of array indices (i.e. non-negative integers), usually stepping by +1 but possibly also -1, 2, etc., and
b. producing an evenly spaced set of sample points, usually floating-point and with any spacing. This doesn't _need_ to be accommodated by the standard library.
c. (possibly) skipping some items in a general iterator (Itertools::step(usize)
that nodaki mentioned)
As far as I am aware, the a..b
syntax must be kept for integers and must be usable as an iterator.
Is there a fundamental reason that -20..20
or 0.5..2.5
or "a".."z"
cannot be implemented? Is there already a commitment to enable (0..8).step_by(2)
in this form?
Which is clearer, (0..100).step_by(10)
or (0..10) * 10
? ` Or some other way?
Is there any point allowing negative steps? Maybe not for indices since .reverse()
is clearer but possibly for a range of sample points (e.g. (0..11) / -100
to yield 0, -100, -200, ..., -1000
).
One thing I like about the RANGE * SCALAR
syntax is that it might enable type-conversions to floats or user-defined types without requiring support for those types in the a..b
syntax.
Note that map
allows some of the above already:
fn main() {
let v: Vec<_> = (0..5).map(|x| x*2).collect();
println!("list: {:?}", v);
for v in (0..10).map(|x| x as f32 / 10.0) {
println!("v: {}", v);
}
}
Other Iterator
methods like rev()
can also be used. The inclusion of step
from itertools should be enough to complete the picture here (i.e. drop the existing step_by
method from Range
).
Is it possible to optimise these down to the same thing?
let v1: Vec<_> = (0..10).step_by(2).collect();
let v2: Vec<_> = (0..5).map(|x| x*2).collect();
The use of the One
trait by Range
is another issue entirely: it's what allows a Range
to be converted to an Iterator
. I suppose the idea was that step_by
would provide a different way of providing this "step".
It's ironic that Range
depends on a trait called One
to do this since 1f32
is just as much "one" as 1u32
is. It's also surprising that for x in (0f32..10f32).step_by(1f32) { ... }
doesn't work given the API docs.
Is there any good reason not to drop step_by
entirely?
Is it possible to make step_by
work for floats (by using x < end
comparison) without impairing the integer implementation?
Can the Iterator
impl for Range
use a new trait, e.g. core::ops::StepByOne
, so the Zero
and One
traits are no longer required to meet this odd definition of "one"?
I'm kind of glad to see someone has brought up the One
trait bit. I always did find it a bit unusual as the stated purpose of One
(from its own documentation) is to be a multiplicative identity.
Recently I played around with Haskell, which on first impression appeared to have a really nice and clean solution by using a generic "successor" (add 1) method to provide ranges... however, upon further thought, such a pure and simple design does not enable the use of larger step sizes efficiently. Sure enough, it turns out that what Haskell actually requires from a range-compatible type is not the successor method, but rather _conversion methods to/from an integer_. In light of that, the One
requirement doesn't seem too bad!
Is there a fundamental reason that -20..20 or 0.5..2.5 or "a".."z" cannot be implemented? Is there already a commitment to enable (0..8).step_by(2) in this form?
On this, I'd like to point out that non-integral candidates for a range type (such as characters or floating points) go hand-in-hand with inclusive-range syntax; The main benefits of semi-inclusive ranges come from looking at integers as "insertion indices" pointing between elements. Once your range elements are actual values, this notion begins to fail and you end up with let lowercase = vec!["a".."{"];
This is just speculation, but I rather suspect that support for expressions like ["a".."z"]
were the primary motivating reason that some languages like Perl and Haskell use inclusive ranges despite having zero-based indexing -- design decisions which would appear to me to be fiercely at odds with each other!
Speaking for myself, the current form of step_by
already suffices for any use case I've ever had, and I'm here today because I was surprised and disappointed to find that this feature still hasn't landed in stable.
Edit: Urk, okay. Iterator::step
with a specialization for Range
does sound alluring. Though I also fear it may have potential performance pitfalls unless we can "forward" these optimizations via e.g. a RandomAccessIterator
trait
Next step is to clean up the Step
trait into something reasonable for stabilization.
I continue to be happy to mentor!
Discussed at libs triage the decision was to _not_ move into final comment period, but as @brson mentioned the next step here is cleaning up the Step
trait which @aturon is willing to help with!
I'm willing to help here. What tasks in particular need to be done with regard to cleaning up the Step
trait?
@shssoichiro Thanks for stepping up (ahem)! So the basic problem is that, especially with the recent deprecation of Zero
and One
, the Step trait has gotten pretty out of hand. What we need is a trait that can support stepping forward and backward with arbitrary strides that is as simple/elegant as possible.
To get started, you can have a look at the code in https://github.com/rust-lang/rust/blob/master/src/libcore/iter/range.rs, most of which is devoted to the various step-by iterators. Challenge: figure out how to achieve the same goals, but with a simpler Step
trait.
Separately, there are also some questions about what we should do with negative strides; see this thread for more. We'll need to come to consensus around a reasonable design there to go forward.
Hopefully that's enough to get started, but let me know if you've got questions! You can reach me on this thread or at [email protected]
@shssoichiro Had a chance to look at the Step
trait yet?
@brson I have, although not as much as I'd like. I think the ideal path would be to remove all of the methods from the Step
trait except for step
and steps_between
. However, I'm having trouble finding a practical way of implementing this.
Thanks for the update!
My humble opinion on this.
I think the difficulty arises from the fact that 2 different problems are being tackled at the same time:
1- mathematical ranges
2- generic iterator steps
What about extending the current Range
structs (or implementing new ones) with a step
parameter to solve the first problem, and going on with the Step
trait for iterators?
This would allow to implement for i in (0.5 .. 15.7) * 0.2
(or any other syntax) using any (ordered) mathematical number type, and for something in some_vec_of_strings.step_by(2)
with a normal usize for the iterators.
I made a quick example based on the @Alex-PK's suggestion here.
pub struct ExclusiveRange<T, S> {
start: T,
stop: T,
step: S,
}
pub struct InclusiveRange<T, S> {
start: T,
stop: T,
step: S,
}
pub struct UnboundedRange<T, S> {
start: T,
step: S,
}
It works well for integers, but right away I wrote a failing test for floats that I'm not sure how to get around.
#[test]
fn test_excl_float_range() {
let mut iter: ExclusiveRange<f32, f32> = ExclusiveRange {
start: 0.0,
stop: 1.0,
step: 0.3,
};
assert_eq!(iter.next(), Some(0.0));
assert_eq!(iter.next(), Some(0.3));
assert_eq!(iter.next(), Some(0.6));
assert_eq!(iter.next(), Some(0.9));
assert_eq!(iter.next(), None);
}
This fails with
thread 'test::test_excl_float_range' panicked at 'assertion failed: `(left == right)` (left: `Some(0.90000004)`, right: `Some(0.9)`)', src/lib.rs:122
How would we get around floating point addition errors? Also, this obviously doesn't address the issue of negative step sizes or going backwards like (10..1)
.
@jbaum98 That's not a problem with Range, it's a problem with floats. Because many operations can lose precision, you should almost never assert that two floats are exactly equal.
Unfortunately, it doesn't look like Rust's std
has an assert_almost_eq!()
So it would be acceptable that the iterator produced 0.90000004 instead of 0.9?
It can't produce exactly 0.9 at all, ever - this value is not perfectly representable in floating point, nor is your 0.3 step.
Okay that makes sense. So it seems like the example works for the basic behavior of expanding ranges beyond integers, allowing inclusive and exclusive ranges, and allowing step_by as had been discussed so far.
Please just stabilize this. I don't care if it's a mess. It's basic functionality.
Does this adapter allows a range of f64
to be iterated in float steps?
I think that adopting something like itertools' step
into the Iterator
trait is best, because then you could do something like
my_vec.iter_mut().step_by(2).map(|x| x * x);
Basically keep it generic for everything that can be iterated.
On the topic of floating values, I think that the biggest problem is that floats don't have a well defined next value/successor. In that case, then step_by
should only work for things that implement a trait that says they have a successor. I don't know how that would work, but it's an idea.
That way step_by
won't be implemented for float-related stuff unless we define a successor
function/trait that works for floats. And I think that it's not a huge problem if we keep floats out of it, because in the case that someone needs to use something that steps by a float, they'll probably be using some kind of numeric crate already, so the crate itself could implement that in the way that it works best for it.
I think step_by
should just take a usize, like skip
and take
, and just call next
on the wrapped element n times n each next
called on the StepBy iterator (see my other comment from december)
Trying to solve both the mathematical and the stepping problem with the same solution is what keep this discussion running for nearly 2 years without a solution in sight and a basic functionality missing from the language.
@Alex-PK I believe that instead of the step_by
method on Range
that should be a method of iter
(every_nth
?), but I agree that it would be very useful and that it would cover most of the cases in which right now one would use step_by
.
Note that an efficient implementation could use nth
with the step instead of calling next
.
Should we also keep the name step_by
or change it to every_nth
for consistency with nth
?
The suggestion by @Alex-PK when combined with the optimization suggested by @ranma42 seems like it would nicely thread the needle of usable yet easy to specify and stabilize here. I'd personally like the idea of adding such an adapter and deprecating/deleting the current step_by
.
Would anyone be interested in sending a PR with an implementation of this? The name every_nth
seems reasonable to me and could live alongside step_by
for now and hopefully get stabilized relatively quickly.
FWIW, itertools has the general method as simply step
.
Oh neat! The implementation there looks perfect. I'm not 100% sure we'll be able to use the same name, though, as it'd clash with itertools and likely cause a lot of real breakage in the ecosystem.
Oh, I'm up for it. Just downloaded the rust repo, and I'll try working on it today.
I think I'm gonna make a EveryNth<I>
struct for the iterator adaptor and a every_nth
for the method.
Awesome thanks @ivandardi! Lemme know if you need any help.
Yeah, some design issues came up.
We have a few different options for this:
1. Where to start
1.1
let mut it = (0..).every_nth(1).take(3);
assert_eq!(it.next(), Some(0));
assert_eq!(it.next(), Some(2));
assert_eq!(it.next(), Some(4));
assert_eq!(it.next(), None);
1.2
let mut it = (0..).every_nth(1).take(3);
assert_eq!(it.next(), Some(1));
assert_eq!(it.next(), Some(3));
assert_eq!(it.next(), Some(5));
assert_eq!(it.next(), None);
2. Behaviour of every_nth(0)
2.1
let mut it = (0..).every_nth(0).take(3);
PANIC
2.2
let mut it = (0..).every_nth(0).take(3);
assert_eq!(it.next(), Some(0));
assert_eq!(it.next(), Some(1));
assert_eq!(it.next(), Some(2));
assert_eq!(it.next(), None);
2.3
let mut it = (0..).every_nth(0).take(3);
assert_eq!(it.next(), Some(0));
assert_eq!(it.next(), Some(0));
assert_eq!(it.next(), Some(0));
assert_eq!(it.next(), None);
Okay, @ivandardi and me are discussing this over at Discord currently and it seems like it's not even clear whether .every_nth(2)
should skip 2 elements in between or if it should move the iterator by 2 elements ahead. The former assumption is made in 1.1 and 1.2 here, while the latter is made in 2.1, 2.2 and 2.3.
We're discussing whether the sequence 0, 2, 4
should be produced by (0..).every_nth(1).take(3)
or (0..).every_nth(2).take(3)
. The first one has the semantics of "skip n", while the second is "go every nth element".
In case the first one is better, then it would be a good idea to rename every_nth
to something that better expresses what's happening, while if the second one is better, then we need to decide what happens with every_nth(0)
.
I think if it's called every_nth
it should be the same as if you called nth
in a loop.
That would mean that for item in list.iter().every_nth(4)
would iterate over every 5th item, which is extremely surprising behavior. Since this is supposed to be solving the step_by
issue, which is something you would use in for loops that step by a certain value every time, you wouldn't expect "every 4th" element to yield every 5th element.
I think that's an argument for calling this something else, because the relationship to nth
is messy.
(primarily because nth
is 0-indexed, which doesn't match most people's intuition. e.g. nth(2)
returns the third item.)
Let me see...
step
: We could call it just like the itertools crate method.
step_by
: We could use the same name of the method currently in the Step
trait and just remove the Step
trait altogether.
skip_every
: Longer name, but more descriptive.
skip_by
: Variation of the last one.
???
Below is what each of those names sounds to _me_ (reference iterator: [1, 2, 3, ...]
:
.every_nth(n)
, I would expect [n, 2n, ...]
.skip_by(n)
, I would expect [1, n+2, 2n+3, ...]
(skips n
items in-between)..step_by(n)
, I would expect [1, n+1, 2n+1, ...]
(takes n
steps)..skip_every(n)
sounds ambiguous to me. Is it only going to skip every n
th item?.take_every(n)
sounds ambiguous too. I _might_ expect it to be the same as every_nth
._EDIT:_ Clarify
Ah ok yeah the confusion with the name every_nth
makes sense to me. At this point I think I like the idea of keeping the name step_by
perhaps? It's an unstable API so it's not entirely unreasonable that we'd break some usage of it.
Couldn't we completely remove the Step trait too? I don't think it's something that will be used anytime soon, and the sooner we remove/replace it, the better. Then we could use the step_by
name and it should still work with any (0..10).step_by()
codes out there.
I was thinking, if Iterator
had another method named next_nth
which simply skips to the "next nth" item, by calling next
, n
times by default; then types can simply have their own more efficient implementation when possible. This might make it easier to implement step_by
? (Was this already considered?)
@ivandardi yeah if we leverage Iterator::nth
I think ideally we'd remove the Step
trait
@johncf are you thinking of Iterator::nth
?
Since the Step
trait already has the step
method, I thought I'd look into its implementation.
For (0..10).step(0)
, according to the current implementation, it just enters an infinite loop yielding 0
s. We'd like to avoid that.
For (0..10).step(1)
it yields 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
.
For (0..10).step(2)
it yields 0, 2, 4, 6, 8
.
So the current implementation of Step::step_by
behaves like Python's, except when the step is 0. Python throws an exception in that case. If we'd like to follow the current behavior and also keep it aligned with other languages, I think it should behave like what described above and panic when the step is 0. The current PR does exactly that.
So all that's left is to decide on the name. I've talked with some other people, and it seems they prefer step
, every
and step_by
. The pros of using step_by
is that if there's some code that uses it already, then it won't break. But the other two are perfectly valid too. every
is meant to be used as .every(2)
, which is like every_nth(2)
but shorter and easier to read and say. step
is the name that the itertools crate chose.
Maybe we could do a poll on this and let the community decide what they like best? Rust weekly comes out tomorrow, so if we can get a poll going soon, then we might be able to put a link to it on tomorrow's weekly.
I'd probably avoid step
because I think it'd break all itertools users (which would be bad!). Between every
and step_by
I'd lean towards step_by
personally, but I don't have much of a strong opinion.
To replace bad looking and not-DRY code like this ("deg" is repeated three times and it's still present in the scope after the loop end):
let mut deg = -90.0;
while deg < 90.0 {
deg += 1.5;
}
Is it possible and a good idea to use step_by() on an interval of floating point values too?
for deg in (-90.0 .. 90.0).step_by(1.5) {}
@leonardo-m The problem with that is the reason this issue is open. :)
See the implementation by @jbaum98 for a generator for your case.
I agree that some syntactic sugar would be good to have in the language for it, but it should not be step_by(), as it would lead to confusion.
I am not sure what that sugar could be, to be honest.
Maybe (n..m , s)
but this could look like a tuple with a range and a number, or (n..m)*s
(because "by" in english can mean both multiplication and step), or (n..m)/s
(to mean the range but in pieces of s size).
Edit: a simple impl could be enough, but it must be done in the std::ops:
impl<T, S> Div<S> for Range<T>
where S: Copy
{
type Output = ExclusiveRange<T, S>;
fn div(self, rhs: S) -> Self::Output {
ExclusiveRange {
start: self.start,
stop: self.end,
step: rhs,
}
}
}
for i in (0.5..12.7) / 0.23 {
// do something with i
}
Syntax sugar is sometimes nice, but in this case a Prelude or standard library thing that allows to write this is enough for me:
for deg in FPInterval(-90.0, 90.0, 1.5) {} // f64
for deg in FPInterval(-90f32, 90.0, 1.5) {} // f32
Is it possible and a good idea to use step_by() on an interval of floating point values too?
There's also the possibility of accumulating floating point error through repeated addition, depending on the specific values chosen. Only you can tell if those errors are acceptable for your case. Were I writing the code, I'd probably do some precomputation and then iterate over integers, converting to floating point repeatedly:
for deg in (0..120).map(|d| -90.0 + (d as f64 * 1.5)) {
println!("{}", deg);
}
Sadly, it's not quite as readable, and I don't know if the multiplication will have any less possibility of having error than the repeated addition (although I hope it would).
@shepmaster couldn't that be how it's implemented under the hood for floats?
how it's implemented under the hood for floats?
I'm not certain, but I don't believe so. That first example nicely divides into integers. A theoretical -10.0 to 10.0 with increments of 3.14 would not. This ends up "losing" the last value in my naive algorithm:
fn fake_step(start: f64, end: f64, step: f64) {
let range = end - start;
let n = (range / step) as usize;
for deg in (0..n).map(|n| start + (n as f64 * step)) {
println!("{}", deg);
}
}
fn main() {
fake_step(-90.0, 90.0, 1.5);
fake_step(-10.0, 10.0, 3.14);
}
Although perhaps a .ceil
"solves" the problem; but then there's a lot of magic happening behind the scenes. Something like this could be put into a crate, of course, if you are willing to accept the downsides of the implementation.
See also how others do it:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html
Yeah, linspace
is the way to go for that problem.
for i in linspace(-90f32, 90f32, 120) {
// do stuff
}
But there's crates that offer this already. I don't know if it would be a good idea to bring it into Rust core, especially when there's so many different ways to implement it.
The idea here is to make step_by
general for any Iterator
, which means it can only accept usize
s as the step, especially because a 1.5 step isn't defined for a vector iterator :P
I don't know if it would be a good idea to bring it into Rust core, especially when there's so many different ways to implement it.
Stepping on floating point interval isn't a very common operation in Rust. But I have had to do it in Rust. I think it's basic enough, and having multiple ways to do it is one of the reasons to have a standard common way to do it.
I agree that a function like fpinterval() is a topic different from the step_by.
Great to see this making progress! I like the "always take usize and figure out the math stuff later" approach đź‘Ť
Couldn't we completely remove the Step trait too?
The step trait is also used for non-StepBy
iteration of ranges today, so it probably shouldn't be completely removed right away. I think these are the only trait methods that should be removed (by
is the step amount, and is_negative
made (10..1).step_by(-1)
"work"):
fn step(&self, by: &Self) -> Option<Self>;
fn steps_between(start: &Self, end: &Self, by: &Self) -> Option<usize>;
fn is_negative(&self) -> bool;
I think that will be a big help to eventually stabilizing letting people iterate their own types in a Range
, though. (For example, dates have reasonable pred/succ operations, but .step_by(&Date::parse("2017-05-01").unwrap())
is nonsense, so yay for deleting it!)
Now there are two StepBy
structs in the docs after #41439 has been merged.
Yeah, now we need to make another PR to remove that deprecated StepBy
struct. I can work on that no problem. I just need to know if it's ok to completely remove the Step
trait or if we should keep it around.
@photino It's already being corrected in #42110
Tracking for Step
has moved to https://github.com/rust-lang/rust/issues/42168, so can be removed from the list in the OP.
In Servo, we rely on Range::step_by
to have a single type for normal ranges and reversed ones. How do I do that now? I'm pretty sure we don't want to box iterators in the middle of nowhere in the layout crate.
Cc @pcwalton
@nox I think that if you do my_vec.iter().rev().step_by(2)
it should work as intended and only have one single type StepBy
.
@nox While Range::step_by certainly worked there, the fact that step_by.abs() == 1
makes me feel like it was never a perfect match for the semantics you need. And that it forced a manual range reversal with extra -
1
s makes it seem awkward at best.
I think it'd be clearer with something more focused on "single type for normal ranges and reversed ones":
let fragment_indices = (range.begin().get()..range.end().get()).rev_if(reverse);
Proof-of-concept: https://play.rust-lang.org/?gist=0c7dfce47adf5fc5b2683e8561acd869
So, I still think step_by(0) = repeat
makes a lot of sense. I even thought of a language example that allows making arbitrary ranges infinite: Haskell ([start, start, ..]
), making it “consistent” and all that.
I'm against step_by(0)
meaning repeat:
StepBy
being ExactSizedIterator
everiter::repeat
nor Iterator::cycle
. I'd rather people choose repeat(it.next().unwrap())
or it.take(1).cycle()
to say what they mean.Basically, the "produces an infinite iterator" behaviour feels too qualitatively different to me from the "skip a subset of the items" behaviour for it to make sense in the same method & type.
Note that there's even a Clippy lint against using the existing .step_by(0)
: https://github.com/Manishearth/rust-clippy/wiki#range_step_by_zero
Hi there,
I do not understand the rational behind the decision of deprecating Range::step_by
. Even though the new generic Iterator::step_by
does provide an equivalent apparent functionality, it is actually a huge loss in performances.
Consider the following program:
for i in (1..1000000000).step_by(100000000) {
println!("{}", i)
}
Even though there is almost nothing to do (i.e., printing 10 integers), the new implementation has to loop over all the integers between 1 and 1000000000, which takes a lot of time. The previous implementation actually did what I expected (i.e., incrementing by 100000000). Note that I am actually measuring this performance issue in release mode.
Now, you could tell me that I could have written something like:
for i in (0..).map(|x| x*100000000+1).take_while(|x| x < 1000000000) {
println!("{}", i)
}
But we are still relying on the optimizer's ability to transform multiplications into successive additions. Are we sure that LLVM will always perform such an optimization?
Moreover this is much more verbose, and clearly less understandable than the corresponding code in C:
for(int i = 1; i < 1000000000; i += 100000000) { ... }
This is "just° a lack of specialisation. There's nothing preventing us from
adding just as optimal code path for ranges as before.
On Jun 7, 2017 13:57, "Jacques-Henri Jourdan" notifications@github.com
wrote:
Hi there,
I do not understand the rational behind the decision of deprecating
Range::step_by. Even though the new generic Iterator::step_by does
provide an equivalent apparent functionality, it is actually a huge loss in
performances.Consider the following program:
for i in (1..1000000000).step_by(100000000) {
println!("{}", i)
}Even though there is almost nothing to do (i.e., printing 10 integers),
the new implementation has to loop over all the integers between 1 and
1000000000, which takes a lot of time. The previous implementation actually
did what I expected (i.e., incrementing by 100000000). Note that I am
actually measuring this performance issue in release mode.Now, you could tell me that I could have written something like:
for i in (0..).map(|x| x*100000000+1).take_while(|x| x < 1000000000) {
println!("{}", i)
}But we are still relying on the optimizer's ability to transform
multiplications into successive additions. Are we sure that LLVM will
always perform such an optimization?Moreover this is much more verbose, and clearly less understandable than
the corresponding code in C:for(int i = 1; i < 1000000000; i += 100000000) { ... }
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/rust-lang/rust/issues/27741#issuecomment-306761043,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AApc0obsmZDr3uCFyvLwY5aoTmNYSmmFks5sBoIJgaJpZM4FqaVv
.
Oh, sorry, I did not know that specialization was possible in Rust (after all, that's an unstable feature!)
In any case, is specialization for step_by
actually planned? It requires a few changes to the current interface, since the return type will no longer be always StepBy
.
I agree that such specialization is not an optional part of this new step_by, it's an essential part.
I think it doesn't need real specialization, just add a custom nth
in impl Iterator for Range
.
I am trying to use the new step_by()
of std::trait::iterator
:
https://doc.servo.org/core/iter/trait.Iterator.html#method.step_by
instead of step_by
of Range
:
https://doc.servo.org/core/ops/struct.Range.html#method.step_by
This is my code:
error: use of unstable library feature 'step_by': recent addition (see issue #27741)
--> /home/jklepatch/os/servo/components/layout/inline.rs:970:66
|
970 | (range.end().get() - 1..range.begin().get() - 1).step_by(-1);
| ^^^^^^^
|
= help: add #![feature(step_by)] to the crate attributes to enable
I have also replaced #![feature(step_by)]
by #![feature(iterator_step_by)]
at the top of the crate. Finally, I have tried with and without a statement use std::iter:iterator;
(might not be necessary, I am new to rust and dont understand very well whats going on).
Is this the normal behavior?
I’ve submitted #43077 to override nth
. I think #42310 and #43012 should not have landed without this, but oh well.
I’d like to add Iterator::nth_back
and use it in iter::Rev<_>::nth
in order to fix the same performance issue in (a..b).rev().step_by(n)
. What do you all think?
Follow-up PR to redesign the Step
trait and some other changes to related iterator impls for ranges: https://github.com/rust-lang/rust/pull/43127
Are there any open questions for the current Iterator::step_by
? These original ones seem addressed. The question of negative steps => defined out by using the Iterator
interface.
Currently step_by is probably slow ( https://github.com/rust-lang/rust/issues/43064 ), so I suggest to stabilize it only after (and if) it becomes fast enough.
Regarding the design, see also:
https://users.rust-lang.org/t/some-more-about-step-by-troubles/11975
@leonardo-m Good points, I think those are open questions. After having a go at it, I can see that trying to special case .step_by()
for numeric ranges with a usize step size runs into all the unfun stuff we want to avoid in Rust. Mixing integer sizes just leads to mismatches in several ways.
.step_by(usize)
appropriate for numeric ranges?I think step_by
is good, it's just not made for ranges. It has the same behavior as nth(). It's perfectly sensible that it behaves the way it does today.
We could simply make *By
range variants, like RangeBy
, RangeFromBy
, etc. Maybe a RangeByArgument
trait to match and make it a supertrait of RangeByArgument
.
Optionally, the regular Range
variants with an *By
counterpart could have a by
method like:
fn by(self, Idx) -> RangeBy
(on Range
) so you can do (1..100).by(2)
.
An ergonomical syntax can be decided at a later point, but at least functionality will be there.
I agree that Iterator::step_by
is good, regardless of whether it is the solution for ranges. Let’s stabilize it.
@rfcbot fcp merge
The tracking issue for range (stepped) iterator is https://github.com/rust-lang/rust/issues/42168
Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged teams:
No concerns currently listed.
Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
Is there an RFC or post or something that documents the behavior being proposed for stabilization?
I find it extremely hard to extract from this issue what is exactly being proposed for stabilization, what is the rationale, alternatives that have been explored, etc.
I am not suggesting that this needs an RFC, but at least a summary of the work done and why what we have now is better and suitable for stabilization that what we had before. As a maintainer of collections implement step_by I would find such information useful to update my collections accordingly.
@gnzlbg Yeah, this issue has a lot of history and can be hard to read now. Iterator::step_by(self, n: usize) ->Â StepBy<Self>
returns a new iterator that is based on the inner iterator’s nth
method:
https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.step_by
Creates an iterator starting at the same point, but stepping by
the given amount at each iteration.Note that it will always return the first element of the iterator,
regardless of the step given.Panics
The method will panic if the given step is
0
.Examples
Basic usage:
#![feature(iterator_step_by)] let a = [0, 1, 2, 3, 4, 5]; let mut iter = a.into_iter().step_by(2); assert_eq!(iter.next(), Some(&0)); assert_eq!(iter.next(), Some(&2)); assert_eq!(iter.next(), Some(&4)); assert_eq!(iter.next(), None);
@SimonSapin thanks for the summary, that is really helpful.
I agree that step_by is good for iterators in general, but we risk a situation where we have to advise not using .step_by on ranges, which is going to be uphill ergonomically. I had hopes .step_by(usize)
would be straightforward for ranges, but it's a bit complicated.
With the right tooling and right method name, a “increment type matching” stepping adaptor for ranges might still be easy to find for users.
Problems with step_by
for ranges include such things as:
Range<i16>
. We figure out what an "equivalent" step that fits inside an i16 is and adjust the step accordingly.Could two different named methods ("step" and step_by"?) solve this issue, having one for ranges that takes an usize, and one for intervals that takes the same type as the interval?
Can't we have the step
parameter be an associated type? Then on the generic Iterator
we can have it be usize
and for ranges we override it with Idx
? Like
trait Iterator {
type Item;
type Idx = usize;
/// ...
}
impl Iterator for Range<Idx=u16> {
type Idx = Self::Idx;
/// ...
}
Or something similar is this doesn't work out or if it's not desirable to add a new associated type to the Iterator
trait.
@ivandardi If we did that Range<u16>
wouldn't implement Iterator<Item = u16>
anymore.
@bluss How about renaming step_by
to every
or every_nth
since it relies on nth
. I agree the word "step" could bring confusion. The method is valuable on its own though even if it's not the solution for number ranges.
I’m ok with every_nth
/ EveryNth
.
Semantics and naming choices like every_nth
were discussed in PR #41439. I think it would be confusing to name this similar to nth
, since they're off by one -- step_by(x)
is the same as next()
then repeating nth(x-1)
.
Then how about every
.
:bell: This is now entering its final comment period, as per the review above. :bell:
The final comment period is now complete.
I looked at overriding StepBy::try_fold
to see if it would help. It does seem to help, but the biggest thing for me was that it clarified what made it hard for LLVM: the inner loop for (a..b).step_by(7).map(|x| x ^ (x - 1)).sum()
is:
mov r8d, eax
lea r9d, [rcx - 1]
lea eax, [rcx - 2]
xor eax, r9d
add eax, r8d
mov r9d, ecx
add r9d, 6
setb r8b
cmp r9d, edx
jae .LBB1_6
add ecx, 7
test r8b, r8b
je .LBB1_4
which doesn't simplify further because of the slightly-awkward semantics of nth
. It needs to return the 6th item and step by 7. (So the Range iterator is doing both Step::add_usize(6)
and Step::add_one()
.)
I think this'd be perfectly efficient in terms of a "return next and also step forward" method, since the Range iterator for that would just return start
and replace it with start.add_usize(7)
. I dunno if there'd be an appetite for such a thing, though...
Could we amend a note to step_by
allowing us to use the pre-stepping semantics that @scottmcm spoke of? Given that iteration can have side-effects, I fear we may be overconstraining ourselves with regards to range iterators. While the current side-effect free integer ranges can deal with this, RangeFrom
would already be changing semantics with pre-stepping because of overflow. It could affect additional situations when the Step
trait will be stabilized.
While specializations of StepBy<Range*<_>>
catch the most common case performance issues, they miss all the more complicated ones where the Range
is hidden within an adapter.
Most helpful comment
Please just stabilize this. I don't care if it's a mess. It's basic functionality.