Rfcs: Trim methods on slices

Created on 23 Sep 2018 · 13 Comments · Source: rust-lang/rfcs

/// Trims this slice from the left.
fn trim_left_matches<F: Fn(&T) -> bool>(&self, f: F) -> &[T] {
    let mut res = self;
    // Drop leading elements for as long as the predicate matches.
    while !res.is_empty() && f(&res[0]) {
        res = &res[1..];
    }
    res
}

/// Trims this slice from the right.
fn trim_right_matches<F: Fn(&T) -> bool>(&self, f: F) -> &[T] {
    let mut res = self;
    // Drop trailing elements for as long as the predicate matches.
    while !res.is_empty() && f(&res[res.len() - 1]) {
        res = &res[..res.len() - 1];
    }
    res
}

(and so on)

Basically, this turns &["", "", "", "foo", ""] into &["foo", ""], &["", "foo", "", "", ""] into &["", "foo"], etc., depending on which method you call.
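To make the two examples above concrete, here is a minimal sketch written as free functions rather than inherent methods, since `[T]` cannot be extended directly outside the standard library. The function names mirror the proposal, and the predicate taking `&T` is an assumption on my part, not an agreed design.

```rust
/// Hypothetical free-function form of the proposed `trim_left_matches`.
fn trim_left_matches<'a, T, F>(slice: &'a [T], f: F) -> &'a [T]
where
    F: Fn(&T) -> bool,
{
    let mut res = slice;
    // Peel elements off the front while the predicate matches.
    while let Some((first, rest)) = res.split_first() {
        if !f(first) {
            break;
        }
        res = rest;
    }
    res
}

/// Hypothetical free-function form of the proposed `trim_right_matches`.
fn trim_right_matches<'a, T, F>(slice: &'a [T], f: F) -> &'a [T]
where
    F: Fn(&T) -> bool,
{
    let mut res = slice;
    // Peel elements off the back while the predicate matches.
    while let Some((last, rest)) = res.split_last() {
        if !f(last) {
            break;
        }
        res = rest;
    }
    res
}
```

Calling `trim_left_matches(&v, |s| s.is_empty())` on the first slice above and `trim_right_matches` on the second should reproduce the two results described.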

T-libs

All 13 comments

These sound like noise, since you can implement them trivially with code like s.split_at_mut(s.len() - s.iter().rev().filter(|x| x.len()==0).count()).0.

That's even less readable. :/

Noise is having the same code snippet over and over again and not having it in a well-documented standalone function.

Also, I don't think that code of yours actually works. You probably meant to use take_while and split_at.
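For reference, a sketch of what the `take_while` plus `split_at` variant might look like. The one-liner with `filter` counts every empty element in the slice, not just the trailing run, so on `["", "foo", ""]` it would cut off too much. The function name `trim_end_empty` is hypothetical.

```rust
/// Hypothetical helper: drops the trailing run of empty strings.
fn trim_end_empty<'a, 'b>(s: &'a [&'b str]) -> &'a [&'b str] {
    // `take_while` stops at the first non-empty element seen from the back,
    // so only the trailing run is counted (at most `s.len()` elements).
    let trailing = s.iter().rev().take_while(|x| x.is_empty()).count();
    s.split_at(s.len() - trailing).0
}
```

This still has to be rewritten by hand for every element type and predicate, which is the repetition being complained about.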

We have .skip_while() and .take_while() for iterators. Aren't those enough?

.iter().skip_while(|x| x == "").take_while(|x| x != "")

No - that doesn't work like a trim method.

And they're not analogous to the string methods.
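A small sketch of the difference, assuming a slice of `&str` and using a hypothetical wrapper `skip_then_take` around the iterator chain quoted above. The chain stops at the first interior empty element and yields an iterator instead of a subslice, so interior elements that a trim would keep are lost.

```rust
/// Hypothetical wrapper around the `skip_while` / `take_while` chain.
fn skip_then_take<'a>(s: &[&'a str]) -> Vec<&'a str> {
    s.iter()
        .copied()
        // Skip the leading run of empty strings.
        .skip_while(|x| x.is_empty())
        // Stop at the first empty string after that, even an interior one.
        .take_while(|x| !x.is_empty())
        .collect()
}
```

On `["", "a", "", "b", ""]` this yields only `"a"`, whereas a trim would keep `["a", "", "b"]`.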

@SoniEx2 Can you give an example where this would be useful (except for Strings)?

When you have a slice and don't want to allocate.

I ended up needing something like this for c-string parsing. I have a sequence of bytes and want to return the prefix containing the c-string data (not including the null terminator).

But then I realized, you can use split to do this:

fn trim_c_string(s: &[u8]) -> &[u8] {
    s.split(|&b| b == 0).next().unwrap_or(&[])
}

However, unlike the naive loop implementation, this implementation cannot eliminate the bounds check:

pub fn fast_trim_c_string(s: &[u8]) -> &[u8] {
    for i in 0..s.len() {
        if s[i] == 0 {
            return s.split_at(i).0;
        }
    }
    s
}
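A third way to write the same thing, sketched here with `Iterator::position`. The name `trim_c_string_position` is hypothetical, and I have not checked the generated code, so no claim is made about how its bounds checks compare with the two versions above.

```rust
/// Hypothetical variant: returns the bytes before the first NUL,
/// or the whole slice if there is no NUL.
fn trim_c_string_position(s: &[u8]) -> &[u8] {
    match s.iter().position(|&b| b == 0) {
        // `i` is a valid index into `s`, so `..i` is always in range.
        Some(i) => &s[..i],
        None => s,
    }
}
```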

It's nice to have trim methods on str, but in the project I am working on right now I use &[char] slices instead of &str, because I need indexed access to characters and slicing of strings by character position, which &str does not support since it's UTF-8. It is disturbing that str has a .trim() method and a generic [T] slice does not. It would be really nice if this issue were resolved, all the more so since it is that easy to implement.

A sample implementation looks like this, though I am sure it is suboptimal.

fn trim<P>(&self, mut predicate: P) -> &[T]
where
    P: FnMut(&T) -> bool,
{
    let mut left = 0;
    let mut right = self.len();

    let mut iter = self.iter();

    while let Some(e) = iter.next() {
        if predicate(e) {
            left += 1
        } else {
            break;
        }
    }

    while let Some(e) = iter.next_back() {
        if predicate(e) {
            right -= 1
        } else {
            break;
        }
    }

    &self[left..right]
}
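Until something like this exists in the standard library, the same idea can be tried out through an extension trait. This is only a sketch: the trait name `SliceTrimExt` and the method name `trim_where` are made up for illustration, and the body uses `take_while` counts instead of the manual loops above.

```rust
/// Hypothetical extension trait adding a predicate-based trim to slices.
trait SliceTrimExt<T> {
    fn trim_where<P: FnMut(&T) -> bool>(&self, predicate: P) -> &[T];
}

impl<T> SliceTrimExt<T> for [T] {
    fn trim_where<P: FnMut(&T) -> bool>(&self, mut predicate: P) -> &[T] {
        // Length of the leading run of matching elements.
        let left = self.iter().take_while(|e| predicate(*e)).count();
        // Length of the trailing run, counted only in the remainder so the
        // two runs cannot overlap when every element matches.
        let right = self[left..]
            .iter()
            .rev()
            .take_while(|e| predicate(*e))
            .count();
        &self[left..self.len() - right]
    }
}
```

With the trait in scope, a `Vec<char>` or `&[char]` can be trimmed with `chars.trim_where(|c| c.is_whitespace())` without allocating.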

We prefer split_* methods for slices, so as to retain access to underlying subslices, so I still think trim_* methods add noise. We could discuss some split_change(f) that does split_inclusive(|x| changed(f(x))) where

let mut previous = true;
let mut changed = |x| if previous == x { false } else { previous = x; true };

so trim is split_change(f).skip(1).next().unwrap_or(&[]). We're maybe better off adding roughly this changed state machine somewhere like core::iter though, not sure.

Perhaps a more useful trim would remove elements equal to Default::default().

@serid you should really be using &[&str] instead of &[char] because &[char] is useless.

For information, there is a new-ish unstable API, split_inclusive, on slices that would help in implementing such a thing. (The normal split API doesn't include the "split marker" in either of the resulting subslices. More info here: https://github.com/rust-lang/rust/pull/67330) However, I neglected to make a tracking issue, so there isn't a direct path toward stabilization at the moment. I'll try to scrape together some time to create a tracking issue next weekend!

I have a use case: I'm currently working on updates to BufWriter, and specifically its implementation of write_vectored, as part of https://github.com/rust-lang/rust/issues/78551. write_vectored takes an &[IoSlice], and it would be very useful to be able to trim empty slices from both ends. This would allow me to forward the trimmed list of slices to the inner write_vectored method, and also to specialize the case where we received exactly one non-empty slice. These cases aren't served by iterator methods, because I need to transform slices into smaller slices to process and forward as necessary.
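A rough sketch of what that trimming could look like today without a dedicated method. The helper name `trim_empty` is hypothetical, and this is not taken from the BufWriter changes themselves.

```rust
use std::io::IoSlice;

/// Hypothetical helper: drops empty buffers from both ends of the list
/// while keeping any empty buffers in the middle.
fn trim_empty<'a, 'b>(bufs: &'a [IoSlice<'b>]) -> &'a [IoSlice<'b>] {
    // Index of the first non-empty buffer, or `len` if all are empty.
    let start = bufs
        .iter()
        .position(|b| !b.is_empty())
        .unwrap_or(bufs.len());
    // One past the last non-empty buffer; falls back to `start` so the
    // range is empty when every buffer is empty.
    let end = bufs
        .iter()
        .rposition(|b| !b.is_empty())
        .map_or(start, |i| i + 1);
    &bufs[start..end]
}
```

The result is still a `&[IoSlice]`, so it can be passed on to another `write_vectored` call, and a result of length 1 identifies the single-buffer case.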
