Runtime: Proposal: Stream.HasLength property

Created on 15 Oct 2020  路  11Comments  路  Source: dotnet/runtime

Description

There currently is no reliable way to know whether Stream.Length can be used, as checking CanSeek only guarantees that Length will work if it returns true, but gives no guarantees if it returns false: there can be cases where CanSeek is false but Length works just fine. This means that the only way to check the Length is to use a try/catch block, which is not ideal. Being able to access the length allows for a number of optimizations, such as renting a buffer the size of the whole stream, to read its contents in a single pass without wasting cycles expanding the buffer to write to, and making more copies and reads.

Proposed API:

namespace System.IO
{
    public abstract class Stream
    { 
        public abstract bool HasLength { get; }
    }
}

Original description

The Stream returned by HttpContent.ReadAsStreamAsync sometimes has CanSeek set to false, but accessing the Length property works just fine. This seems to directly contradict the docs (here):

If a class derived from Stream does not support seeking, calls to Length, SetLength, Position, and Seek throw a NotSupportedException.

This remark implies that !CanSeek ==> Length throws, but this is not always true in practice. To work around this, I ended up using an ugly try/catch block in the System.Text.Json-based JSON serializer in the refit library: here. As mentioned above, not really sure how to classify this issue between a bug report and a feature proposal, as it depends on whether or not this is intended behavior. As in, I would say there are two possibilities here:

  • If the remarks in the docs are correct and !CanSeek ==> Length throws, then it seems like there is a bug in HttpContent.
  • If the remarks are not correct, then:

    • We should update them

    • I would ask to add a new Stream.HasLength property (this was actually @tannergooding's idea), that will allow developers to just check whether Length can be accessed, regardless of whether or not the whole stream is actually seekable. This is particularly useful whenever you'd like to just allocate a buffer to read the whole stream in memory in one go, without having to use a buffer writer which will likely result in less efficient code due to buffer expansions and copy operations.

Related issues/PRs

Configuration

  • OS: Windows 10
  • Version: .NET Core 3.1.2
api-suggestion area-System.IO

Most helpful comment

which means that it would have the exact same issue for any existing stream that returns valid length even when CanSeek is false until (if) that stream gets around to overriding the new virtual.

It could also do a try/catch and return `true if the length is not negative, not zero, or if it succeeds at all. Which is more correct, just slow for the default implementation.

All 11 comments

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Instead of calls to Length, SetLength, Position, and Seek throw a NotSupportedException, the docs should really say calls to Length, SetLength, Position, and Seek may throw a NotSupportedException. In other words, CanSeek is there to govern whether you can expect Length/SetLength/Position/Seek to work: if it returns true, they should work, but if it returns false, there aren't any guarantees made one way or the other.

Thank you for clarifying, I'll go open a PR in the docs repo then! 馃槃

Do you reckon a separate Stream.HasLength property would be a reasonable API to add?
Should I just edit this issue to turn it into a proposal for that?

Do you reckon a separate Stream.HasLength property would be a reasonable API to add?

It's certainly fine to have a discussion about, and this issue can be used for that. We do want to be very careful about any new virtuals we add to Stream and that they really add enough value to be worth their weight, and need to think through what the default implementations would be. For example, Stream.HasLength's base implementation would probably need to be HasLength => CanSeek, which means that it would have the exact same issue for any existing stream that returns valid length even when CanSeek is false until (if) that stream gets around to overriding the new virtual.

For example, Stream.HasLength's base implementation would probably need to be HasLength => CanSeek, which means that it would have the exact same issue for any existing stream that returns valid length even when CanSeek is false until (if) that stream gets around to overriding the new virtual.

Right, yeah that'd be inconvenient 馃
I would imagine at least all the BCL streams would support it properly from the start, but yeah that's a very good point.
I'll update the title/description of the issue and leave it open for future discussion then, thank you for your time Stephen! 馃槉

which means that it would have the exact same issue for any existing stream that returns valid length even when CanSeek is false until (if) that stream gets around to overriding the new virtual.

It could also do a try/catch and return `true if the length is not negative, not zero, or if it succeeds at all. Which is more correct, just slow for the default implementation.

Related: Length == 0 means "unknown length" for some streams today.

If you run this on Linux today:

var fileStream = new FileStream("/proc/self/stat", FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 1, useAsync: false);
Console.WriteLine(fileStream.CanSeek);
Console.WriteLine(fileStream.Length);
Console.WriteLine(File.ReadAllBytes("/proc/self/stat").Length);

The output is:

0
324

Would this API account for this?

@jkotas the main description is accounting for the case you shared:

_CanSeek only guarantees that Length will work if it returns true_

Not only CanSeek returned true, but also Length didn't throw an exception, and returned a valid value, zero, because it's a special file. HasLength should return true for that case.

Right, it means that there is no good way to provide base implementation that works well. How are people calling this API going to figure out whether they are calling base implementation that may return bogus result; or actual implementation that will be correct?

FWIW, I do not think that the proposal in its current form is ready for review.

I agree. As it stands, we shouldn't add this.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Timovzl picture Timovzl  路  3Comments

matty-hall picture matty-hall  路  3Comments

aggieben picture aggieben  路  3Comments

omariom picture omariom  路  3Comments

jkotas picture jkotas  路  3Comments