Julia: deprecate `String(::IOBuffer)`

Created on 18 Apr 2017  路  21Comments  路  Source: JuliaLang/julia

This constructor method does not seem to be necessary (left over from the bytestring function). It could be seekstart(b); readstring(b) or String(take!(b)).

Actually, String(::IOBuffer) is a bit of a strange function since it gives you a copy of the data in the buffer without affecting the stream state in any way. Does anybody use this?

O deprecation strings

Most helpful comment

Furthermore, although unfortunate, I think it's inaccurate to assume that every IO will contain UTF-8 data. There are still places today where ISO 8859-1 or UTF-16 are used, and having a separate encoding-aware readstring is useful in those cases.

(This is not relevant to IOBuffer, of course, which only contains UTF-8 data by definition.)

All 21 comments

I'm using String(take!(b)) a few times in Cairo.jl runtests, so i'd guess there are other who test if the output to an io stream contains something.

I use String(take!(b)) because the previous version (takebuf_string) was deprecated.

same reason.

BTW, the current readstring is so simple that it could even be deprecated in favor of String(read(b)). We could deprecate String(b::IOBuffer) in favor of String(take!(b)) rather than adding a new readstring method.

Sorry I wasn't clear. There's nothing wrong with String(take!(b)); I was wondering if anybody uses the String(::IOBuffer) method I want to deprecate.

readstring could now avoid allocating an intermediate vector, so that's a reason to keep it. String(read(f)) could end up copying all of the data. In fact it seems to do so in some cases now, which should be fixed!

Furthermore, although unfortunate, I think it's inaccurate to assume that every IO will contain UTF-8 data. There are still places today where ISO 8859-1 or UTF-16 are used, and having a separate encoding-aware readstring is useful in those cases.

(This is not relevant to IOBuffer, of course, which only contains UTF-8 data by definition.)

Furthermore, although unfortunate, I think it's inaccurate to assume that every IO will contain UTF-8 data. There are still places today where ISO 8859-1 or UTF-16 are used, and having a separate encoding-aware readstring is useful in those cases.

Yes, but I figured String(read(b), enc"UTF-16") would be enough in that case.

But indeed if readstring can avoid a copy it's a good reason to keep it.

IOBuffer can contain arbitrary binary data, but String is by definition UTF-8, so it's a good point that the String constructor would not be a good way to read a non-UTF-8 string.

IOBuffer can contain arbitrary binary data, but String is by definition UTF-8, so it's a good point that the String constructor would not be a good way to read a non-UTF-8 string.

As long as the constructor converts the buffer to UTF-8, I don't see the problem. It could also be interesting to be able to create an encoded string in a custom type like EncodedString{:UTF16} (cf. discussions at https://github.com/nalimilan/StringEncodings.jl/pull/9), but that would use a separate constructor.

Yes, I guess that's true. Actually readstring could also be replaced with read(io, String), which would be more consistent with our existing API. An interface like StringType(read(io), ...) should be avoided since it will require copying data in general.

Resolved: will deprecate this to String(take!(copy(iobuf))).

Would it make sense to have take(iobuf) = take!(copy(iobuf))? That would allow the less verbose String(take(iobuf)).

take is deprecated, but could be reintroduced, I suppose?

Most of the usage I saw can just use take! fwiw....

The copy is just there in case the I/O buffer needs to continue to work, which it often doesn't.

I know.... I just mean that most of the time it's not needed so unless it's clear that a lot of the code actually need the copy we don't necessarily need to introduce take.

My encounter with IOBuffer is mostly in the repl code, where it's frequent to need String (::IOBuffer); so the new syntax doesn't only seem verbose, but conceptually more involved than necessary for "give me the string representation of this buffer": make a copy of something and destroy this copy to transform it into a String. But a special purpose private function can also be created where needed (still, I don't see what would be wrong with convert(::Type{String}, ::IOBuffer)).

The problem is that String(::IOBuffer) always copied the data, which is often not necessary. As Yichao says, very often you want just take!, which is more efficient.

So is the plan to reintroduce String(io::IOBuffer) after deprecation to mean String(take!(io))? in this case, String(copy(io)) would be as good as an hypothetical String(take(io)).
Otherwise, convert(String, io) would also work no? convert usually is not destructive of its second argument.

This should not be defined as convert. We don't want this to happen on assignment.

String can't mean String(take!(...)), because there would be a hidden !.

I'm not sure what to make of this; I see why the function is useful, but it's pretty ad-hoc. For example there is no Vector{UInt8}(::IOBuffer). Should we just continue to have the special case?

Was this page helpful?
0 / 5 - 0 ratings