Go: strings: ToLower not returning a copy for strings already in lower case

Created on 21 Mar 2019  路  6Comments  路  Source: golang/go

What version of Go are you using (go version)?

1.12.1

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

linux/amd64

What did you do?

Using strings.ToLower on strings in upper or lower case:
https://play.golang.org/p/eyPCCj_zCtF

What did you expect to see?

As per the documentation (https://github.com/golang/go/blob/master/src/strings/strings.go#L583), have the same behaviour for both cases: getting a new copy of the string

What did you see instead?

Strings with only lower case do not get copied, the original string is kept

Comments

This seems to be a result of an optimisation, https://github.com/golang/go/commit/13cfb15cb18a8c0c31212c302175a4cb4c050155, and in particular the fully optimized path of lower case ASCII: https://github.com/golang/go/blob/master/src/strings/strings.go#L597

As string are immutable, it doesn't seem to be a huge issue (except that the documentation is wrong), but as far as I understand this can result in pseudo memory leaks if one relies (as per the documentation) on ToLower to do a copy of small slices of larger strings (and only keeping references to the lower case slice)

Documentation FrozenDueToAge NeedsFix

Most helpful comment

At the language level, there's no difference between a string and a copy of that string. They may retain different amounts of memory for their backing store, but memory usage is not a language-level concept.

So I don' think it warrants changing the docs for strings.ToLower. I'm sure there are 10 other functions that do similar optimizations. We don't want people using strings.ToLower because of its copying side-effect.

That said, we understand that it's occasionally necessary to copy strings to free a large backing store. We've been musing about a way to do that. I think we probably need a runtime function which guarantees a new backing store (or knows that the current one is small enough).

Currently, string([]byte(s)) makes a new string. Probably not wise to rely on that behavior.

All 6 comments

Nice catch. Although the commit you pointed to is for the ToUpper one.

My vote is to document this behavior. We don't need to unnecessarily copy the string when it's not required. Even Map has a fast path that returns the original string instead of a copy. https://github.com/golang/go/blob/master/src/strings/strings.go#L497

@bradfitz @dsnet

At the language level, there's no difference between a string and a copy of that string. They may retain different amounts of memory for their backing store, but memory usage is not a language-level concept.

So I don' think it warrants changing the docs for strings.ToLower. I'm sure there are 10 other functions that do similar optimizations. We don't want people using strings.ToLower because of its copying side-effect.

That said, we understand that it's occasionally necessary to copy strings to free a large backing store. We've been musing about a way to do that. I think we probably need a runtime function which guarantees a new backing store (or knows that the current one is small enough).

Currently, string([]byte(s)) makes a new string. Probably not wise to rely on that behavior.

Makes sense. But the OP has a point. If the docs say that a function returns a copy, but it actually doesn't in some cases, the the docs are _technically incorrect_.

We don't want people using strings.ToLower because of its copying side-effect.

Agree, but this needs to be documented somewhere as a general pointer. Perhaps in the package overview or somewhere else.

Maybe the docs of strings.ToLower is copied from bytes.ToLower. ;D

Maybe the docs of strings.ToLower is copied from bytes.ToLower. ;D

Yes, I think that's the case. We should remove "a copy" from the strings version, as it doesn't mean anything.

Change https://golang.org/cl/171735 mentions this issue: strings: documenting the case when ToUpper/ToLower can return original string

Was this page helpful?
0 / 5 - 0 ratings