go version)?1.12.1
Yes
go env)?linux/amd64
Using strings.ToLower on strings in upper or lower case:
https://play.golang.org/p/eyPCCj_zCtF
As per the documentation (https://github.com/golang/go/blob/master/src/strings/strings.go#L583), have the same behaviour for both cases: getting a new copy of the string
Strings with only lower case do not get copied, the original string is kept
This seems to be a result of an optimisation, https://github.com/golang/go/commit/13cfb15cb18a8c0c31212c302175a4cb4c050155, and in particular the fully optimized path of lower case ASCII: https://github.com/golang/go/blob/master/src/strings/strings.go#L597
As string are immutable, it doesn't seem to be a huge issue (except that the documentation is wrong), but as far as I understand this can result in pseudo memory leaks if one relies (as per the documentation) on ToLower to do a copy of small slices of larger strings (and only keeping references to the lower case slice)
Nice catch. Although the commit you pointed to is for the ToUpper one.
My vote is to document this behavior. We don't need to unnecessarily copy the string when it's not required. Even Map has a fast path that returns the original string instead of a copy. https://github.com/golang/go/blob/master/src/strings/strings.go#L497
@bradfitz @dsnet
At the language level, there's no difference between a string and a copy of that string. They may retain different amounts of memory for their backing store, but memory usage is not a language-level concept.
So I don' think it warrants changing the docs for strings.ToLower. I'm sure there are 10 other functions that do similar optimizations. We don't want people using strings.ToLower because of its copying side-effect.
That said, we understand that it's occasionally necessary to copy strings to free a large backing store. We've been musing about a way to do that. I think we probably need a runtime function which guarantees a new backing store (or knows that the current one is small enough).
Currently, string([]byte(s)) makes a new string. Probably not wise to rely on that behavior.
Makes sense. But the OP has a point. If the docs say that a function returns a copy, but it actually doesn't in some cases, the the docs are _technically incorrect_.
We don't want people using strings.ToLower because of its copying side-effect.
Agree, but this needs to be documented somewhere as a general pointer. Perhaps in the package overview or somewhere else.
Maybe the docs of strings.ToLower is copied from bytes.ToLower. ;D
Maybe the docs of strings.ToLower is copied from bytes.ToLower. ;D
Yes, I think that's the case. We should remove "a copy" from the strings version, as it doesn't mean anything.
Change https://golang.org/cl/171735 mentions this issue: strings: documenting the case when ToUpper/ToLower can return original string
Most helpful comment
At the language level, there's no difference between a string and a copy of that string. They may retain different amounts of memory for their backing store, but memory usage is not a language-level concept.
So I don' think it warrants changing the docs for
strings.ToLower. I'm sure there are 10 other functions that do similar optimizations. We don't want people usingstrings.ToLowerbecause of its copying side-effect.That said, we understand that it's occasionally necessary to copy strings to free a large backing store. We've been musing about a way to do that. I think we probably need a runtime function which guarantees a new backing store (or knows that the current one is small enough).
Currently,
string([]byte(s))makes a new string. Probably not wise to rely on that behavior.