Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
go1.11beta1
Does this issue reproduce with the latest release?
Yes, with the latest beta.
What operating system and processor architecture are you using (go env)?
GOOS="linux"
GOARCH="amd64"
What did you do?
Ran all tests from a closed-source monorepo with go1.11beta1; one test fails with the new beta.
The test itself is fairly large, but the core of the failure is that a config struct that holds compiled regexps (among other things) gets marshaled to JSON and unmarshaled back, and reflect.DeepEqual is used to check that the result equals the original.
Easiest way I could find to reproduce: https://play.golang.org/p/5WB8WJbQG15
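(A minimal sketch along the lines of the playground reproducer; the config struct, field name, and pattern are illustrative, not taken from the actual test:)

package main

import (
    "fmt"
    "reflect"
    "regexp"
)

type config struct {
    Pattern *regexp.Regexp
}

func main() {
    a := config{Pattern: regexp.MustCompile(`fo+bar`)}
    b := config{Pattern: regexp.MustCompile(`fo+bar`)}

    // go1.10 prints true; go1.11beta1 prints false, because the two
    // otherwise-identical Regexp values no longer compare deeply equal.
    fmt.Println(reflect.DeepEqual(a, b))
}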
go1.10 returns true, while go1.11beta1 returns false.
What did you expect to see?
result true for go1.11beta1, as it is in go1.10
What did you see instead?
result false
I don't think reflect.DeepEqual is even a good idea here, but it's still a behavior change.
This is most likely caused by https://github.com/golang/go/commit/7dbf9d43f5a62a604ab3e6ceb1ee7ac4f3a80d80.
CC @jkohen @bradfitz @dsnet
I think this is a change that should be documented in the release notes. I think it was an unfortunate accident that reflect.DeepEqual ever worked correctly on compiled regexps. I don't see why that is a behavior that we should preserve.
I'm open to counter-arguments.
I'd just document it.
That would be OK for me as well; I just wanted to raise awareness, since reflect.DeepEqual gets used (and abused) a lot in tests in the wild.
I support documenting it. The regexp.Regexp change also broke a number of tests inside Google. We switched such cases over to the cmp package with a custom comparer that only compares the regexp string:
cmp.Comparer(func(x, y *regexp.Regexp) bool {
    if x == nil || y == nil {
        return x == nil && y == nil
    }
    return x.String() == y.String()
})
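(For illustration, a sketch of how such a comparer might be wired into a test; the import path github.com/google/go-cmp/cmp is the cmp package referred to above, while the test name and the want/got values are assumptions:)

package config_test

import (
    "regexp"
    "testing"

    "github.com/google/go-cmp/cmp"
)

func TestRoundTrip(t *testing.T) {
    want := regexp.MustCompile(`fo+bar`)
    got := regexp.MustCompile(`fo+bar`) // e.g. the result of a JSON round trip
    opt := cmp.Comparer(func(x, y *regexp.Regexp) bool {
        if x == nil || y == nil {
            return x == nil && y == nil
        }
        return x.String() == y.String()
    })
    if diff := cmp.Diff(want, got, opt); diff != "" {
        t.Errorf("regexp mismatch (-want +got):\n%s", diff)
    }
}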
I am concerned about the disregard shown for practical compatibility here.
I support documenting it. The regexp.Regexp change also broke a number of tests inside Google.
So both a bunch of Google tests and some outside-Google tests broke. That makes it an incompatible change. To force an incompatible change through into a release, there has to be a compelling reason, usually a _correctness_ reason. Being a little faster in a niche use case is not sufficient.
Compatibility is more than just type system compatibility. Type system compatibility is just the bare minimum that we can automate. When we accidentally introduce other incompatibility, and we find out via bug reports and the like, then we need to step back and think carefully about whether inflicting the cost of this particular incompatibility on users is worth the cost of the corresponding benefit.
The benefit here is very minimal - again, faster in a niche use case. Performance almost never justifies breaking actual uses.
This change should be rolled back. I will look into whether there is a different way to save some of the performance win without breaking the uses. Either way, I will send a CL restoring the old behavior.
/cc @dsnet @bradfitz @andybons
I don't think it's a niche performance improvement.
If we had a fast regexp implementation we wouldn't need extra API like https://golang.org/pkg/regexp/#Regexp.Copy. Instead we need to push such worries onto users.
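(For context, the kind of worry Copy pushes onto users: giving each goroutine its own copy to avoid contention on the shared Regexp. The worker loop below is an invented illustration, not from the thread:)

re := regexp.MustCompile(`\d+`)
for i := 0; i < 4; i++ {
    go func(r *regexp.Regexp) {
        _ = r.MatchString("order 12345") // each goroutine matches against its own copy
    }(re.Copy())
}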
If I had to pick between people needing to think about when to use Regexp.Copy vs a few tests breaking, I'd still pick a few tests breaking.
But given that we're stuck with Regexp.Copy, what if we overload Copy to also mean that it returns a *Regexp that uses the machines *sync.Pool, but by default that field is nil and the Regexp uses the old slow way?
Then people defining global variables can say:
var fooPattern = regexp.MustCompile(`....`).Copy()
Change https://golang.org/cl/122495 mentions this issue: regexp: preserve property that a Regexp can be compared with reflect.DeepEqual
Alternatively, https://go-review.googlesource.com/c/go/+/122495 is even simpler: it just lazily initializes the machines field, so DeepEqual still works at least until the regexp has been used, which should fix most tests.
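(An illustrative sketch of what that lazy initialization would mean in practice; the behavior after first use is an assumption about the approach, not taken from the CL:)

a := regexp.MustCompile(`x+`)
b := regexp.MustCompile(`x+`)
fmt.Println(reflect.DeepEqual(a, b)) // true: neither machine cache has been populated yet
a.MatchString("xxx")                 // first use fills a's cache
fmt.Println(reflect.DeepEqual(a, b)) // may now report false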
I still fundamentally object to a per-Regexp sync.Pool. They really don't belong in individual objects. That's a huge amount of allocation to stick in one regexp.
If I had to pick between people needing to think about when to use Regexp.Copy vs a few tests breaking, I'd still pick a few tests breaking.
This is a false dichotomy. The choice is almost never (1) faster and break users vs (2) slow and no broken users. It's almost always possible - and always preferable - to find a way to do (3) faster with no broken users, even if that's a little bit more work.
My point of raising the general concern was just to remind everyone to think about "how can we keep the improvement and not break users?" instead of "what words should we write to tell users how we broke their code?" And if there's no time to do the work to not break users, better to roll back the change and try to get the performance win again next cycle.
(But I have a CL that's faster and makes regexp.Regexp completely DeepEqual-safe, more than it was before. Running benchmarks and then will send it.)
Change https://golang.org/cl/122596 mentions this issue: regexp: revert "use sync.Pool to cache regexp.machine objects"
Change https://golang.org/cl/139783 mentions this issue: regexp: add DeepEqual test