Google-cloud-go: Spanner Healthchecker memory consumption

Created on 26 Sep 2017  路  4Comments  路  Source: googleapis/google-cloud-go

We are still trying to track down the problem but in two of our services we have observedspanner.healthChecker consuming much more memory than expected with repeated calls to context.WithDeadline.

(pprof) top10
2308001 of 2340227 total (98.62%)
Dropped 758 nodes (cum <= 11701)
Showing top 10 nodes out of 84 (cum >= 70167)
      flat  flat%   sum%        cum   cum%
   1714972 73.28% 73.28%    1714972 73.28%  context.WithDeadline
    436926 18.67% 91.95%     436926 18.67%  runtime.deferproc.func1
     32768  1.40% 93.35%      65536  2.80%  encoding/asn1.parseField
     32768  1.40% 94.75%      32768  1.40%  github.com/lyft/spannerproxy/vendor/github.com/golang/protobuf/proto.(*Buffer).DecodeStringBytes
     32768  1.40% 96.15%      32768  1.40%  reflect.(*structType).Field
     32768  1.40% 97.55%      32768  1.40%  strconv.formatBits
     16384   0.7% 98.25%      16384   0.7%  syscall.anyToSockaddr
      8192  0.35% 98.60%      40960  1.75%  github.com/lyft/spannerproxy/vendor/github.com/golang/protobuf/proto.(*Buffer).dec_slice_struct
       455 0.019% 98.62%      65991  2.82%  crypto/x509.parseCertificate
         0     0% 98.62%      70167  3.00%  crypto/tls.(*Conn).Handshake

I am digging through the health checker code but any insight here would be helpful. Thanks!

Full heap profile here:

pprof001.svg.zip

spanner p0 bug

Most helpful comment

I'm going to reopen until you confirm the fix.

By the way, excellent debugging info, thanks for that.

All 4 comments

I believe https://code-review.googlesource.com/#/c/gocloud/+/17250 has the fix; it's hard for me to test.

I'm going to reopen until you confirm the fix.

By the way, excellent debugging info, thanks for that.

Thanks for the quick turnaround! That seems to have fixed it.

(pprof) top20
26479 of 26495 total (99.94%)
Dropped 321 nodes (cum <= 132)
Showing top 20 nodes out of 56 (cum >= 5461)
      flat  flat%   sum%        cum   cum%
     16384 61.84% 61.84%      16384 61.84%  github.com/lyft/dbproxy/vendor/google.golang.org/grpc.(*clientStream).SendMsg
      5461 20.61% 82.45%       5461 20.61%  context.WithDeadline
      2048  7.73% 90.18%       2793 10.54%  github.com/lyft/dbproxy/vendor/google.golang.org/grpc/transport.(*http2Server).operateHeaders
      1260  4.76% 94.93%       1260  4.76%  runtime.malg
       745  2.81% 97.75%        745  2.81%  github.com/lyft/dbproxy/vendor/google.golang.org/grpc/transport.(*decodeState).processHeaderField
       410  1.55% 99.29%        410  1.55%  encoding/pem.Decode
       171  0.65% 99.94%        171  0.65%  crypto/tls.(*Conn).readHandshake
         0     0% 99.94%        581  2.19%  crypto/tls.(*Conn).Handshake
         0     0% 99.94%        581  2.19%  crypto/tls.(*Conn).clientHandshake
         0     0% 99.94%        581  2.19%  crypto/tls.(*clientHandshakeState).doFullHandshake
         0     0% 99.94%        410  1.55%  crypto/x509.(*CertPool).AppendCertsFromPEM
         0     0% 99.94%        410  1.55%  crypto/x509.(*Certificate).Verify
         0     0% 99.94%        410  1.55%  crypto/x509.initSystemRoots
         0     0% 99.94%        410  1.55%  crypto/x509.loadSystemRoots
         0     0% 99.94%        410  1.55%  crypto/x509.systemRootsPool
         0     0% 99.94%      16384 61.84%  github.com/lyft/dbproxy/services/proxy.(*Service).Get
         0     0% 99.94%      16384 61.84%  github.com/lyft/dbproxy/services/proxy/grpc/handlers.Get.func1
         0     0% 99.94%      16384 61.84%  github.com/lyft/dbproxy/vendor/cloud.google.com/go/spanner.(*RowIterator).Do
         0     0% 99.94%      16384 61.84%  github.com/lyft/dbproxy/vendor/cloud.google.com/go/spanner.(*RowIterator).Next
         0     0% 99.94%       5461 20.61%  github.com/lyft/dbproxy/vendor/cloud.google.com/go/spanner.(*healthChecker).worker

Woot.

Was this page helpful?
0 / 5 - 0 ratings