go version go1.8.1 windows/amd64
C:\Users\someuser>go env
set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\users\someuser\code\go
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
set CXX=g++
set CGO_ENABLED=1
set PKG_CONFIG=pkg-config
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
I am running an application that makes multiple HTTPS calls to different endpoints (it does not listen to any incoming connections). At some point (after 309 minutes in this case) the application will eventually panic . It appears that this occurs because of this chunk of code in transport.go:
go func() {
if trace != nil && trace.TLSHandshakeStart != nil {
trace.TLSHandshakeStart()
}
err := tlsConn.Handshake()
if timer != nil {
timer.Stop()
}
errc <- err
}()
Since this is a go routine I do not have the opportunity to recover from I'm not sure what I can do about this. tlsConn.Handshake() call is what eventually raises the panic.
Here is the panic text (note that this is only the first panic. There were some 200 go routines running so I have not provided the dump of them all):
Exception 0xc0000005 0x0 0x609 0x7ffb1ef5c61f
PC=0x7ffb1ef5c61f
syscall.Syscall6(0x7ffb1ef3ca40, 0x4, 0x4, 0x328e9d0, 0xc042b9f3d0, 0xc042b9f3e0, 0x0, 0x0, 0xc04257cc80, 0x26c8f20, ...)
C:/Go/src/runtime/syscall_windows.go:174 +0x6b
syscall.CertVerifyCertificateChainPolicy(0x4, 0x328e9d0, 0xc042b9f3d0, 0xc042b9f3e0, 0x0, 0xc042e633d8)
C:/Go/src/syscall/zsyscall_windows.go:1208 +0xc1
crypto/x509.checkChainSSLServerPolicy(0xc04211fb00, 0x328e9d0, 0xc042b9f808, 0x34d5110, 0xc042e63548)
C:/Go/src/crypto/x509/root_windows.go:117 +0xfd
crypto/x509.(Certificate).systemVerify(0xc04211fb00, 0xc042e63808, 0x0, 0x0, 0x0, 0x0, 0x0)
C:/Go/src/crypto/x509/root_windows.go:212 +0x484
crypto/x509.(Certificate).Verify(0xc04211fb00, 0xc0421dc4c0, 0x1b, 0xc042345350, 0x0, 0xed11c8f28, 0x2622224, 0x955620, 0x0, 0x0, ...)
C:/Go/src/crypto/x509/verify.go:279 +0x86c
crypto/tls.(clientHandshakeState).doFullHandshake(0xc042b9fe50, 0xc0424aa700, 0x66)
C:/Go/src/crypto/tls/handshake_client.go:300 +0x4c0
crypto/tls.(Conn).clientHandshake(0xc0424ae380, 0x7b2d40, 0xc0424ae4a0)
C:/Go/src/crypto/tls/handshake_client.go:228 +0xf97
crypto/tls.(Conn).Handshake(0xc0424ae380, 0x0, 0x0)
C:/Go/src/crypto/tls/conn.go:1307 +0x1aa
net/http.(Transport).dialConn.func3(0x0, 0xc0424ae380, 0xc042feecc0, 0xc042f8d260)
C:/Go/src/net/http/transport.go:1082 +0x49
created by net/http.(*Transport).dialConn
C:/Go/src/net/http/transport.go:1087 +0xff3
Edit: I should also mention that I was running at ~200 calls / second at the time of failure
This looks like an exception being raised in the system function CertVerifyCertificateChainPolicy, not in Go code.
@bburket can you easily reproduce this? What versions of Windows do you run? Can you reproduce it on different Windows computer?
Thank you.
Alex
Not easily reproducible - but seems to happen somewhat frequently (I've seen it happen anywhere between 10 minutes and 4 hours of execution).
This is on Windows 10,. Microsoft Windows [Version 10.0.15063]
One interesting thing to note is that our corporate environment uses some hardware-based main-in-the-middle type SSL interception for all traffic (Palo Alto Networks hardware). Would not be surprised if this is related to that - but its speculation at this point.
However it would make sense that the lib should return some error up the stack, rather than panicking a background goroutine
One interesting thing to note is that our corporate environment uses some hardware-based main-in-the-middle type SSL interception for all traffic (Palo Alto Networks hardware). Would not be surprised if this is related to that - but its speculation at this point.
I was thinking about that possibility - that you might be running some non-standard software somewhere as part of your OS - as soon as I saw this report. Windows APIs don't raise exceptions.
However it would make sense that the lib should return some error up the stack, rather than panicking a background goroutine
Returning error messages is the norm. But you forget about possibility of a bug in the software you are running. Maybe you could ask your admin - perhaps there are some logs about what happened to you somewhere on the system. It is also worth asking for help from that lib developers, if it is at all possible.
You also did not say if you could reproduce the problem on different computer - computer that is not affected by your "SSL interceptor".
Alex
@alexbrainman Go’s TLS package should not cause an access violation, ever, unless there is a data race. That error message means that Go passed garbage to a Windows API function, which should not occur.
I've seen this issue raised somewhere in 2016 with no solution. The 3rd argument to the library function is a null pointer. If it is, it's not supposed to be since pPolicyPara is not optional.
syscall.CertVerifyCertificateChainPolicy(0x4, 0x328e9d0, 0xc042b9f3d0, 0xc042b9f3e0, 0x0, 0xc042e633d8)
BOOL WINAPI CertVerifyCertificateChainPolicy(
_In_ LPCSTR pszPolicyOID,
_In_ PCCERT_CHAIN_CONTEXT pChainContext,
_In_ PCERT_CHAIN_POLICY_PARA pPolicyPara,
_Inout_ PCERT_CHAIN_POLICY_STATUS pPolicyStatus
);
@alexbrainman Go’s TLS package should not cause an access violation, ever, unless there is a data race.
I am not sure what are you trying to say here. And how is that helpful?
Yes, data race could be the reason for this crash. But there might be other reasons. For example, it could be buggy software installed on @bburket computer.
That error message means that Go passed garbage to a Windows API function, which should not occur.
Again that could be one of many reasons, but it does not have to be the only one.
I've seen this issue raised somewhere in 2016 with no solution. The 3rd argument to the library function is a null pointer. If it is, it's not supposed to be since pPolicyPara is not optional.
syscall.CertVerifyCertificateChainPolicy(0x4, 0x328e9d0, 0xc042b9f3d0, 0xc042b9f3e0, 0x0, 0xc042e633d8)
Good suggestion @as. But I don't see pPolicyPara is set to 0 on this stack trace. I think pPolicyPara is 0xc042b9f3d0 (3rd parameter) - pszPolicyOID is 0x4, pChainContext is 0x328e9d0 and pPolicyPara is 0xc042b9f3d0. Am I wrong, @as?
Thank you.
Alex
@alexbrainman I think 0x4 may be the number of arguments to the call. The underlying function is syscall6 with four arguments. pszPolicyOID is Microsoft's hungarianized prefix notation (pointer to string zero terminated), so pointing to 0x4 sounds like an access violation. Here's was my train of thought on the whole thing, please let me know if I overlooked something.
//
// Call site
//
err = syscall.CertVerifyCertificateChainPolicy(syscall.CERT_CHAIN_POLICY_SSL, chainCtx, para, &status)
//
// Win32 API
//
BOOL WINAPI CertVerifyCertificateChainPolicy(
_In_ LPCSTR pszPolicyOID,
_In_ PCCERT_CHAIN_CONTEXT pChainContext,
_In_ PCERT_CHAIN_POLICY_PARA pPolicyPara,
_Inout_ PCERT_CHAIN_POLICY_STATUS pPolicyStatus
);
//
// Trace
//
syscall.CertVerifyCertificateChainPolicy(
0x4, // Number of arguments
0x328e9d0, // Address of proc
0xc042b9f3d0, // Arg1 (pszPolicyOID)
0xc042b9f3e0, // Arg2 (pChainContext)
0x0, // Arg3 (pPolicyPara)
0xc042e633d8 // Arg4 (pPolicyStatus)
)
//
// Mksyscall
//
C:\go\src\syscall\syscall_windows.go:/Policy/
//sys CertVerifyCertificateChainPolicy(policyOID uintptr, chain *CertChainContext,para *CertChainPolicyPara, status *CertChainPolicyStatus) (err error) = crypt32.CertVerifyCertificateChainPolicy
//
// Generated syscall6
//
func CertVerifyCertificateChainPolicy(policyOID uintptr, chain *CertChainContext, para *CertChainPolicyPara, status *CertChainPolicyStatus) (err error) {
r1, _, e1 := Syscall6(procCertVerifyCertificateChainPolicy.Addr(), 4, uintptr(policyOID), uintptr(unsafe.Pointer(chain)), uintptr(unsafe.Pointer(para)), uintptr(unsafe.Pointer(status)), 0, 0)
if r1 == 0 {
if e1 != 0 {
err = errnoErr(e1)
} else {
err = EINVAL
}
}
return
}
@as
I think 0x4 may be the number of arguments to the call. The underlying function is syscall6 with four arguments. pszPolicyOID is Microsoft's hungarianized prefix notation (pointer to string zero terminated), so pointing to 0x4 sounds like an access violation.
There are two function calls listed on the stack trace.
syscall.CertVerifyCertificateChainPolicy(0x4, 0x328e9d0, 0xc042b9f3d0, 0xc042b9f3e0, 0x0, 0xc042e633d8)
C:/Go/src/syscall/zsyscall_windows.go:1208 +0xc1
0x4 is pszPolicyOID (we pass syscall.CERT_CHAIN_POLICY_SSL to this parameter and syscall.CERT_CHAIN_POLICY_SSL is indeed equal to 4);
0x328e9d0 is pChainContext (looks fine, it is just a pointer);
0xc042b9f3d0 is pPolicyPara (is another pointer);
0xc042b9f3e0 is pPolicyStatus (one more pointer).
syscall.Syscall6(0x7ffb1ef3ca40, 0x4, 0x4, 0x328e9d0, 0xc042b9f3d0, 0xc042b9f3e0, 0x0, 0x0, 0xc04257cc80, 0x26c8f20, ...)
C:/Go/src/runtime/syscall_windows.go:174 +0x6b
0x7ffb1ef3ca40 is trap (it is pointer to CertVerifyCertificateChainPolicy);
0x4 is nargs (it is number of CertVerifyCertificateChainPolicy parameters - 4 looks correct to me);
0x4 is first parameter of call into CertVerifyCertificateChainPolicy (looks same as above);
0x328e9d0, 0xc042b9f3d0, 0xc042b9f3e0 are parameters 2, 3 and 4 into CertVerifyCertificateChainPolicy (similar to first parameter all look OK to me).
I still don't see any problem. Sorry.
Alex
@alexbrainman
No worries, I was clearly looking at the wrong line the entire time. My bad entirely!
I've been running a few Windows machines trying to reproduce this issue in the background. After around 200 hours Windows 7, 8, 2008R2, 2012R2 and 10 can't hit it dialing out to an internal web server.
I ship a golang binary to customers. One of my customers has received an identical panic:
Exception 0xc0000005 0x0 0xffffffffffffffff 0x7ff8f434c61f
PC=0x7ff8f434c61f
signal arrived during external code execution
syscall.Syscall6(0x7ff8f432ca40, 0x4, 0x4, 0x12f9ca0, 0xc0438473d0, 0xc0438473e0, 0x0, 0x0, 0xc043569ba0, 0x300000002, ...)
C:/go/src/runtime/syscall_windows.go:174 +0x6b
syscall.CertVerifyCertificateChainPolicy(0x4, 0x12f9ca0, 0xc0438473d0, 0xc0438473e0, 0x0, 0xc0427653d8)
C:/go/src/syscall/zsyscall_windows.go:1208 +0xc1
crypto/x509.checkChainSSLServerPolicy(0xc04274f680, 0x12f9ca0, 0xc043847808, 0x12a6bc0, 0xc042765548)
C:/go/src/crypto/x509/root_windows.go:117 +0xfd
crypto/x509.(*Certificate).systemVerify(0xc04274f680, 0xc042765808, 0x0, 0x0, 0x0, 0x0, 0x0)
C:/go/src/crypto/x509/root_windows.go:212 +0x484
crypto/x509.(*Certificate).Verify(0xc04274f680, 0xc043a1f650, 0x1e, 0xc0428df380, 0x0, 0xed133cf03, 0x2a606060, 0xee1100, 0x0, 0x0, ...)
C:/go/src/crypto/x509/verify.go:279 +0x86c
crypto/tls.(*clientHandshakeState).doFullHandshake(0xc043847e50, 0xc042009e80, 0x31)
C:/go/src/crypto/tls/handshake_client.go:300 +0x4c0
crypto/tls.(*Conn).clientHandshake(0xc042525180, 0xb755f8, 0xc0425252a0)
C:/go/src/crypto/tls/handshake_client.go:228 +0xf97
crypto/tls.(*Conn).Handshake(0xc042525180, 0x0, 0x0)
C:/go/src/crypto/tls/conn.go:1307 +0x1aa
net/http.(*Transport).dialConn.func3(0x0, 0xc042525180, 0xc042009b00, 0xc0437bdf80)
C:/go/src/net/http/transport.go:1082 +0x49
created by net/http.(*Transport).dialConn
C:/go/src/net/http/transport.go:1087 +0xff3
Running on windows (bitness unknown), binary compiled with go1.8.3.
Here is the full panic of my client application, including other goroutines (redacting only my internal pure-go no-unsafe application code). Some of them are in CGO syscalls, some in other net/http functions, some in other Windows IOCP routines.
I don't know if this is information is material, but I researched to find the "Exception 0xc0000005" issue also reported in some other threads. For instance, #9356 was found to be related to the linker. My binary was compiled with x86_64-w64-mingw32-gcc 5.4.0, CGO_ENABLED=1, and -ldflags "-s -w". It was not compiled with -linkmode external but i understand that is implicit for cgo on windows.
The client application had previously made a successful SSL connection to the same server during its same lifespan (but not necessarily with the same http.Client object).
The server it is connecting to is also written in Go, and is using a Let's Encrypt certificate (if that tells you anything about the intermediaries, algorithm etc). I do not know if any SSL interception proxy exists in the middle.
The stack trace i have is dated 2017-08-26. If it was caused by a third-party program on the machine (e.g. AV / firewall / web-safety program) then i suppose it might resolve itself if the third-party program is updated.
@aclements is it possible that compiler is getting smarter about keeping variables on the stack?
The syscall.CertVerifyCertificateChainPolicy call, I suspect, takes long time sometimes, because it needs to verify certificates and those could be stored on disk, and maybe even downloaded from another computer. Perhaps Go garbage collector moves stack of that gorouting while it waits in syscall.CertVerifyCertificateChainPolicy. The syscall.CertVerifyCertificateChainPolicy parameters 2, 3 and 4 are all pointers to Go memory. I run:
GOOS=windows go build -v -o /dev/null -gcflags '-m' crypto/x509
and I can see
./root_windows.go:100:105: checkChainSSLServerPolicy chainCtx does not escape
./root_windows.go:107:13: checkChainSSLServerPolicy &syscall.SSLExtraCertChainPolicyPara literal does not escape
./root_windows.go:112:18: checkChainSSLServerPolicy &syscall.CertChainPolicyPara literal does not escape
./root_windows.go:117:96: checkChainSSLServerPolicy &status does not escape
Does that mean any of syscall.CertVerifyCertificateChainPolicy parameters are kept on stack?
Alex
More identical issue reports and stack traces:
These eventually lead up to some Docker bug where the cause was confirmed as some kind of "WebCompanion" software running on the user's machine.
https://github.com/moby/moby/issues/27423#issuecomment-278196016
Does that mean any of syscall.CertVerifyCertificateChainPolicy parameters are kept on stack?
It does. In fact, the para structure is allocated right in checkChainSSLServerPolicy stack frame.
However, I think this is okay. syscall.CertVerifyCertificateChainPolicy itself could grow the stack, but everything still has type information at that point. Once that calls Syscall6, though, everything is nosplit down to the eventual cgocall, which does an entersyscall. Once a goroutine is in syscall state, we don't shrink its stack (for exactly this reason). As long as that syscall doesn't hold on to the pointers after it returns and doesn't call back in to Go code (does it?), it should be fine.
As long as that syscall doesn't hold on to the pointers after it returns and doesn't call back in to Go code (does it?), it should be fine.
No Windows itself does not call into Go.
Thank you for explaining.
Alex
I've run into this too (1.9.2 on Windows 10 amd64). Application was making a lot of HTTPS requests over the course of ~4-5 hours when the crash happened.
Stack trace below in case it helps.
Exception 0xc0000005 0x0 0xffffffffffffffff 0x7ffc211ecc9f
PC=0x7ffc211ecc9f
syscall.Syscall6(0x7ffc211ef940, 0x4, 0x4, 0x67ee280, 0xc042445380, 0xc042445390, 0x0, 0x0, 0x11dd091a53ec7a6f, 0x6a938809f593b9c8, ...)
C:/Go/src/runtime/syscall_windows.go:174 +0x69
syscall.CertVerifyCertificateChainPolicy(0x4, 0x67ee280, 0xc042445380, 0xc042445390, 0x0, 0xc0424b7388)
C:/Go/src/syscall/zsyscall_windows.go:1208 +0xb6
crypto/x509.checkChainSSLServerPolicy(0xc042851400, 0x67ee280, 0xc0424457f8, 0x6868960, 0xc0424b7528)
C:/Go/src/crypto/x509/root_windows.go:117 +0xf5
crypto/x509.(*Certificate).systemVerify(0xc042851400, 0xc0424b77f8, 0x0, 0x0, 0x0, 0x0, 0x0)
C:/Go/src/crypto/x509/root_windows.go:212 +0x4c8
crypto/x509.(*Certificate).Verify(0xc042851400, 0xc04248e5b0, 0x4, 0xc0428bd6e0, 0x0, 0xbe9a75a2b0089690, 0xe636f35a699, 0x995c60, 0x0, 0x0, ...)
C:/Go/src/crypto/x509/verify.go:289 +0x81f
crypto/tls.(*clientHandshakeState).doFullHandshake(0xc042445e68, 0xc042736bd0, 0x6a)
C:/Go/src/crypto/tls/handshake_client.go:300 +0x5bc
crypto/tls.(*Conn).clientHandshake(0xc04283fc00, 0x7d0d00, 0xc04283fd20)
C:/Go/src/crypto/tls/handshake_client.go:228 +0xfa2
crypto/tls.(*Conn).Handshake(0xc04283fc00, 0x0, 0x0)
C:/Go/src/crypto/tls/conn.go:1307 +0x196
net/http.(*Transport).dialConn.func3(0x0, 0xc04283fc00, 0x0, 0xc042b6a960)
C:/Go/src/net/http/transport.go:1151 +0x49
created by net/http.(*Transport).dialConn
C:/Go/src/net/http/transport.go:1147 +0xdc5
Experienced this again today on go1.9.2, same stack trace.
There's a few more folks at https://github.com/git-lfs/git-lfs/issues/1786 with the same problem.
@slrz commented on 9 Sep 2017:
These eventually lead up to some Docker bug where the cause was confirmed as some kind of "WebCompanion" software running on the user's machine.
moby/moby#27423 (comment)
No, I think that's different. That's the SetFileCompletionNotificationModes panic (a.k.a. #22149) which is fixed since go1.9.2, but, this problem is still occurring. (That also leads down a rabbithole with 0x20000 flags to identify your non-IFS LSPs/BSPs, but, I have seen this issue occur on a machine with no non-IFS LSPs/BSPs.) Is there any reason to think the problems are related?
func checkChainSSLServerPolicy(c *Certificate, chainCtx *syscall.CertChainContext, opts *VerifyOptions) error {
servernamep, err := syscall.UTF16PtrFromString(opts.DNSName)
if err != nil {
return err
}
sslPara := &syscall.SSLExtraCertChainPolicyPara{
AuthType: syscall.AUTHTYPE_SERVER,
ServerName: servernamep,
}
sslPara.Size = uint32(unsafe.Sizeof(*sslPara))
para := &syscall.CertChainPolicyPara{
ExtraPolicyPara: uintptr(unsafe.Pointer(sslPara)),
}
para.Size = uint32(unsafe.Sizeof(*para))
status := syscall.CertChainPolicyStatus{}
err = syscall.CertVerifyCertificateChainPolicy(syscall.CERT_CHAIN_POLICY_SSL, chainCtx, para, &status)
What if GC happened before syscall.CertVerifyCertificateChainPolicy is called? The stack might be moved, and uintptr(unsafe.Pointer(sslPara)) seems unsafe for me.
@zhangyoufu, I think you're on to something. Above, I'd only considered the direct arguments to syscall.CertVerifyCertificateChainPolicy, but I'd missed para.ExtraPolicyPara.
syscall.CertVerifyCertificateChainPolicy can move the stack on entry. If it does, we'll update the para argument since it's correctly typed as a pointer and points to the stack. But when scanning the para object in checkChainSSLServerPolicy's frame, we won't update the ExtraPolicyPara field because it's scalar-typed, even though it contains a stack "pointer" to the sslPara object. Then when we do the syscall and Windows tries to follow para.ExtraPolicyPara, it could wind up in memory that's been re-used for something else.
@alexbrainman, @mkrautz, there seem to be several uintptr-typed fields in syscall/types_windows.go that are actually pointers, particularly in the crypto APIs, but also in some other types. All of these could be susceptible to this sort of bug. Is there a reason these are uintptrs? We might have to change this in a non-backwards compatible way.
src/syscall/types_windows.go
https://github.com/golang/go/blob/49ed4cbe853e910b6f8d83012bc8b9afedb4b6b6/src/syscall/types_windows.go#L316-L320
https://github.com/golang/go/blob/49ed4cbe853e910b6f8d83012bc8b9afedb4b6b6/src/syscall/types_windows.go#L751-L756
https://github.com/golang/go/blob/49ed4cbe853e910b6f8d83012bc8b9afedb4b6b6/src/syscall/types_windows.go#L849-L855
https://github.com/golang/go/blob/49ed4cbe853e910b6f8d83012bc8b9afedb4b6b6/src/syscall/types_windows.go#L868-L876
https://github.com/golang/go/blob/49ed4cbe853e910b6f8d83012bc8b9afedb4b6b6/src/syscall/types_windows.go#L888-L896
https://github.com/golang/go/blob/49ed4cbe853e910b6f8d83012bc8b9afedb4b6b6/src/syscall/types_windows.go#L923-L927
https://github.com/golang/go/blob/49ed4cbe853e910b6f8d83012bc8b9afedb4b6b6/src/syscall/types_windows.go#L936-L942
https://github.com/golang/go/blob/49ed4cbe853e910b6f8d83012bc8b9afedb4b6b6/src/syscall/types_windows.go#L986-L995
What would happen if an argument of uintptr type can be both pointer/scalar ?
tokenInformation could be a pointer or scalar, depending on tokenInformationClassmsgsrc could be a pointer or scalarstoreProvider points to a null-terminated ANSI string. para could be a pointer or scalar@aclements
Some of them are (obviously) wrong. I believe the code was implemented before the GC moved stacks around, but that's not really an excuse. :-)
I think others might be uintptr simply because Go the stdlib does not use those fields (maybe they are optional) -- so the uintptr is simply there to pad the struct to fill a pointer-sized hole.
However, I suppose that it makes sense to just fix everything
Is there a reason these are uintptrs?
CertVerifyCertificateChainPolicy
https://msdn.microsoft.com/en-us/library/windows/desktop/aa377163(v=vs.85).aspx
third parameter is PCERT_CHAIN_POLICY_PARA
https://msdn.microsoft.com/en-us/library/windows/desktop/aa377187(v=vs.85).aspx
PCERT_CHAIN_POLICY_PARA.pvExtraPolicyPara is void*. The documentation says:
The address of a pszPolicyOID-specific structure that provides additional validity policy conditions.
What else it can be if not uintptr?
Also as @mkrautz said, this code was written long time ago. We did not have current GC back then. We did not have any rules about managing external memory then.
Alex
@zhangyoufu We're OK if a function argument can be either a pointer or a scalar. We can use the type uintptr safely for function arguments, because we apply special treatment to conversions of pointer to uintptr in a function call.
We're not OK if a struct field can be either a pointer or scalar. Are there any cases like that? If a field is always a pointer, we need to change the type to be a pointer type, perhaps unsafe.Pointer if there is no more suitable type. I think the only way to avoid that would be to introduce some new syscall-only magic comment for "field is really a pointer." But that would be awful so lets avoid it if we can.
There seem to be three classes of uintptrs in these types that ought to be pointers:
For 1 and 2, we could make them unsafe.Pointer, but putting unsafe.Pointer into an API lets users of the package perform some unsafe conversions without importing unsafe (that may have been a mistake in the language semantics). I'd be inclined to declare something like type Pointer *struct{} in the syscall package and use that for 1 and 2. That still requires the caller to explicitly import unsafe and do obviously unsafe pointer casts (the API is unsafe, so this is the best we can do), but it keeps the field pointer-typed for the garbage collector and allows callers to obey the unsafe.Pointer rules when using the API.
I believe we should fix this. I also believe we can fix it because the syscall package is explicitly opted out of the Go 1 compatibility promise and because changing these types would fix a class of GC bugs that can't otherwise be worked around.
What about fields that may or may not be pointers, depending on the value
of some other field?
On Tue, Apr 10, 2018, 12:26 PM Austin Clements notifications@github.com
wrote:
There seem to be three classes of uintptrs in these types that ought to
be pointers:
- It points to a particular type, but we simply haven't implemented
that type in Go.- There are multiple types it can point to depending on some other
field. This is generally an LPVOID in the C API.- It could be a pointer to a Go type, but isn't (for unclear reasons).
For 1 and 2, we could make them unsafe.Pointer, but putting unsafe.Pointer
into an API lets users of the package perform some unsafe conversions
without importing unsafe (that may have been a mistake in the language
semantics). I'd be inclined to declare something like type Pointer
*struct{} in the syscall package and use that for 1 and 2. That still
requires the caller to explicitly import unsafe and do obviously unsafe
pointer casts (the API is unsafe, so this is the best we can do), but it
keeps the field pointer-typed for the garbage collector and allows callers
to obey the unsafe.Pointer rules when using the API.I believe we should fix this. I also believe we can fix it because the
syscall package is explicitly opted out of the Go 1 compatibility promise
and because changing these types would fix a class of GC bugs that can't
otherwise be worked around.—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/golang/go/issues/21376#issuecomment-380162886, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AGGWB55ROAeaojv1Ioi4gGgzI_MDvRN_ks5tnN0wgaJpZM4OyXbU
.
What about fields that may or may not be pointers, depending on the value of some other field?
That would be a problem, but as far as I can tell it doesn't apply to any of the uintptr fields above.
Change https://golang.org/cl/106275 mentions this issue: syscall: introduce Pointer type and use it instead of uintptr
We're not OK if a struct field can be either a pointer or scalar. Are there any cases like that?
I don't see any in syscall and internal/syscall/windows/*
I'd be inclined to declare something like type Pointer *struct{} in the syscall package and use that for 1 and 2.
I hope I interpreted your idea correctly https://go-review.googlesource.com/#/c/go/+/106275
I believe we should fix this. I also believe we can fix it because the syscall package is explicitly opted out of the Go 1 compatibility promise and because changing these types would fix a class of GC bugs that can't otherwise be worked around.
I am fine fixing syscall package. But we could also copy and adjust syscall code somewhere else - these would leave syscall users with broken code.
Whatever we decide, we should also fix golang.org/x/sys/windows package.
What about fields that may or may not be pointers, depending on the value of some other field?
That would be a problem, but as far as I can tell it doesn't apply to any of the
uintptrfields above.
That is what I see too.
Alex
Is this point-release material? From a glance, it seems any Windows Go program using TLS is vulnerable to rare but unavoidable crashes.
@eliasnaur, yes, we should fix this in a point release. Thanks for the reminder.
We can't port CL 106275 to a point release because it breaks a public API, but I can think of a few ways to fix this without breaking the (admittedly broken) API:
syscall.CertChainPolicyPara into crypto/x509 with the corrected field type, use that in checkChainSSLServerPolicy, and unsafe cast it for the call to syscall.CertVerifyCertificateChainPolicy.syscall.CertChainPolicyPara to the heap and runtime.KeepAlive it across the syscall.internal/syscall/windows and modify crypto/x509 to use that.I'd be inclined to go with solution 1.
Solution 3 is sort of intriguing in general. While we're breaking things, we could even move all of the crypto APIs from syscall to internal/syscall/windows and tell people to use x/sys/windows instead (which we also need to fix).
Option 4: fix the broken API, but use compiler magic to hide the change
from user programs.
On Wed, Apr 18, 2018, 12:23 PM Austin Clements notifications@github.com
wrote:
@eliasnaur https://github.com/eliasnaur, yes, we should fix this in a
point release. Thanks for the reminder.We can't port CL 106275 to a point release because it breaks a public API,
but I can think of a few ways to fix this without breaking the (admittedly
broken) API:
- Copy the definition of syscall.CertChainPolicyPara into crypto/x509
with the corrected field type, use that in checkChainSSLServerPolicy,
and unsafe cast it for the call to
syscall.CertVerifyCertificateChainPolicy.- Force the syscall.CertChainPolicyPara to the heap and
runtime.KeepAlive it across the syscall.- Copy the corrected APIs into internal/syscall/windows and modify
crypto/x509 to use that.I'd be inclined to go with solution 1.
Solution 3 is sort of intriguing in general. While we're breaking things,
we could even move all of the crypto APIs from syscall to
internal/syscall/windows and tell people to use x/sys/windows instead
(which we also need to fix).—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/golang/go/issues/21376#issuecomment-382445433, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AGGWB6YylVDOnjy0W9WafUr7VK0_G1MOks5tp2hegaJpZM4OyXbU
.
@gopherbot please file this for backport.
See @aclements's comment https://github.com/golang/go/issues/21376#issuecomment-382445433 about having to develop a different fix from CL 106275 in order not to break a public API.
Backport issue(s) opened: #25033 (for 1.10), #25034 (for 1.9).
Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.
Change https://golang.org/cl/111715 mentions this issue: [release-branch.go1.10] crypto/tls: copy and use adjusted syscall.CertChainPolicyPara
Change https://golang.org/cl/112095 mentions this issue: [release-branch.go1.9] crypto/x509: copy and use adjusted syscall.CertChainPolicyPara
Change https://golang.org/cl/112179 mentions this issue: [release-branch.go1.9] crypto/x509: copy and use adjusted syscall.CertChainPolicyPara
While we're breaking things <...>
@aclements unfortunately yes. As time permits, can you please take a look at https://github.com/google/certificate-transparency-go/issues/284 issue (and the proposed fix at https://github.com/google/certificate-transparency-go/pull/285) and let us know if this could maybe solved in some better way?
Most helpful comment
Is this point-release material? From a glance, it seems any Windows Go program using TLS is vulnerable to rare but unavoidable crashes.