go 🚀 - proposal: runtime/debug: allow setting fd for crash output

As a data point, the gVisor project wants to redirect panic/throw output. It works around this problem by dup'ing over stderr with the FD of the desired crash output location (https://cs.opensource.google/gvisor/gvisor/+/master:runsc/cli/main.go;l=194-198?q=dup3&ss=gvisor%2Fgvisor). stderr is otherwise unused by the program, so this works OK. This is a workable solution, but it would certainly be nicer to be able to change the destination directly.

prattmic on 30 Nov 2020

Thinking out loud: the runtime prints other output besides just crashes, such as what one can enable via GODEBUG. Should that be included in this option?

mvdan on 30 Nov 2020

I'd think (initially, at least), yes. I'd probably call this something like SetRuntimeFD for any runtime-generated output.

On the other hand, there are nice opportunities if a separate "fatal" FD could be provided that only includes fatal panic/throw output, plus any related prints immediately before a throw. That would allow a nice simplification of "any bytes received on this FD indicates a crash". But it means extra complexity for a rather edge case [1].

[1] I can't think of any non-fatal runtime output that isn't behind a GODEBUG option.

prattmic on 30 Nov 2020

I would say yes to including other runtime output as well. If there's a need for additional separation, it can always be introduced later. But I wouldn't be opposed to introducing it immediately either.

I think the fatal output can be detected when the fd write closes due to the program stopping.

egonelbre on 30 Nov 2020

@egonelbre how do you propose to use your API on Windows?

I agree. I wanted this feature to capture crash dumps of a Windows service.

/cc @zx2c4

Alex

alexbrainman on 1 Dec 2020

I've handled this until now with defer/recover in each goroutine, which is really very subpar and doesn't handle all crashes.

Rather than setting an output fd, what about setting an output callback? There'd be certain restrictions on what it could do, obviously, but it could still probably accomplish most things that people want. In my case, it'd just copy the bytes to some mmap'd ring buffer log file.

zx2c4 on 1 Dec 2020

@alexbrainman it would behave similarly: you can provide an fd that outputs to a file or a pipe. Flushing it to Event Log would require some external "crash reporting" service that reads that file or pipe. This isn't ideal, but would allow better handling than what can be currently done. Ideal scenario would be to write directly to event-log, however that would probably require specialized machinery.

@zx2c4 I believe the callback has been proposed a few times, however I cannot find the exact proposals. I think the fundamental issue is that you don't really know much about the crashed system: what's broken and what's not. E.g. maybe you got a panic during sigprof where some code holds a runtime lock "x", which your code needs, and you'll deadlock when calling file.Write. This is all hypothetical of course.

egonelbre on 1 Dec 2020

what's not. e.g. maybe you got a panic during sigprof where some code holds a runtime lock "x", which your code needs and you'll deadlock when calling file.Write. This is all hypothetical of course.

Right. If you're using debug.SetCrashCallback, then you are signing up for that kind of fun, and it's up to you to write Go code that calls the minimal set of things needed. If the runtime can already do that for spitting to a fd, so can a custom handler function.

Ideal scenario would be to write directly to event-log, however that would probably require specialized machinery.

Writing to event log isn't too bad. It'd support the callback-based approach I mentioned.
(In my case, log writes look like this: https://git.zx2c4.com/wireguard-windows/tree/ringlogger/ringlogger.go#n105 )

zx2c4 on 1 Dec 2020

Right. If you're using debug.SetCrashCallback, then you are signing up for that kind of fun, and it's up to you to write Go code that calls the minimal set of things needed. If the runtime can already do that for spitting to a fd, so can a custom handler function.

This means no heap allocation, no stack splits, no map access (IIRC), particularly if you want this to cover runtime throws. Not impossible, but a pretty high bar to set for an API.

prattmic on 1 Dec 2020

This means no heap allocation, no stack splits, no map access (IIRC), particularly if you want this to cover runtime throws. Not impossible, but a pretty high bar to set for an API.

But also not _so_ out of place for something in debug, right? We're offering some way to hook into runtime internals, and with that comes the responsibilities of being in the runtime. But maybe there's a better idea here:

The other way of doing this might be to have an unstable and unexported function runtime.setCrashHook, which can then be used by wrapper packages like:

//go:linkname setCrashHook runtime.setCrashHook

That then represents the lowest level. At a higher level, then the debug package can use that to implement things like debug.SetCrashFD and the Windows service package can implement things like svc.SetCrashEventlog. And then most users use these normal functions that our libraries can provide.

Meanwhile, insane users like me can dip down into runtime.setCrashHook, along with all of its dangers, for hooking it up to my custom ringbuffer mmap'd logger thing, knowing full well that if anything goes wrong, this is unsupported and my fault, etc.

zx2c4 on 1 Dec 2020

The restrictions on an output callback would be severe, as @prattmic says, and failure modes would be unpredictable. In Go we try to avoid that kind of deeply unsafe operation.

I think it would be better to keep this issue focused on the much safer operation of setting a file descriptor.

That said, I don't understand what would happen if the descriptor is a pipe that nothing is reading from. Or a file opened on a networked file system that is not responding. What should we do in cases like that?

ianlancetaylor on 2 Dec 2020

The restrictions on an output callback would be severe, as @prattmic says, and failure modes would be unpredictable. In Go we try to avoid that kind of deeply unsafe operation.

What do you think of my proposal above of allowing this to be a general callback via go:linkname -- i.e. just for libraries? That's a mere "implementation detail", but it'd go a long way toward making this extensible for the adventurous.

zx2c4 on 2 Dec 2020

That said, I don't understand what would happen if the descriptor is a pipe that nothing is reading from. Or a file opened on a networked file system that is not responding. What should we do in cases like that?

I'm not sure we need to worry about those (beyond documentation, perhaps). stderr could already be any of those kinds of descriptors, as set by the parent process, so the same problems would exist today.

prattmic on 2 Dec 2020

That said, I don't understand what would happen if the descriptor is a pipe that nothing is reading from. Or a file opened on a networked file system that is not responding. What should we do in cases like that?

I guess this is the main danger with the proposal. As @prattmic mentioned, somebody could pipe stderr to somewhere that isn't being read. I didn't test it, but I think that would block in the same way.

Using a non-blocking write that drops data when the fd is not responding would be nice, however I suspect that would be difficult to implement.

It'll definitely need examples how to write the listening side.

Other than that, I don't have any further ideas.

egonelbre on 2 Dec 2020

This might be still difficult due to the constraints, but if the writing is on a different thread and the previous write hasn't finished in appropriate time, the write can be dropped. The threshold could be configured via GODEBUG or similar.

egonelbre on 2 Dec 2020

Should we just have GODEBUG=debugfd=3?

/cc @aclements
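As a rough sketch of what consuming such a setting involves, the following parses a GODEBUG-style comma-separated string. The `debugfd` key is the hypothetical one from the question above and does not exist in the runtime; `debugFD` is likewise an illustrative name. It assumes that the last occurrence of a key wins.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// debugFD (hypothetical) extracts the value of a "debugfd" key from a
// GODEBUG-style string such as "gctrace=1,debugfd=3". If the key appears
// more than once, the last occurrence wins. A malformed value is treated
// as if the key were absent.
func debugFD(godebug string) (uintptr, bool) {
	var fd uintptr
	found := false
	for _, kv := range strings.Split(godebug, ",") {
		if !strings.HasPrefix(kv, "debugfd=") {
			continue
		}
		n, err := strconv.ParseUint(strings.TrimPrefix(kv, "debugfd="), 10, 32)
		if err != nil {
			found = false
			continue
		}
		fd, found = uintptr(n), true
	}
	return fd, found
}

func main() {
	if fd, ok := debugFD(os.Getenv("GODEBUG")); ok {
		fmt.Println("runtime output would go to fd", fd)
	} else {
		fmt.Println("no debugfd setting; runtime output stays on stderr")
	}
}
```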

rsc on 2 Dec 2020

@zx2c4 I'm not fond of documenting a callback to use with go:linkname either. I tend to think that people who operate at that level can customize their Go distribution.

ianlancetaylor on 2 Dec 2020

I'd be okay with GODEBUG=debugfd=3. It's not straightforward to implement, though, since we don't distinguish between runtime output and anything else printed by print(ln). It would actually be much easier to implement just for tracebacks because we could use something like the gp.writebuf redirection, just to an FD instead. But maybe that's better anyway?

I agree with @alexbrainman that I'm not sure how this would work on Windows. Internally, we treat any "fd" other than 1 or 2 as a raw Windows handle value, but I don't know if such a thing can be meaningfully passed as an environment variable. Is it possible for a parent process to create a handle and pass it to a child like this?

On the topic of a callback, in addition to the subtle and rather onerous restrictions on the callback, those restrictions can also change with new releases. We don't expose anything remotely like that right now.

aclements on 2 Dec 2020

I did think about GODEBUG, but I suspect that would be complicated to use in Windows services since you don't have a nice way to figure out the fd prior to starting it. The main process needs to respond to messages (https://github.com/golang/sys/blob/ef89a241ccb347aa16709cf015e91af5a08ddd33/windows/svc/example/service.go#L23).

debug.SetCrashFD allows the main process to start a "monitor process" and set the fd accordingly, while the main process can handle the "service" responsibilities.

egonelbre on 2 Dec 2020

@alexbrainman it would behave similarly: you can provide an fd that outputs to a file or a pipe. Flushing it to Event Log would require some external "crash reporting" service that reads that file or pipe. This isn't ideal, but would allow better handling than what can be currently done. Ideal scenario would be to write directly to event-log, however that would probably require specialized machinery.

@egonelbre I don't understand what you are suggesting.

If I have a Windows service written in Go, how can I redirect its crash dump to a file by using debug.SetCrashOutputFD(3) ? Let's say I want crash dump file to be called c:\a.txt.

Similarly, I don't see how using debug.SetCrashOutputFD(3) would help me write to Event Log.

Maybe we can use file path instead of file descriptor? Like debug.SetCrashOutput("/tmp/a.txt").

Alex

alexbrainman on 2 Dec 2020

@alexbrainman , for writing to a file, I imagine you would open the file and then pass that handle to debug.SetCrashOutputFD (or whatever). I don't know how the event log works, but I imagine you'd start a second process with a pipe between them and the second process would be responsible for writing to the event log.

debug.SetCrashFD allows the main process to start a "monitor process" and set the fd accordingly, while the main process can handle the "service" responsibilities.

Is there a reason this can't be done the other way around? The first process is the monitor process and opens the FD, then starts the second process with the GODEBUG environment set. Does this not work for Windows services?

(I'm not necessarily opposed to having a runtime/debug function for this, just trying to understand the space of constraints. One nice thing about the environment variable is that it can catch very early panics, like during package init.)

aclements on 2 Dec 2020

@zx2c4 I'm not fond of documenting a callback to use with go:linkname either. I tend to think that people who operate at that level can customize their Go distribution.

Yea, _documenting_ it is probably the wrong thing to do. But having something in there akin to nanotime would be "very nice"...and would allow me to implement something for Go's Windows service package that uses it to ship things to eventlog automatically.

Anyway, I realize pushing this too hard is a losing battle, for good reason. But if somehow the implementation _happened_ to be structured in a way that made it _coincidentally possible_ to go:linkname it from elsewhere, I would be very happy.

zx2c4 on 2 Dec 2020

Is it possible for a parent process to create a handle and pass it to a child like this?

You can make a handle inheritable, and then pass its value as an argument to the new process. The value will remain the same in the new process. Alternatively, DuplicateHandle allows one process to duplicate a handle _into_ another process.

zx2c4 on 2 Dec 2020

@alexbrainman, so roughly what I'm thinking is the following. However, I might also forget some internal details that could make the following difficult with Windows.

The most basic thing is to send the output to a separate file:

package main

import (
    "os"
    "runtime/debug"
)

func main() {
    // os.Create opens the file for writing; os.Open would be read-only
    f, err := os.Create("service_log.txt")
    if err != nil {
        os.Exit(1)
    }

    debug.SetCrashFD(f.Fd())

    // rest of the service logic
}

The next step would be to create a watchdog sub-process:

package main

import (
    "os"
    "os/exec"
    "runtime/debug"
)

func main() {
    r, w, err := os.Pipe()
    if err != nil {
        os.Exit(1)
        return
    }

    // this could also self-start, but in a different mode
    // but we'll use a separate binary for now
    cmd := exec.Command("watchdog.exe")
    cmd.Stdin = r
    // also set other flags here to ensure that the watchdog
    // doesn't immediately close together with the parent.
    if err := cmd.Start(); err != nil {
        os.Exit(1)
    }

    debug.SetCrashFD(w.Fd())

    // rest of the service logic
}

You could also create governors using ExtraFiles:

package main

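import (
    "os"
    "os/exec"
)

func main() {
    r, w, err := os.Pipe()
    if err != nil {
        os.Exit(1)
        return
    }

    cmd := exec.Command("watchdog.exe")
    cmd.ExtraFiles = []*os.File{w}
    if err := cmd.Start(); err != nil {
        os.Exit(1)
    }
    // close the parent's copy of the write end, otherwise
    // reads from `r` never see EOF after the child exits
    w.Close()

    // monitor `r`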
}

// The other process has `debug.SetCrashFD(3)` in init,
// with some check that `3` actually exists.

For windows, I think (named) pipes could also be used:

f, err := os.OpenFile(`\\.\pipe\randomhexvalue`, os.O_WRONLY, 0)
...
debug.SetCrashFD(f.Fd())

egonelbre on 2 Dec 2020

The first process is the monitor process and opens the FD, then starts the second process with the GODEBUG environment set. Does this not work for Windows services?

Yes, it could work. The main drawback is that windows services need to respond to control messages. https://pkg.go.dev/golang.org/x/sys/windows/svc#Handler

This would mean that the governor process would need to delegate all those messages to the subprocess, which would introduce an additional complication. Although, there might be a way to make subprocess handle the service messages directly, I haven't seen it.

egonelbre on 2 Dec 2020

Avoid using named pipes if you can; they're a security landmine. Instead pass a handle to a child process, as in your other example.

However, I still suspect there's a way that we can log things directly to the eventlog even during a panic, without the need to launch a _persistent_ watchdog process. That's what I was proposing doing with go:linkname above.

But if you're really after something maximally robust, a better bet is to share a file mapping -- CreateFileMapping/MapViewOfFile -- and then have the error handler simply write bytes to an address, and voila, they'll be logged. And if you have a watchdog process monitoring the parent process (via WaitForSingleObject(OpenProcess(...)) or via NotifyServiceStatusChange), it can check this shared memory region after termination.

Alternatively, if you want to forgo the child process entirely, you can actually have a crash reporter process spin up via service triggers if the crashing process is able to raise an ETW event. Check out SERVICE_TRIGGER_TYPE_CUSTOM. That's a bit clumsy though, and requires you to be able to successfully call the ETW api from a crashing process.

Perhaps the more robust way to go about this would be to use the built-in Windows Error Reporting functions. Your program could use WerRegisterMemoryBlock or WerRegisterFile to set up the Go panic output destination, and then WER could collect the info there. Then, Go's panic handler would write to that file, or more reliably, write into that registered memory block. You can then pick this up with the normal default WER crash dump files, or better, have your application use WerRegisterRuntimeExceptionModule prior, which will then load a custom DLL of your choosing to then do something useful with the memory block (containing the Go panic output) registered prior.

In other words, if what you're after with this proposal is more robust Windows crash dumps, I think there's actually a lot of room for improvement all over.

zx2c4 on 2 Dec 2020

Avoid using named pipes if you can; they're a security landmine. Instead pass a handle to a child process, as in your other example.

Fair enough.

In other words, if what you're after with this proposal is more robust Windows crash dumps, I think there's actually a lot of room for improvement all over.

It was meant to be more general-purpose, something that would be useful on unix as well, not just windows.

It would improve the windows service situation. And, I do agree that avoiding subprocess and using the OS api would improve it even more. But, it seems that would be a different proposal.

Although, now thinking of it... maybe a windows service could use os.Open(), set the crash fd, and use WerRegisterFile on it. That way it would end up getting the info.

egonelbre on 2 Dec 2020

Although, now thinking of it... maybe a windows service could use os.Open(), set the crash fd, and use WerRegisterFile on it. That way it would end up getting the info.

(Assuming that the Go process is in a position to jump to WriteFile to actually write to the handle. WerRegisterMemoryBlock or similar is probably the more robust route.)

zx2c4 on 2 Dec 2020

I suppose a program can always do this by re-execing itself with standard error mapped to some descriptor. Or you can use a wrapper program to do this.

Is it ever useful to do this other than when starting the program?

ianlancetaylor on 3 Dec 2020

Oh, you know if all you're trying to do is dup2 over stderr on Windows, you can already hack the panic and println output fd like this:

    f, _ := os.Create(`C:\helloworld.txt`)
    windows.SetStdHandle(^uint32(11), windows.Handle(f.Fd()))
    panic("oh nose!")

That works reasonably well, even inside of services:

Changing the fd isn't particularly difficult, as you can see there, which is why I keep poking at, "what about running custom logger functions?", which is the much more interesting goal (to me, at least).

zx2c4 on 3 Dec 2020

Changing the fd isn't particularly difficult...

The issue is not just about changing the fd where to output, but also cleanly separating crash&debug output from both stdout and stderr.

egonelbre on 3 Dec 2020

Changing the fd isn't particularly difficult...

The issue is not just about changing the fd where to output, but also cleanly separating crash&debug output from both stdout and stderr.

In the context of a windows service, why do you care about stdout and stderr? They're essentially not really there. Or are you not actually operating within a windows service?

zx2c4 on 3 Dec 2020

In the context of a windows service, why do you care about stdout and stderr? They're essentially not really there. Or are you not actually operating within a windows service?

It's about regular linux programs, linux services/servers, windows programs and windows services/servers. Essentially, the problem is not restricted to windows service. See https://github.com/golang/go/issues/42888#issuecomment-736061869 as a good example for non-windows-service.

egonelbre on 3 Dec 2020

I just committed this for experimentation for my own stuff: https://git.zx2c4.com/wireguard-windows/commit/?id=6753ac7de518ee2ad58e6a2cd9f367cbf2e34ad6

The general idea is that I added a little hook to time_nofake.go:

var overrideWrite func(fd uintptr, p unsafe.Pointer, n int32) int32

// write must be nosplit on Windows (see write1)
//
//go:nosplit
func write(fd uintptr, p unsafe.Pointer, n int32) int32 {
        if overrideWrite != nil {
                return overrideWrite(fd, noescape(p), n)
        }
        return write1(fd, p, n)
}

Then, in my ringlogger global init function, I set that function pointer using go:linkname:

var Global *Ringlogger

//go:linkname overrideWrite runtime.overrideWrite
var overrideWrite func(fd uintptr, p unsafe.Pointer, n int32) int32

func InitGlobalLogger(tag string) error {
    if Global != nil {
        return nil
    }
    root, err := conf.RootDirectory(true)
    if err != nil {
        return err
    }
    Global, err = NewRinglogger(filepath.Join(root, "log.bin"), tag)
    if err != nil {
        return err
    }
    log.SetOutput(Global)
    log.SetFlags(0)
    overrideWrite = globalWrite
    return nil
}

Finally, that points to globalWrite, which is a super clunky fixed buffer buffering mechanism:

var globalBuffer [maxLogLineLength - 1 - maxTagLength - 3]byte
var globalBufferLocation int

//go:nosplit
func globalWrite(fd uintptr, p unsafe.Pointer, n int32) int32 {
    b := (*[1 << 30]byte)(p)[:n]
    for len(b) > 0 {
        amountAvailable := len(globalBuffer) - globalBufferLocation
        amountToCopy := len(b)
        if amountToCopy > amountAvailable {
            amountToCopy = amountAvailable
        }
        copy(globalBuffer[globalBufferLocation:], b[:amountToCopy])
        b = b[amountToCopy:]
        globalBufferLocation += amountToCopy
        foundNl := false
        for i := globalBufferLocation - amountToCopy; i < globalBufferLocation; i++ {
            if globalBuffer[i] == '\n' {
                foundNl = true
                break
            }
        }
        if foundNl || len(b) > 0 {
            Global.Write(globalBuffer[:globalBufferLocation])
            globalBufferLocation = 0
        }
    }
    return n
}

The invocation there of Global.Write writes into the mmap'd ringbuffer without allocations.

So far in testing, this scheme works well. It's not pretty, but it does work.

zx2c4 on 3 Dec 2020

As a data point, the gVisor project wants to redirect panic/throw output. It works around this problem by dup'ing over stderr with the FD of the desired crash output location (https://cs.opensource.google/gvisor/gvisor/+/master:runsc/cli/main.go;l=194-198?q=dup3&ss=gvisor%2Fgvisor). stderr is otherwise unused by the program, so this works OK. This is a workable solution, but it would certainly be nicer to be able to change the destination directly.

cc @steeve coz of https://github.com/znly/go/commit/6af49debb0b604c1bfe209d5801f985afb327c72

komuw on 4 Dec 2020

As a data point, the gVisor project wants to redirect panic/throw output. It works around this problem by dup'ing over stderr with the FD of the desired crash output location (https://cs.opensource.google/gvisor/gvisor/+/master:runsc/cli/main.go;l=194-198?q=dup3&ss=gvisor%2Fgvisor). stderr is otherwise unused by the program, so this works OK. This is a workable solution, but it would certainly be nicer to be able to change the destination directly.

cc @steeve coz of znly@6af49de

Hooking the panic itself like that is a lot cleaner than my trick of hooking the write1 function, and probably allows for capturing more info (such as logging for all goroutines).

zx2c4 on 4 Dec 2020

@alexbrainman , for writing to a file, I imagine you would open the file and then pass that handle to debug.SetCrashOutputFD (or whatever).

Windows uses syscall.Handle to write to files, not file descriptors. syscall.Handle values are meaningless outside of their process. For example, stderr syscall.Handle is not 3.

I don't know how the event log works,

See https://godoc.org/golang.org/x/sys/windows/svc/eventlog#Log for details. It uses ReportEvent Windows API.

I am OK if Event Log is not supported for crash dumps. If I could get my crash dumps in a file of my choice, that would be good enough for me.

but I imagine you'd start a second process with a pipe between them and the second process would be responsible for writing to the event log.

True.

But normal services are not structured this way. So you would need to change your program. And, if your program is crashing now, you just want to see the crash rather than start restructuring it into 2 processes.

And one process is simpler to program than 2 processes communicating with each other. Like @zx2c4 said, the Windows service process (the first process that Windows runs when it starts your service) receives events from Windows, and it needs to reply to these events. So the processes would have to communicate with each other. And I agree with @zx2c4 that using pipes between processes can be tricky: it needs to be 2-way communication because of the events received.

And the "monitor" process can still crash. How do you deal with that?

@egonelbre

The most basic thing is to send the output to a separate file:

Yes, writing to a file sounds fine. But then we should just use debug.SetCrashOutput("/tmp/a.txt").

The next step would be to create a watchdog sub-process:

See my objections above about converting single process into 2 processes.

You could also create governors using ExtraFiles:

Does ExtraFiles work on Windows?

// The other process has debug.SetCrashFD(3) in init,
// with some check that 3 actually exists.

I don't think 3 as syscall.Handle value will work. Do you think you can pass 3 to WriteFile Windows API?

For windows, I think (named) pipes could also be used:

Perhaps. Again. For this to work properly, someone needs to be reading from that pipe. Sounds too complicated for me.

Oh, you know if all you're trying to do is dup2 over stderr on Windows, you can already hack the panic and println output fd like this:

    f, _ := os.Create(`C:\helloworld.txt`)
    windows.SetStdHandle(^uint32(11), windows.Handle(f.Fd()))
    panic("oh nose!")

That works reasonably well, even inside of services:

This is nice. Looks simple enough. Hopefully I will remember this trick next time I debug crashing service.

Alex

alexbrainman on 5 Dec 2020

Does ExtraFiles work on Windows?

It does not.

I don't think 3 as syscall.Handle value will work. Do you think you can pass 3 to WriteFile Windows API?

Yes, passing handle doesn't work. The debug.SetCrashFD was an example on platforms that support ExtraFiles and passing fd.

And one process is simpler to program than 2 processes communicating with each other.

Yes, I agree.

All in all, I do agree that this proposal is not sufficient for making good crash handling for Windows Services.

With regards to using debug.SetCrashOutput("/tmp/a.txt"), this would mean that other platforms that do support ExtraFiles wouldn't be able to use a fd. (See gvisor example.)

egonelbre on 5 Dec 2020

Does ExtraFiles work on Windows?

It does not.

I wonder why not. StartProcess can pass Files just fine. Do you know what's different about ExtraFiles? Seems like that's a bug we should fix.

zx2c4 on 5 Dec 2020

@zx2c4 there's some more information in https://github.com/golang/go/issues/21085.

egonelbre on 5 Dec 2020

All in all, I do agree that this proposal is not sufficient for making good crash handling for Windows Services.

debug.SetCrashFD might be OK as an API. But I worry that Linux developers will write debug.SetCrashFD(3) and this will compile on Windows, and Windows developers would not know why this function does not work.

With regards to using debug.SetCrashOutput("/tmp/a.txt"), this would mean that other platforms that do support ExtraFiles wouldn't be able to use a fd. (See gvisor example.)

Fair enough.

Alex

alexbrainman on 6 Dec 2020

OK, well to circle back a bit, it sounds like the environment variable is not a great idea, at least not by itself, and that we should be thinking about debug.SetRuntimeFD(uintptr), which will accommodate a Unix fd or a Windows handle.

I believe this would apply to all runtime-generated prints (including things like GC traces) but not to user-generated ones (print and println).

And then perhaps on top of the SetRuntimeFD we might also want to allow GODEBUG=fd=N but that would probably only be useful on Unix.

Do I have that right?

rsc on 9 Dec 2020

Just noting that if we separate runtime prints from application prints we'll have to decide how to handle runtime.printBacklog. It currently gets both runtime prints and application prints, but maybe getting just runtime prints is better anyhow.

ianlancetaylor on 9 Dec 2020

Do I have that right?

From my point of view yes.

Also noting that Windows Services will need a different design to handle crashes, but that would be a different proposal.

egonelbre on 10 Dec 2020

Do I have that right?

From my point of view yes.

Also noting that Windows Services will need a different design to handle crashes, but that would be a different proposal.

Why do we require a different design for Windows Services?

I've proposed above several such things that work in a wide variety of cases, including Windows Services.

zx2c4 on 10 Dec 2020

Why do we require a different design for Windows Services?

I've proposed above several such things that work in a wide variety of cases, including Windows Services.

I'll try to summarize the discussions, but let me know if I missed some option/opinion:

A) Setting a callback directly, debug.SetPanicCallback(func() { ... })

Given the restrictions that the func needs to adhere to it's a really unsafe operation that's usually not exposed. (https://github.com/golang/go/issues/42888#issuecomment-736625441, https://github.com/golang/go/issues/42888#issuecomment-737338402)

B) Having a standard callback via linkname that can be used to hook into.

While this hides the unsafe API, it's still not usable for most users. I agree that writing such a func is possible, but it's more likely that people will get the usage wrong. (https://github.com/golang/go/issues/42888#issuecomment-737437206)

C) Changing std fd using windows.SetStdHandle

This doesn't separate runtime printing from user prints to stderr. It would be probably possible to create os.Stderr that is different from handle 2, basically making it difficult to use. I'm not sure what other restrictions this approach might have, e.g. maybe UWP?

Just to clarify, I agree that a callback is more powerful and solves the problems more widely, but due to its unsafe nature, it seems dangerous to expose it for common use. So, I'm not opposed to the idea, but it seems like a different proposal on how exactly to do that.

egonelbre on 10 Dec 2020

@egonelbre, your list does not include the option I mentioned above, namely debug.SetRuntimeFD(uintptr). That _would_ separate user prints from the Go-runtime-originated stderr, which it sounds like is what Windows Services needs. So I think debug.SetRuntimeFD _does_ work for the Windows Services case.

Am I missing something? We probably don't want to have to revisit this to do something different for Windows.

rsc on 16 Dec 2020

@rsc, I think @alexbrainman's comment above explained why debug.SetRuntimeFD is poorly suited to Windows services. My takeaway was that, while it's not impossible, Windows expects a particular process structure for services that makes having a second process just for funneling the FD into the event log awkward.

I'm somewhat loath to propose this, but: on Android, we have logic in the runtime that sends all prints to the syslog because it has a similar issue where stderr is just thrown away. I hate how complex that logic is for such a fundamental operation, but maybe it would make sense to do something similar for Windows services and have their writes go straight to the event log? IMO, in both cases this should be the OS's job, but it's not doing it.

aclements on 16 Dec 2020

... your list does not include the option I mentioned above ...

@rsc the list was a response to @zx2c4 and why his proposals didn't seem feasible to me at the moment, and not the list of all proposals in the discussion.

So I think debug.SetRuntimeFD does work for the Windows Services case.

Yes, I agree that it improves the situation. Being able to redirect the crash to a file is a step up.

I mentioned Windows Service solution as insufficient, because the conventional approach is to write to EventLog. Sometimes it's also very useful to include other metadata about the system state during the crash.

Of course, the different needs of Windows Service don't invalidate the usefulness of debug.SetRuntimeFD on Linux and Mac.

Adding a special case for Windows Services to write to EventLog also sounds reasonable. It would be safer than a callback. Though, it would require a way of setting or automatically deciding on an event source name.

egonelbre on 16 Dec 2020

I hate how complex that logic is for such a fundamental operation, but maybe it would make sense to do something similar for Windows services and have their writes go straight to the event log? IMO, in both cases this should be the OS's job, but it's not doing it.

Sticking that into the Go runtime itself is not a good idea. EventLog is complicated -- more so than the Android logger functions -- and as such will require configuration. This is the kind of thing best suited to go in x/sys/windows/svc or similar. The x/sys/windows package _already_ uses //go:linkname for necessary facilities exposed by the runtime. Adding one for setting the logger callback function would be very sensible. I'll send a patch for that.

zx2c4 on 16 Dec 2020

Change https://golang.org/cl/278792 mentions this issue: runtime: allow builtin write function to be redirected with function pointer

gopherbot on 16 Dec 2020

@zx2c4 but what does the x/sys/windows/svc side look like?

rsc on 17 Dec 2020
