Go: os/signal: notify should filter out runtime-generated SIGURG

Created on 19 Mar 2020  路  12Comments  路  Source: golang/go

This program, when running under go 1.14, will show a huge number of SIGURG are received even though none should be (no network IO, no explicit kills).

https://play.golang.org/p/x7hFjPZnrg5

package main

import (
    "fmt"
    "os"
    "os/signal"
    "syscall"
    "time"
)

func main() {
    ch := make(chan os.Signal, 1)
    signal.Notify(ch, syscall.SIGURG)

    for {
        select {
        case sig := <-ch:
            fmt.Printf("received %v: %s\n", sig, time.Now())
        default:
            _ = new(int) // generate some GC activities
        }
    }
}

https://go.googlesource.com/proposal/+/master/design/24543-non-cooperative-preemption.md changes to use SIGURG to preempt goroutines, which is fine, but I think os/signal.Notify should filter out those runtime generated SIGURGs.

The reason I found this is that I have a program that happens to use SIGURG as a custom signal like SIGUSR1, which works fine with previous Go releases. I understand that it should probably use SIGUSR1, but still I think os/signal should hide any signals generated by the runtime as it's irrelevant for the user and an implementation detail.

/cc @aclements thoughts?

NeedsInvestigation

Most helpful comment

Yeah, it's very difficult to filter out runtime-generated SIGURG without
also occasionally remove a real SIGURG.

I guess, a slightly better solutions is to make sure the runtime only uses
SIGURG when it's necessary.
My program can tolerate occasional spurious SIGURG, but in the current
runtime implementation, it's occurring too frequently.
(Just like the example program I posted, it contains enough preempt points
that forced preemption shouldn't be necessary.)

Perhaps we can repurpose this issue to "runtime: only use forced preemption
when necessary"? Thoughts?

All 12 comments

The reason I found this is that I have a program that happens to use SIGURG as a custom signal like SIGUSR1

Right. It was an explicit decision to pass these through to the program, in case the program is using SIGURG. Unfortunately, the runtime can't tell if a signal is "runtime generated" and thus can't filter it out. Because the kernel combines signals, it's not even well-specified what it would mean for a received signal to be runtime-generated because the kernel could have combined a runtime-generated signal with a signal from another source.

@changkun, I don't really understand your program. First, I don't understand the need for Gosched there. There's no tight loop, and even if there were, I don't see what that Gosched would accomplish. Second, if a program is going to subscribe to all signals, rather than passing specific signals of interest to signal.Notify, then it should be prepared to receive any signal. There are lots of reasons lots of signals get sent. For example, if I ran that program in a terminal emulator and started resizing the window, it would begin streaming warnings about unexpected SIGWINCHes.

@minux I don't see how it is possible even in principle for os/signal to filter out signals generated from the runtime package. Do you have any suggestions as to how that could be done?

signal.Notify is to capture process-targeted signal, right? The preemption signals are thread-targeted. Maybe we can filter signals of which sigcode is SI_TKILL (at least on Linux)?

That would not work if the kernel combines signals with different sigcode. Does it?

As far as I can tell, the Linux 5.3.12 kernel does combine signals with a different si_code value. In __send_signal I see

    if (legacy_queue(pending, sig))
        goto ret;

and

static inline bool legacy_queue(struct sigpending *signals, int sig)
{
    return (sig < SIGRTMIN) && sigismember(&signals->signal, sig);
}

where signals->signal is the set of pending signals.

Yeah, it's very difficult to filter out runtime-generated SIGURG without
also occasionally remove a real SIGURG.

I guess, a slightly better solutions is to make sure the runtime only uses
SIGURG when it's necessary.
My program can tolerate occasional spurious SIGURG, but in the current
runtime implementation, it's occurring too frequently.
(Just like the example program I posted, it contains enough preempt points
that forced preemption shouldn't be necessary.)

Perhaps we can repurpose this issue to "runtime: only use forced preemption
when necessary"? Thoughts?

I found a lot of SIGURGs will timeout the runtime/regex testes of mips64le.

The origin test of runtime only takes 43s to finish, now it's 600s.

https://farmer.golang.org/temporarylogs?name=linux-mips64le-mengzhuo&rev=dcf0929de6a12103a8fd7097abd6e797188c366d&st=0xc010fd8ea0

rt_sigreturn({mask=[]})                 = -1 ENOENT (No such file or directory)     
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = 824642013144                              
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = -1 (errno 824635811712)                   
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = 18446744073709551615                      
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = -1 (errno 824642012824)                   
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = -1 (errno 824642012416)                   
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = -1 (errno 824642002944)                   
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = -1 EDOM (Numerical argument out of domain)
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = 0                                         
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = -1 (errno 1098664997869)                  
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = 18446744073709551615                      
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = 27                                        
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = -1 EFBIG (File too large)                 
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = -1 ESRCH (No such process)                
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = -1 EPERM (Operation not permitted)        
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       
rt_sigreturn({mask=[]})                 = 18446744073709551615                      
sched_yield()                           = 0                                         
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=19904, si_uid=1000} ---       

@mengzhuo Can you open a separate issue for that? It sounds like the program is having a hard time making progress, which seems more like #37741. Thanks.

Perhaps we can repurpose this issue to "runtime: only use forced preemption when necessary"?

I've considered delaying the first attempt to use signals until some time after we've tried cooperative preemption and it's failed. Most of the time cooperative preemption responds very quickly. I'm not sure what to set that threshold to, though, since every millisecond of delay there is a millisecond added to STW if there's anything not cooperating.

Could you also provide a build option to enable (or never enable) preemptive? Runtime env flags would presumably override that...

That option already exists (GODEBUG=asyncpreemptoff=1), but, as it says on the tin, it's meant for debugging.

Um, I meant a compiler flag to set the default scheduler, which the env var could override.

Was this page helpful?
0 / 5 - 0 ratings