While following the Prow Getting Started documentation, the following crash was encountered with the hook and tide containers. This occurs at the step "Run the prow components in the cluster" just after "Create the GitHub Secrets". The version of starter.yaml being used is the one from roughly a week ago.
This deploy was attemped on a Kops cluster with a private topology and calico networking in AWS.
I expect that this is some kind of configuration error on my part (which I'm not expecting help with), however, the error is not obvious due to the following panic, which can hopefully be handled and corrected with a proper error message in a future version:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0xb16ae2]
goroutine 1 [running]:
k8s.io/test-infra/prow/git.(*Client).Clean(0x0, 0xc00006a040, 0xc00017db90)
prow/git/git.go:67 +0x22
k8s.io/test-infra/prow/flagutil.(*GitHubOptions).GitClient.func1(0xc00017dc28, 0xc00017dc20)
prow/flagutil/github.go:96 +0x43
k8s.io/test-infra/prow/flagutil.(*GitHubOptions).GitClient(0xc00007d728, 0xc000039800, 0x0, 0x0, 0xf56440, 0xc0003f1d60)
prow/flagutil/github.go:107 +0x22e
main.main()
prow/cmd/hook/main.go:123 +0x413
/kind bug
/priority important-soon
cc @cjwagner
I've been trying to reproduce this error for a while, but cannot.
Your error message suggests that client is nil on line 96, but client is not modified after it is created and is only created as nil if an error is returned (see here). If an error was returned by NewClient we would not have reached line 96. https://github.com/kubernetes/test-infra/blob/0bf3b91c5bb19672ce931cfab0c8a496464aa571/prow/flagutil/github.go#L90-L98
I feel like I'm either missing something obvious or something else was changed on your end.
Nothing else was changed with regards to Tide/Prow on my end. I'm wondering if this is some odd garbage collection issue (_Extremely Narrator voice: It wasn't_) where client has been reaped (and thus becomes nil) before the method is called on it.
If we pass the client into the closure, perhaps this would be enough of a hint to Go to not let that happen.
// Current implemention of the closure
defer func() {
if err != nil {
client.Clean()
}
}()
// Suggested
defer func(client *git.Client) {
if err != nil {
client.Clean()
}
}(client)
Unfortunately, I'm not yet up to speed on building tide/hook so I have been unable to test this change outside of a small test case I built for myself.
Thinking about this some more, I think this is the correct fix. It's related to the use of the named return values and the lack of explicitly passing client into the closure.
When an error is encountered, return nil, fmt.Errorf("thing") happens. This sets the client and err values to nil and fmt.Errorf("thing").
Then the defer kicks in. Client is nil at this point, because we return nil for the client value when err != nil and the defer is using the value of client from the named return.
Passing client into the anonymous function allows the defer to capture the value of client at the moment the defer is created and should allow the defer to work as intended in the event of an error.
Linking my small Go PoC for this showing the behaviour of the named variables and defer capturing the client value, etc.
https://gist.github.com/phyber/01871748db8fdbb9b7b8d5b7e1af7f08
@phyber Nice catch!
FYI https://play.golang.org/ is great for sharing golang examples and POCs.
Most helpful comment
Thinking about this some more, I think this is the correct fix. It's related to the use of the named return values and the lack of explicitly passing
clientinto the closure.When an error is encountered,
return nil, fmt.Errorf("thing")happens. This sets theclientanderrvalues tonilandfmt.Errorf("thing").Then the
deferkicks in. Client isnilat this point, because we returnnilfor theclientvalue whenerr != niland thedeferis using the value ofclientfrom the named return.Passing
clientinto the anonymous function allows thedeferto capture the value ofclientat the moment thedeferis created and should allow thedeferto work as intended in the event of an error.