Hi everyone, we will be submitting a patch in the near future that adds Windows/ARM32 support to GO. All but a few tests are passing, and this implementation has been used to compile GO itself and run Docker containers on Windows/ARM32. We look forward to working with the community to iron out the last remaining issues and get this merged!
I'm curious why the port was developed out-of-tree. I'm asking because I'm afraid it's much more complicated to properly review such a big change, as a port to a new architecture is, in a single CL.
In any case, please check https://github.com/golang/go/blob/master/CONTRIBUTING.md and the linked from there https://golang.org/doc/contribute.html for more details.
(Personal wish: Please spell the name of the language as 'Go'.)
@jordanrh1 you should read https://github.com/golang/go/wiki/PortingPolicy
Alex
@cznic We were under a tight deadline and did not know if it was feasible. This is the earliest I could engage the community. We can break down the change into smaller chunks if necessary.
@alexbrainman Thanks for the link. We are committed to meeting the requirements outlined in the doc.
We are committed to meeting the requirements outlined in the doc.
SGTM.
Alex
Change https://golang.org/cl/125643 mentions this issue: dashboard: add windows-arm builder
Change https://golang.org/cl/125648 mentions this issue: cmd/link: support windows/arm
Change https://golang.org/cl/125649 mentions this issue: debug/pe: support windows/arm
Change https://golang.org/cl/125645 mentions this issue: runtime: add definitions for windows/arm
Change https://golang.org/cl/125647 mentions this issue: cmd/api: support windows/arm
Change https://golang.org/cl/125646 mentions this issue: cmd/dist: support windows/arm
Change https://golang.org/cl/128715 mentions this issue: runtime: support windows/arm
Change https://golang.org/cl/128716 mentions this issue: syscall: support windows/arm
Change https://golang.org/cl/130056 mentions this issue: cmd/dist: support windows/arm
Change https://golang.org/cl/131282 mentions this issue: cmd/vet: remove exclusions for callbackasm
Change https://golang.org/cl/127665 mentions this issue: windows: add support for windows/arm
Change https://golang.org/cl/137015 mentions this issue: cmd/vendor: update golang.org/x/sys/windows for windows/arm support
Sounds like the builder is still failing (#28854). @jordanrh1, what work remains to have this port stable for the release?
These are the failing runtime tests:
Some are due to out of memory, and the rest I believe are due to issues with traceback.
It looks like there are some new failures in html/template.
Some are due to out of memory,
I am pretty sure it is OK to skip tests that take too much memory on RPI.
and the rest I believe are due to issues with traceback.
Please, see if you can debug and fix it. Feel free to ask questions - everyone will help.
It looks like there are some new failures in html/template.
I can see from your comment https://github.com/golang/go/issues/28854#issuecomment-440520454 it could be same bug.
Alex
Change https://golang.org/cl/153518 mentions this issue: builders: skip writing snapshop for windows/arm builder
Change https://golang.org/cl/153719 mentions this issue: runtime/pprof: fix TestCPUProfileWithFork on Windows/ARM
Change https://golang.org/cl/153718 mentions this issue: runtime: fix profiling on windows/ARM
Change https://golang.org/cl/153839 mentions this issue: runtime: pass LR to sigprof on windows/arm
Change https://golang.org/cl/153844 mentions this issue: link/internal/ld: fix TestRuntimeTypeAttrInternal on windows/arm
Change https://golang.org/cl/153830 mentions this issue: cmd/link: skip TestRuntimeTypeAttrInternal on windows/arm
Change https://golang.org/cl/154357 mentions this issue: cmd/nm: report windows/arm as relocatable in TestGoExec
@hyangah Does the order of regions matter in TestAnalyzeAnnotations? This test is failing on windows/arm because the order of regions is different from what is expected.
On Linux:
task 1: task0
start: 33721 end: 62861 complete: true
2 goroutines
4 regions:
task0.region0(goid=25)
(goid=26)
task0.region1(goid=26)
task0.region2(goid=26)
1 children:
task1
On Windows:
task 1: task0
start: 1006000 end: 673010992 complete: true
2 goroutines
4 regions:
task0.region0(goid=43)
(goid=44)
task0.region2(goid=44)
task0.region1(goid=44)
1 children:
task1
On Linux, the order of regions is 0, (unnamed), 1, 2. On Windows, the order of regions is 0, (unnamed), 2, 1. Is ordering significant or should the test be changed to accept an arbitrary ordering of regions?
Change https://golang.org/cl/154560 mentions this issue: windows: use netevent.dll in TestFormatMessage for windows/arm
@jordanrh1 the regions are supposed to be sorted based on the timestamp so I think it needs investigation.
Can you capture the trace with the command and share it?
go test -run=TestAnalyzeAnnotations runtime/trace -savetraces
@hyangah, Here is the trace, thank you for looking :)
TestAnalyzeAnnotations.trace.zip
Change https://golang.org/cl/154777 mentions this issue: cmd/trace: force regions order in TestAnalyzeAnnotations
@jordanrh1, thank you for the trace.
According to the trace, the system didn't seem to provide the time resolution required for this test, or for the execution tracer to be useful. All the traced events were timestamped either 0 or 1013008ns. So, sorting based on the timestamp doesn't guarantee the unique ordering this test assumes.
I sent the cl/154777 to slow the traced program (hoping that 1ms gap between events is sufficient).
Alternatively, we can just skip the test on windows.
@hyangah Thanks for looking. The high resolution clock runs at 1 MHz on this machine, which I think would be fast enough to get meaningful timestamps. There could be a bug in the timestamp code. I will investigate.
@jordanrh1 I see. The traced code is just creating the events back to back, so requiring them to have distinct timestamps is unreasonable. They all can happen within 1usec and end up with the same timestamp. So, I think the assumption the test makes is wrong. I will just sort the region list based on the name.
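To illustrate why tied timestamps make the region order ambiguous, here is a minimal sketch. The `region` type and both sort helpers are hypothetical stand-ins, not the types or code used by cmd/trace, and the tie-break by name is only one possible way to get a deterministic order; the actual CL may do it differently.

```go
package main

import (
	"fmt"
	"sort"
)

// region is a hypothetical stand-in for a traced region.
type region struct {
	Name  string
	Start int64 // timestamp in nanoseconds
}

// sortByStart orders regions by timestamp only. Entries with equal
// timestamps keep their input order, so the result depends on how the
// events happened to be recorded.
func sortByStart(rs []region) {
	sort.SliceStable(rs, func(i, j int) bool { return rs[i].Start < rs[j].Start })
}

// sortByStartThenName breaks timestamp ties by name, so the result no
// longer depends on the input order.
func sortByStartThenName(rs []region) {
	sort.SliceStable(rs, func(i, j int) bool {
		if rs[i].Start != rs[j].Start {
			return rs[i].Start < rs[j].Start
		}
		return rs[i].Name < rs[j].Name
	})
}

// names extracts the region names for printing.
func names(rs []region) []string {
	out := make([]string, len(rs))
	for i, r := range rs {
		out[i] = r.Name
	}
	return out
}

func main() {
	// region1 and region2 share one coarse timestamp, as in the trace above.
	in := []region{
		{Name: "task0.region2", Start: 1013008},
		{Name: "task0.region1", Start: 1013008},
		{Name: "task0.region0", Start: 0},
	}

	a := append([]region(nil), in...)
	sortByStart(a)
	fmt.Println("by timestamp:", names(a))

	b := append([]region(nil), in...)
	sortByStartThenName(b)
	fmt.Println("with name tie-break:", names(b))
}
```

With only the timestamp as the key, region2 stays ahead of region1 because that is how they arrived; with the tie-break the order is fixed.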
Change https://golang.org/cl/154817 mentions this issue: windows/svc: use wevtutil.exe instead of powershell for windows/arm
@hyangah I'm seeing a resolution of about 1ms in nanotime() on windows/arm. I think there is a problem with the implementation.
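A rough way to observe the clock granularity from user code is sketched below. It uses `time.Now` rather than the runtime-internal `nanotime()`, so it only approximates what the runtime sees; the helper name `observedResolution` is made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// observedResolution returns the smallest non-zero step seen between
// consecutive time.Now readings, over n observed clock advances.
func observedResolution(n int) time.Duration {
	var min time.Duration
	prev := time.Now()
	for i := 0; i < n; {
		now := time.Now()
		d := now.Sub(prev) // uses the monotonic reading when present
		if d <= 0 {
			continue // clock has not advanced yet; spin
		}
		if min == 0 || d < min {
			min = d
		}
		prev = now
		i++
	}
	return min
}

func main() {
	fmt.Println("smallest observed clock step:", observedResolution(100))
}
```

A result near 1ms would be consistent with the resolution described above, while a sub-microsecond result would suggest the clock source is fine-grained.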
Change https://golang.org/cl/154758 mentions this issue: runtime: use QPC for nanotime and time.now on windows/arm
@hyangah FYI, the test is passing after fixing the time functions.
Change https://golang.org/cl/154761 mentions this issue: Revert "runtime: use QPC for nanotime and time.now on windows/arm"
Change https://golang.org/cl/154762 mentions this issue: runtime: use QPC to implement cputicks() on windows/arm
Getting closer:
The failing test is TestCallbackPanic(). I'm suspecting stack corruption.
I also want to modify cputicks() to call onosstack(_QueryPerformanceCounter, ...), and I want to modify onosstack() to not modify any global data. These routines are called deep in the scheduler and I am concerned they could be corrupting state.
Here is the issue with TestCallbackPanic().
TestCallbackPanic() tests what happens when Go code running in a callback panics. Go loads a native dll (user32.dll), calls a function that is expected to invoke a callback (EnumWindows), and then panics within the callback. It registers deferrals at several points, and verifies that the deferrals run and that the panic can be recovered from. In the Go function that calls EnumWindows, it registers a deferral to unload user32.dll when the function returns. That way, user32.dll will be unloaded no matter what, even if EnumWindows panics. This function looks like this:
// nestedCall calls into Windows, back into Go, and finally to f.
func nestedCall(t *testing.T, f func()) {
    c := syscall.NewCallback(callback)
    d := GetDLL(t, "user32.dll")
    defer d.Release()
    d.Proc("EnumWindows").Call(c, uintptr(*(*unsafe.Pointer)(unsafe.Pointer(&f))))
}
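The control flow that the test checks (the deferred release runs even though the callback panics, and the panic can still be recovered further up) can be sketched in portable Go. This is only an analogy: `fakeDLL`, `run`, and the `events` log are hypothetical, and no Windows call or real callback is involved.

```go
package main

import "fmt"

// events records the order in which things happen.
var events []string

// fakeDLL is a hypothetical stand-in for the DLL handle used in the test.
type fakeDLL struct{ name string }

// Release logs the unload instead of calling FreeLibrary.
func (d *fakeDLL) Release() { events = append(events, "release "+d.name) }

// nestedCall mimics the shape of the test helper: acquire a resource,
// defer its release, then call f. In the real test f is reached through
// EnumWindows and a callback rather than directly.
func nestedCall(f func()) {
	d := &fakeDLL{name: "user32.dll"}
	defer d.Release()
	f()
}

// run calls nestedCall with a panicking function and recovers the panic.
func run() (recovered interface{}) {
	defer func() {
		recovered = recover()
		events = append(events, "recovered")
	}()
	nestedCall(func() {
		events = append(events, "callback")
		panic("callback panicked")
	})
	return nil
}

func main() {
	r := run()
	fmt.Println(events)
	fmt.Println("recovered value:", r)
}
```

The log shows the callback first, then the release, then the recovery, which is the ordering that makes the unload run while the interrupted call has not cleaned up after itself.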
The problem arises in the interaction between EnumWindows and Go's panic unwinding logic. When Go encounters a panic in a callback, it unwinds the g0 stack to the point before the outgoing cgo call that invoked the callback was made. That means in this case it will reset the g0 stack pointer to what it was just before the call to EnumWindows. If you look at the stack pointer on entry of EnumWindows, and on entry of FreeLibrary (which is called by defer d.Release()), they are the same value. It is my understanding that this is by design. The logic that restores the stack pointer is in func unwindm:
    // Restore sp saved by cgocallback during
    // unwind of g's stack (see comment at top of file).
    mp := acquirem()
    sched := &mp.g0.sched
    switch GOARCH {
    default:
        throw("unwindm not implemented")
    case "386", "amd64", "arm", "ppc64", "ppc64le", "mips64", "mips64le", "s390x", "mips", "mipsle":
        sched.sp = *(*uintptr)(unsafe.Pointer(sched.sp + sys.MinFrameSize)) // <-- restore sp from saved value on the stack
    case "arm64":
        sched.sp = *(*uintptr)(unsafe.Pointer(sched.sp + 16))
    }
It restores the g0 stack pointer from the value saved on the stack by cgocallback. cgocallback saves m->g0->sched.sp to the stack, which was previously written by cgocall when EnumWindows was called.
havem:
    // Now there's a valid m, and we're running on its m->g0.
    // Save current m->g0->sched.sp on stack and then set it to SP.
    // Save current sp in m->g0->sched.sp in preparation for
    // switch back to m->curg stack.
    // NOTE: unwindm knows that the saved g->sched.sp is at 4(R13) aka savedsp-8(SP).
    MOVW m_g0(R8), R3
    MOVW (g_sched+gobuf_sp)(R3), R4 // <-- load last saved value of g0's SP (when EnumWindows was invoked)
    MOVW R4, savedsp-8(SP) // <-- save this value to the stack, where unwindm knows where to find it
    MOVW R13, (g_sched+gobuf_sp)(R3)
So, the fact that we see FreeLibrary execute with the same stack pointer as EnumWindows seems to be by design. EnumWindows was not allowed to complete since a panic occurred while it was calling back into Go code, and its stack was deallocated. This shouldn't be an issue, since it is generally an error to maintain references to stack local memory that outlive the stack frame.
This is where things get weird. EnumWindows isn't as straightforward as you might think. It allocates memory from the heap before invoking the callback, and frees this memory before it returns. It uses two mechanisms to ensure this memory gets cleaned up in case the callback experiences an error:
When the user32 module is unloaded, its garbage collector runs and scans for outstanding resources, which include outgoing calls that have not been completed, and their associated resources. Since the SEH handler never ran (because an explicit call to panic() does not trigger SEH handlers), the global tracking database still has a reference to the bookkeeping information that was stored on the stack. However, this stack space was deallocated, and new code is running over it, which corrupts the data. The bookkeeping code faults when it loads a bad pointer from the stack.
As you can see, this error occurs due to very specific interaction between Go's deferral mechanism, Go's callback mechanism, and user32's resource tracking mechanism. In order for C code to be go-panic-safe, it should not allocate any resources, and should not maintain any references to stack memory.
In terms of fixes, I'm not sure there's a way to ensure that C code which relies on SEH or C++ exceptions has a chance to clean up in case of a go panic. It seems best to recover() in the callback, and not panic across callback boundaries. Another option is to find a replacement for EnumWindows that is safe to panic across and use that for the test instead. @alexbrainman @ianlancetaylor, what do you think?
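As a sketch of the "recover() in the callback" option, a callback body could be wrapped so that a panic turns into an ordinary return value and never unwinds through the foreign caller. The `guard` helper is hypothetical, the single-argument signature is a simplification, and returning 0 to mean "stop" is an assumption modeled on EnumWindows-style callbacks. On Windows the wrapped function would be what gets passed to syscall.NewCallback; the example itself stays portable.

```go
package main

import "fmt"

// guard wraps a callback body so that a panic inside it is recovered
// before control returns to the (foreign) caller. onPanic receives the
// recovered value so it can be logged or re-raised later from Go code.
func guard(body func(arg uintptr) uintptr, onPanic func(v interface{})) func(arg uintptr) uintptr {
	return func(arg uintptr) (ret uintptr) {
		defer func() {
			if v := recover(); v != nil {
				onPanic(v)
				ret = 0 // assumed convention: 0 tells the caller to stop
			}
		}()
		return body(arg)
	}
}

func main() {
	var caught interface{}
	cb := guard(func(arg uintptr) uintptr {
		if arg == 2 {
			panic("bad window")
		}
		return 1 // continue enumeration
	}, func(v interface{}) { caught = v })

	r1 := cb(1)
	r2 := cb(2)
	fmt.Println(r1, r2, caught)
}
```

The Go caller can inspect the recorded value after the foreign call returns and decide whether to panic again at that point, once the foreign frames have unwound normally.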
Any idea why the test passes on x86? What's the key difference?
I think it has to do with how subsequent code overwrites the stack. If the data structure on the stack doesn't get overwritten, the error will not manifest. amd64 and x86 have the same underlying issue, but they must not corrupt the bookkeeping data structure.
I'm not sure there's a way to ensure that C code which relies on SEH or C++ exceptions has a chance to clean up in case of a go panic.
I agree. This is not implemented, and it won't be simple to implement.
It seems best to recover() in the callback, and not panic across callback boundaries.
I agree in general, but I don't think it is acceptable for our runtime tests. I think Go users do write code that panics in a callback and lets the panic propagate into the calling function. We should try and make it work, if possible.
Another option is to find a replacement for EnumWindows that is safe to panic across and use that for the test instead
That sounds good. I don't remember who came up with EnumWindows to test it all, but anything else should be fine too, as long as it works and is easily available on any Windows computer that is supported by Go.
Alex
@ianlancetaylor One of the reasons it doesn't readily manifest on amd64/386 is that amd64/386 tests run on Windows Desktop, which has a different implementation of user32 than IoT Core, which is where the ARM tests run. IoT Core uses 'minuser', which is a completely rewritten window manager. I was able to reproduce the exact crash on AMD64 by running the test on IoT Core. In the deferral, I called a function that used a large chunk of g0 stack (in a legal way), to ensure that any stack based data structures still referenced would be overwritten with garbage data. Later, when the user32 garbage collector crashed, it crashed in exactly the same place as ARM, because the stack-based bookkeeping structure had been overwritten in the deferral.
On ARM, to further confirm that stack corruption was the issue, I inserted code in callbackasm1 to save the current g0 SP so that when unwindm restores the g0 SP, it will restore it to a value that is safely below the stack for EnumWindows. With this change, the crash went away because the stack of EnumWindows was no longer being overwritten in the deferral. The test ran many times successfully.
Given these findings I have increased confidence that this is the issue.
Change https://golang.org/cl/155923 mentions this issue: runtime: use EnumTimeFormatsEx instead of EnumWindows in callback tests
Exciting stuff:
@jordanrh1 Is there an issue for porting to Windows/ARM64 too? Thanks.
@petemoore There is not.
It seems exes built with GOOS=windows GOARCH=arm with the current master won't run on Windows RT 8.1, failing with "This app can't run on your PC". This is with code signature checking subverted. I haven't looked into it yet so I'm unsure of details and I know this platform is out of support, but is this a configuration that theoretically should be working? By the way, fantastic work.
You also need to specify GOARM=7. I'll look into whether there's a way to do this by default.
Unfortunately it's still failing with the same error. I made sure to clear the Go build cache and delete the output .exe, so I'm fairly sure it is getting rebuilt properly. I've put my test file / command below.
$ cat hello.go
package main
import "fmt"
func main() {
fmt.Println("hello, world")
}
$ GOOS=windows GOARCH=arm GOARM=7 ./go/bin/go build hello.go
If you have time, try starting it under the usermode debugger, and see if it offers any clues:
windbg.exe hello.exe
If it fails to start, then it's probably being rejected by the loader. To troubleshoot loader errors, connect a kernel debugger and turn on loader snaps:
!gflag +sls
Then try running the program. Loader spew will be printed to the windbg window.
I definitely have time to do some tests and try to fix this, but likewise, don't worry about this platform too much if nobody considers it important!
Unfortunately it appears security restrictions in Windows RT prevent the debugger from creating or attaching to processes. If you know of any other way to get log messages out of the linker / loader I'll happily try that, otherwise I'll reply later if I end up getting the debugger working somehow.
Edit: Running the program a different way, I can see it produces ERROR_BAD_EXE_FORMAT (%1 is not a valid Win32 application.)
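When a debugger cannot be attached, one thing that can be checked off-device is what the PE header of the built .exe declares. The sketch below reads the machine type and subsystem version with the standard debug/pe package. The machine constants are taken from the PE/COFF format, the names `machineName` and `inspect` are made up for this example, and nothing here establishes which header values Windows RT 8.1 accepts; it only shows how to read them for comparison with a binary known to run on the device.

```go
package main

import (
	"debug/pe"
	"fmt"
	"os"
)

// Machine values as defined by the PE/COFF format.
const (
	machineI386  = 0x14c
	machineAMD64 = 0x8664
	machineARM   = 0x1c0  // ARM little endian
	machineARMNT = 0x1c4  // ARM Thumb-2 little endian
	machineARM64 = 0xaa64 // ARM64 little endian
)

// machineName maps a COFF machine value to a readable label.
func machineName(m uint16) string {
	switch m {
	case machineI386:
		return "i386"
	case machineAMD64:
		return "AMD64"
	case machineARM:
		return "ARM"
	case machineARMNT:
		return "ARMNT (Thumb-2)"
	case machineARM64:
		return "ARM64"
	}
	return fmt.Sprintf("unknown (0x%x)", m)
}

// inspect prints the machine type and, for 32-bit images, the subsystem
// version recorded in the optional header.
func inspect(path string) error {
	f, err := pe.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	fmt.Println("machine:", machineName(f.FileHeader.Machine))
	if oh, ok := f.OptionalHeader.(*pe.OptionalHeader32); ok {
		fmt.Printf("subsystem version: %d.%d\n", oh.MajorSubsystemVersion, oh.MinorSubsystemVersion)
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: peinfo file.exe")
		return
	}
	if err := inspect(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Running it against hello.exe and against a system binary copied from the device would show whether the two headers differ in these fields.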
@jordanrh1 I managed to run windows-arm on one of my RPIs. The only bit I am still missing is the git command. What do you do for git on windows-arm?
@bradfitz is there a git command that is written in pure Go? I would just compile it to windows-arm and use it there. I would not need anything fancy, just basic git functionality.
Thank you.
Alex
@alexbrainman, the builders don't use git. They only run the x/build/cmd/buildlet binary and the cmd/coordinator pushes code in tarballs at them.
@alexbrainman, the builders don't use git.
Fair enough. I suppose I have to find a way to develop on this thing.
Alex
I do all my development on linux and cross compile to windows/arm.
I do all my development on linux and cross compile to windows/arm.
I do too.
But I was trying to fix #30711. And that involved running cmd/go.TestScript, and cmd/go.TestScript runs the go.exe executable. So I needed all moving parts. I used $GOROOT/src/bootstrap.bash (luckily for me it worked without a hitch) to create a snapshot of the current Go tree with executables and built packages, and copied all these files onto the RPI. And then I ran the test.
I did not have Git, so I had to be careful when changing files to fix #30711. The fix was small enough for me to manage without Git. But I would not use that setup for anything bigger than that.
RPI runs standard Windows file and printer sharing services, and I was able to mount shares from RPI on my Windows PC. Maybe (I did not try that) I could just use Git on my Windows PC to manage Go repo installed on RPI. I will see next time I use that.
Another thing. While debugging, I searched for a windows-arm prebuilt image on https://golang.org/dl and it is not there. @bradfitz maybe we should put the latest windows-arm image at https://golang.org/dl. I doubt there is a need for an .msi file, but we should put a .zip file alongside 386 and amd64.
Also @jordanrh1 I struggled to get Windows 10 running on some of my RPIs.
I have an RPI 3 B+. I downloaded "IoT Dashboard", and it gives me the option of installing "Windows 10 IoT Core (17763)". I created an SD card (SanDisk Ultra 16GB), but the RPI does not boot - the red power light comes on, and I get a rainbow image on my HDMI screen, but nothing happens.
I also have RPI 2 B. I tried booting this with the same image - red light is on, green light flashes 4 slow times followed by 4 quick times, same rainbow image on the screen.
I managed to run Windows on RPI 3 B. But I would like to make my other RPIs run Windows. Do you have any suggestions?
Also I noticed that the windows-arm builder is not listed on https://build.golang.org. Can you please investigate what happened to it?
I also wonder, if we could run windows-arm build on multiple RPI devices to make it faster?
Maybe we could even run it on virtual hardware (crazy idea). Like
https://www.newventuresoftware.com/blog/running-windows-10-iot-core-in-a-virtual-machine
or
Thank you.
Alex
@jordanrh1, per http://golang.org/wiki/PortingPolicy#requirements-for-a-new-port, “[t]he builder must already be running” for new ports. I don't see any results for the windows-arm builder on https://build.golang.org since March 20.
Has this port been abandoned?
Hi - there is already a port available at https://github.com/thongtech/go-windows-arm64 - does this help?