Go: race: not working with Alpine based image

Created on 23 Feb 2016 · 74 Comments · Source: golang/go

Hi,

With the following docker image (save it as demo.docker)

FROM golang:1.6.0-alpine
MAINTAINER "[email protected]"

RUN apk add --update alpine-sdk \
    && rm -rf /var/cache/apk/*

run

docker build -f demo.docker -t race/demo .

Then you can finally run the command:

PROJECT_DIR="${PWD}" #assume we are in $GOPATH/src/github.com/dlsniper/demo on the computer
CONTAINER_PROJECT_DIR="/go/src/github.com/dlsniper/demo"
CONTAINER_PROJECT_GOPATH="${CONTAINER_PROJECT_DIR}/vendor:/go"

docker run --rm \
        --net="host" \
        -v ${PROJECT_DIR}:${CONTAINER_PROJECT_DIR} \
        -e CI=true \
        -e GODEBUG=netdns=go \
        -e CGO_ENABLED=1 \
        -e GOPATH=${CONTAINER_PROJECT_GOPATH} \
        -w "${CONTAINER_PROJECT_DIR}" \
        race/demo \
        go test -v -race ./...

This will fail with:

# runtime/race
race_linux_amd64.syso: In function `__sanitizer::InternalAlloc(unsigned long, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<0ul, 140737488355328ull, 0ul, __sanitizer::SizeClassMap<17ul, 64ul, 14ul>, 20ul, __sanitizer::TwoLevelByteMap<32768ull, 4096ull, __sanitizer::NoOpMapUnmapCallback>, __sanitizer::NoOpMapUnmapCallback> >*)':
gotsan.cc:(.text+0x1681): undefined reference to `__libc_malloc'
race_linux_amd64.syso: In function `__sanitizer::ReExec()':
gotsan.cc:(.text+0xd937): undefined reference to `__libc_stack_end'
race_linux_amd64.syso: In function `__sanitizer::InternalFree(void*, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<0ul, 140737488355328ull, 0ul, __sanitizer::SizeClassMap<17ul, 64ul, 14ul>, 20ul, __sanitizer::TwoLevelByteMap<32768ull, 4096ull, __sanitizer::NoOpMapUnmapCallback>, __sanitizer::NoOpMapUnmapCallback> >*)':
gotsan.cc:(.text+0x5ec8): undefined reference to `__libc_free'
collect2: error: ld returned 1 exit status

If you then disable CGO and run again:

PROJECT_DIR="${PWD}" #assume we are in $GOPATH/src/github.com/dlsniper/demo on the computer
CONTAINER_PROJECT_DIR="/go/src/github.com/dlsniper/demo"
CONTAINER_PROJECT_GOPATH="${CONTAINER_PROJECT_DIR}/vendor:/go"

docker run --rm \
        --net="host" \
        -v ${PROJECT_DIR}:${CONTAINER_PROJECT_DIR} \
        -e CI=true \
        -e GODEBUG=netdns=go \
        -e CGO_ENABLED=0 \
        -e GOPATH=${CONTAINER_PROJECT_GOPATH} \
        -w "${CONTAINER_PROJECT_DIR}" \
        race/demo \
        go test -v -race ./...

It will result in the following output:

go test: -race requires cgo; enable cgo by setting CGO_ENABLED=1

Previously, in go 1.5.3, when running with CGO disabled, this used to fail with:

# testmain
runtime/race(.text): __libc_malloc: not defined
runtime/race(.text): getuid: not defined
runtime/race(.text): pthread_self: not defined
runtime/race(.text): madvise: not defined
runtime/race(.text): madvise: not defined
runtime/race(.text): madvise: not defined
runtime/race(.text): sleep: not defined
runtime/race(.text): usleep: not defined
runtime/race(.text): abort: not defined
runtime/race(.text): isatty: not defined
runtime/race(.text): __libc_free: not defined
runtime/race(.text): getrlimit: not defined
runtime/race(.text): pipe: not defined
runtime/race(.text): __libc_stack_end: not defined
runtime/race(.text): getrlimit: not defined
runtime/race(.text): setrlimit: not defined
runtime/race(.text): setrlimit: not defined
runtime/race(.text): setrlimit: not defined
runtime/race(.text): exit: not defined
runtime/race(.text.unlikely): __errno_location: not defined
runtime/race(.text): undefined: __libc_malloc
/usr/local/go/pkg/tool/linux_amd64/link: too many errors

To test this just change the base image for the container.

Please let me know if there are any additional details I can share.

Thank you.

NeedsFix help wanted

All 74 comments

CC @dvyukov

As far as I know this is because race.syso assumes a glibc based system. I don't know that there is a simple fix here.

I can't think of any simple fix. A complex fix would be to remove all dependencies on libc from race runtime (there is an open issue for that).
alpinelinux wiki suggests that there are some ways to run glibc-based programs on alpine:
http://wiki.alpinelinux.org/wiki/Running_glibc_programs
Don't know whether it will help for race detector or not.

Hi all,

Just wondering: does anyone have a complete working example of a Go program working with -race, ideally not much different from the official Alpine-based build?

Out of interest, are the flags not tested? I would imagine that this sort of issue would be picked up at compile time, unless the Alpine-based build is configured to include it... No idea, just guessing.

Anyway, a working sample would be much appreciated.

I have managed to build a race_linux_amd64.syso that works on Alpine. Here are the necessary changes: https://github.com/neelance/compiler-rt/commit/32aa655a999d33a4034ec4aea00ed3df1f9e77df I haven't verified that it still works on other Linux platforms.

To build on Alpine:

  1. Clone https://github.com/neelance/compiler-rt/
  2. Go into /compiler-rt/lib/tsan/go/
  3. ./buildgo.sh (The test step crashes, but the library works anyways. My guess is that the test does not use a proper Go environment.)
  4. Copy race_linux_amd64.syso to $GOROOT/src/runtime/race/

I'm not sure how to continue from here. Any suggestions on getting that upstream?

Here are instructions on how to contribute to sanitizers:
https://github.com/google/sanitizers/wiki/AddressSanitizerHowToContribute

I skimmed through your patch and I think we can upstream something like this. But we need some testing story, otherwise it will break in future.

One good test might be to whitelist all libc symbol dependencies in buildgo.sh. This way we can ensure that we won't silently add a new dependency (ideally in future this list becomes empty).

@dvyukov I see that you did most of the updates of the syso files. I have some more questions:

  1. How can we test the patch properly on the Go end across all platforms so we don't upstream something stupid?
  2. Can we somehow add Alpine to the platforms being tested?
  3. We need to fix that crash on the test phase of buildgo.sh. I've investigated it a bit, seems like mmap is failing for some shadow memory. My wild guess right now is that it expects a Go memory layout, but the test is a C program. That was as far as I could get without spending a massive amount of time to understand how that thread sanitizer works.

How can we test the patch properly on the Go end across all platforms so we don't upstream something stupid?

I now use golang.org/x/build/cmd/racebuild which builds and tests race runtime for all platforms. But it requires gomote access (I think you need to be a committer for Go repo). If you don't have that access, I will do testing. But test at least on Alpine and on glibc-based Linux.

Can we somehow add Alpine to the platforms being tested?

You need to either run your own Go builder and connect it to dashboard, or we need to setup Alpine linux builder on GCE. @bradfitz, do we want to setup/maintain it?

seems like mmap is failing for some shadow memory. My wild guess right now is that it expects a Go memory layout

It's possible. Does removing -fPIC -fpie from buildgo.sh help? We still need to build the runtime with -pie, but the test itself does not need to be pie.

#17891 tracks adding an Alpine builder. @jessfraz has a (stalled :)) CL in progress.

We now have Alpine builders, so in theory golang.org/x/build/cmd/racebuild could build an Alpine race binary now.

For now I'm just going to be skipping race tests on Alpine so we don't regress on other things.

CL https://golang.org/cl/41678 mentions this issue.

What is the latest update on this issue?

@djui, there is no update. Nobody on the Go team is working on this, and nobody elsewhere in the community has posted anything here about it. It's all yours if you want to work on it.

Hi @bradfitz I think it's ok to close this issue. It's not fair to leave it open as an issue if it's clear by now this does not work by design without further work.

As you said, if anyone wants to pick it up, great!

For 👎'ers: Would you agree?

@djui why should the issue be closed if it's still an issue, even if it doesn't have anyone currently working on it? By that logic, most of the current issues should be closed, no?

This caused a moderate amount of pain at my job, as most CI build/test steps happen on Alpine-based images. We had to specifically pull in Debian-based images just for the go test -race steps. So I'd say we do want to keep the issue open, even if there aren't short-term plans to get it done.

@dlsniper My reasoning focuses on the category "Is it a bug or a feature?" rather than "If it is a feature, has it been implemented by now?" To me, the issue was opened in the spirit of a bug, imho, and it became clear during the conversation that no promises or constraints are violated, so it is more of a feature (getting down to: "Should any C stdlib be supported by Go?" or "Is Musl a supported 'arch'?"). I may be misclassifying this and indeed any of these should be supported, hence the question for feedback.

So maybe we could open a new ticket, link this ticket, with e.g. race: Make compiling and running race detector work with Musl. wdyt?

@mvdan Understandably, same here.

So maybe we could open a new ticket, link this ticket, with e.g. race: Make compiling and running race detector work with Musl. wdyt?

I can change the title of this issue but I don't think that will speed up the fix for it. Unfortunately, I don't have the time or the knowledge to work on this and it seems that nobody else from all the referenced issues has any plans to fix this either. However, it seems that people find this issue and it's easy for them to link to it, so may I suggest leaving things as they are until someone figures out how to fix this? Thank you.

@dlsniper Works for me, Florin.

@djui @dlsniper @bradfitz any update on this issue? We upgraded Go to 1.11 and since then have been facing this issue on all our build nodes. Any timeline would be appreciated.

What does Go 1.11 have to do with this? It never worked on any version.

It looks like when we did the Go upgrade we might also have changed the OS type (need to find that out internally).

Any idea when I should expect a fix for the issue?

In the last 14 days since the last time I wrote that I don't have the time or knowledge to fix this nothing has changed.

As I mentioned above, if someone really wants this to be fixed, I would advise to either read the above comments and patiently wait for a fix (asking about progress in the absence of any indication of progress will not create progress) or try and figure out how to fix it yourself.

Unfortunately, I don't have a better answer for you.

@jgheewala as an intermediate suggestion: if you can tolerate the trade-offs of the suggestion described below and your build/CI/CD pipeline supports Docker, you could run your tests in a Debian/CentOS/... image with the race detector enabled but ship/deploy in an Alpine image without it.

ok will do thanks a lot @djui

In case it helps anyone else, I rebased @neelance's compiler-rt changes (https://github.com/golang/go/issues/14481#issuecomment-281972886) on top of compiler-rt@fe2c72c5 (as used by Go 1.13) and have been able to run -race using:

FROM golang:1.13-alpine3.10 as golang

# Make go test -race work on alpine by building (patched) sanitizer manually
# as it is not built by default
# Ref: https://github.com/golang/go/issues/14481#issuecomment-281972886
# SHA: https://github.com/golang/go/blob/go1.13/src/runtime/race/README
COPY 0001-hack-to-make-Go-s-race-flag-work-on-Alpine.patch /race.patch
RUN cd / \
    && apk add --no-cache --virtual .build-deps g++ git \
    && mkdir -p compiler-rt \
    && git clone https://llvm.org/git/compiler-rt.git \
    && cd compiler-rt \
    && git reset --hard fe2c72c59aa7f4afa45e3f65a5d16a374b6cce26 \
    && patch -p1 -i /race.patch \
    && cd lib/tsan/go \
    && ./buildgo.sh 2>/dev/null \
    && cp -v race_linux_amd64.syso /usr/local/go/src/runtime/race/ \
    && rm -rf /compiler-rt /race.patch \
    && apk del .build-deps

0001-hack-to-make-Go-s-race-flag-work-on-Alpine.patch.gz

how to solve the problem?

# runtime/race
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::GetArgv()':
gotsan.cc:(.text+0x4183): undefined reference to `__libc_stack_end'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::ReExec()':
gotsan.cc:(.text+0x9797): undefined reference to `__libc_stack_end'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld:

If you use Docker, I have an example with a multi stage build here where the fix mentioned above is built and then overwrites the existing file in the final image.

If you use Docker, I have an example with a multi stage build here where the fix mentioned above is built and then overwrites the existing file in the final image.

It looks like as of Alpine 3.11 this patch/build no longer works :(

Alpine 3.11 already! I really hope the Go team figures out a fix (even temporary). I'll see if I can do anything.

If it's for a production Docker image, you can probably use a debian image to run tests in a multi stage build, or use a Debian host in a CI pipeline.

On my side, I run my development environment in an Alpine based Docker image so not having -race working is definitely annoying. And I like Alpine for its size and simplicity over Debian. I guess I'll likely stay on the 3.10 boat.

Actually maybe it got fixed in golang:1.13-alpine3.11? I'll check later.

@EricByers what didn't work? Do you have build/runtime output?

No obvious reason why my patch from https://github.com/golang/go/issues/14481#issuecomment-531703266 wouldn't still work on newer alpine

@EricByers what didn't work? Do you have build/runtime output?

No obvious reason why my patch from #14481 (comment) wouldn't still work on newer alpine

I'm not sure -- I'm using a variant of the one posted above, but tried the steps in your multi-stage build as well.

I know some of this was failing before, but it worked -- so not sure which is the new piece.

gcc gotsan.cc -c -o ./race_linux_amd64.syso -I../rtl -I../.. -I../../sanitizer_common -I../../../include -std=c++11 -m64 -Wall -fno-exceptions -fno-rtti -DSANITIZER_GO=1 -DSANITIZER_DEADLOCK_DETECTOR_VERSION=2 -fPIC -Wno-maybe-uninitialized -ffreestanding -Wno-unused-const-variable -Werror -Wno-unknown-warning-option -DSANITIZER_DEBUG=0 -O3 -fomit-frame-pointer -msse3
./gotsan.cc: In function '__sanitizer::uptr __sanitizer::internal_clone(int (*)(void*), void*, int, void*, int*, void*, int*)':
./gotsan.cc:13040:56: error: listing the stack pointer register 'rsp' in a clobber list is deprecated [-Werror=deprecated]
13040 |                        : "rsp", "memory", "r11", "rcx");
      |                                                        ^
./gotsan.cc:13040:56: note: the value of the stack pointer after an 'asm' statement must be the same as it was before the statement
At global scope:
cc1plus: error: unrecognized command line option '-Wno-unknown-warning-option' [-Werror]
cc1plus: all warnings being treated as errors

Edit: I got some wires crossed -- this isn't using the patch file you provided, but close to the style qdm provided.

On my side, I can't build cd lib/tsan/go && ./buildgo.sh on Alpine 3.11. I also get the error

gcc gotsan.cc -c -o ./race_linux_amd64.syso -I../rtl -I../.. -I../../sanitizer_common -I../../../include -std=c++11 -m64 -Wall -fno-exceptions -fno-rtti -DSANITIZER_GO=1 -DSANITIZER_DEADLOCK_DETECTOR_VERSION=2 -fPIC -Wno-maybe-uninitialized -ffreestanding -Wno-unused-const-variable -Werror -Wno-unknown-warning-option -DSANITIZER_DEBUG=0 -O3 -fomit-frame-pointer -msse3
./gotsan.cc: In function '__sanitizer::uptr __sanitizer::internal_clone(int (*)(void*), void*, int, void*, int*, void*, int*)':
./gotsan.cc:13040:56: error: listing the stack pointer register 'rsp' in a clobber list is deprecated [-Werror=deprecated]
13040 |                        : "rsp", "memory", "r11", "rcx");
           |                                                        ^
./gotsan.cc:13040:56: note: the value of the stack pointer after an 'asm' statement must be the same as it was before the statement
At global scope:
cc1plus: error: unrecognized command line option '-Wno-unknown-warning-option' [-Werror]
cc1plus: all warnings being treated as errors
ERROR: executor failed running [/bin/sh -c cd lib/tsan/go &&     ./buildgo.sh]: runc did not terminate sucessfully

I changed my code to build Go with the patch on Alpine 3.10 and copy the result to an Alpine 3.11 image, and it seems to work. @EricByers do you mind sharing what your goal is?

The patch described by dnwe is the same used in my Dockerfile, although his Dockerfile uses a golang image based on Alpine3.10, explaining why it works.

@qdm12 What source code are you using to build gotsan.cc? The problematic line you are citing seems to have been removed from the compiler-rt sources in 2018: https://github.com/llvm-mirror/compiler-rt/commit/2669a4feea0d67bf7e3041f3ddad7615859762b5

I use:

FROM golang:1.13-alpine3.10 AS race
WORKDIR /tmp/race
RUN apk --update -q --progress --no-cache add git g++
RUN git clone --single-branch https://llvm.org/git/compiler-rt.git . &> /dev/null
RUN git reset --hard fe2c72c59aa7f4afa45e3f65a5d16a374b6cce26 && \
    wget -q https://github.com/golang/go/files/3615484/0001-hack-to-make-Go-s-race-flag-work-on-Alpine.patch.gz -O patch.gz && \
    gunzip patch.gz && \
    patch -p1 -i patch
RUN cd lib/tsan/go && \
    ./buildgo.sh &> /dev/null

And copy race_linux_amd64.syso to my final image with:

COPY --from=race /tmp/race/lib/tsan/go/race_linux_amd64.syso /usr/local/go/src/runtime/race/race_linux_amd64.syso

I don't think another commit would work with the patch mentioned above. Or are you thinking about something else? Sorry for being a bit ignorant here.

EDIT: The error I get with the base vanilla image golang:1.13-alpine3.11 (and alpine3.10 as well) without the patch:

 ~ go test -race ./...
# runtime/race
/usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::GetArgv()':
gotsan.cc:(.text+0x4183): undefined reference to `__libc_stack_end'
/usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::ReExec()':
gotsan.cc:(.text+0x9797): undefined reference to `__libc_stack_end'
/usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::InternalAlloc(unsigned long, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*, unsigned long)':
gotsan.cc:(.text+0xaac1): undefined reference to `__libc_malloc'
/usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::InternalRealloc(void*, unsigned long, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*)':
gotsan.cc:(.text+0xca20): undefined reference to `__libc_realloc'
/usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: race_linux_amd64.syso: in function `__sanitizer::InternalFree(void*, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*)':
gotsan.cc:(.text+0x66e8): undefined reference to `__libc_free'
collect2: error: ld returned 1 exit status

@qdm12 OK, so whoever wrote that patch needs to update it. And maybe try to submit it to the compiler-rt project so that people don't have to patch it themselves.

This issue is filed against the Go project but I don't understand what the Go team can do to fix it. It seems to involve code that we do not control.

Ok I'll see what I can do and report back.

Although the Go team could test that the race detector works in the Docker image building CI with all the base images targeted. Maybe I'll add a pull request for it too.

@ianlancetaylor I _thought_ the Go-specific parts of llvm/compiler-rt were maintained by @dvyukov who I _thought_ was a member of the Go team.

Whilst one of us could pop a pull request in for this change it feels like someone from the Go team would be best placed to verify that what the patch changes is valid. Even once that's done it would still need the build and packaging changes that @bradfitz mentioned before it would be available as part of the Go release downloads

@dvyukov has made an enormous number of valuable contributions to Go, but he is not a member of the Go team. And although I haven't spoken to him recently my understanding is that these days he is focused largely on syzkaller (https://github.com/google/syzkaller).

That said, sure, we can take a look at a change to the compiler-rt project. But we can't approve it. We certainly can and will do the packaging changes on the Go side when the compiler-rt project is updated.

I'm not trying to duck out of the appropriate work here. I'm saying that you shouldn't sit back and wait for the Go team to fix this. That is fairly unlikely to happen.

Did a bit of digging, it seems the master branch of compiler-rt contains the modifications described by the patch mentioned above.

However, trying cd lib/tsan/go && ./buildgo.sh will not compile on Alpine, probably because of musl instead of glibc. I tried compiling it with debian, which works, but the resulting race_linux_amd64.syso is targeted at glibc, hence not working with Alpine.

It's unclear to me what changed between commit fe2c72c59aa7f4afa45e3f65a5d16a374b6cce26 and the latest commit such that it cannot be compiled on Alpine/musl anymore.

What is the output of running buildgo.sh on Alpine?

Trying with

FROM golang:1.13-alpine3.10 AS race
RUN apk add git g++ linux-headers
RUN git clone --single-branch --depth 1 https://llvm.org/git/compiler-rt.git
RUN cd compiler-rt/lib/tsan/go && ./buildgo.sh

I get the error:

gcc gotsan.cpp -c -o ./race_linux_amd64.syso -I../rtl -I../.. -I../../sanitizer_common -I../../../include -std=c++11 -Wall -fno-exceptions -fno-rtti -DSANITIZER_GO=1 -DSANITIZER_DEADLOCK_DETECTOR_VERSION=2 -fPIC -Wno-maybe-uninitialized -ffreestanding -Wno-unused-const-variable -Werror -Wno-unknown-warning-option -m64 -DSANITIZER_DEBUG=0 -O3 -fomit-frame-pointer -msse3
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: ./race_linux_amd64.syso: in function `__sanitizer::GetArgv()':
gotsan.cpp:(.text+0x42f3): undefined reference to `__libc_stack_end'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: ./race_linux_amd64.syso: in function `__sanitizer::GetEnviron()':
gotsan.cpp:(.text+0x4303): undefined reference to `__libc_stack_end'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: ./race_linux_amd64.syso: in function `__sanitizer::InternalAlloc(unsigned long, __sanitizer::SizeClassAllocator32LocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*, unsigned long)':
gotsan.cpp:(.text+0x555f): undefined reference to `__libc_malloc'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: ./race_linux_amd64.syso: in function `__sanitizer::InternalRealloc(void*, unsigned long, __sanitizer::SizeClassAllocator32LocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*)':
gotsan.cpp:(.text+0x6d71): undefined reference to `__libc_realloc'
/usr/lib/gcc/x86_64-alpine-linux-musl/8.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: ./race_linux_amd64.syso: in function `__sanitizer::InternalFree(void*, __sanitizer::SizeClassAllocator32LocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*)':
gotsan.cpp:(.text+0x6df8): undefined reference to `__libc_free'
collect2: error: ld returned 1 exit status

Which seems to be a musl related issue.

For the reference to __libc_malloc, __libc_realloc, and __libc_free, I think the #ifdef chain near the top of lib/sanitizer_common/sanitizer_allocator.cpp needs to do something else on musl. But I don't know what consequences that might have for the hooks that the thread sanitizer wants to use for malloc, realloc, and free (lib/tsan/rtl/tsan_interceptors_posix.cpp).

The reference to __libc_stack_end may be more complicated. The sanitizer code seems to use the value of that variable to locate the arguments and environment passed to the program. See lib/sanitizer_common/sanitizer_linux.cpp. I think that code is using __libc_stack_end to find the arguments to main. The code is actually designed to work even if __libc_stack_end is not defined; it uses a weak reference, and checks whether that weak reference is satisfied. Unfortunately, that check is within #ifdef !SANITIZER_GO. I'm not sure why; it may have to do with having the race detector support internal linking. But even without that, the reference should be weak, so I'm not sure why the C linker is complaining. Perhaps SANITIZER_WEAK_ATTRIBUTE is somehow not defined on musl. And in fact I see now that it is not defined if SANITIZER_GO (lib/sanitizer_common/sanitizer_internal_defs.h). It's possible that the Go internal linker is better at weak definitions now than it was when the code was first written, so it may be worth trying just removing those #ifdefs.

With a small tweak to buildgo.sh to remove the (no longer valid) -Wno-unknown-warning-option from the compile flags and to add a -Wno-error=deprecated to permit deprecated warnings not to fail the build, this Dockerfile is sufficient to build a Go 1.13 on alpine 3.11 docker image (which the recent question was about):

FROM golang:1.13-alpine3.11 as golang

# Make go test -race work on alpine by building (patched) sanitizer manually
# as it is not built by default
# Ref: https://github.com/golang/go/issues/14481#issuecomment-281972886
# SHA: https://github.com/golang/go/blob/go1.13/src/runtime/race/README
ADD https://github.com/golang/go/files/3615484/0001-hack-to-make-Go-s-race-flag-work-on-Alpine.patch.gz /race.patch
RUN cd / \
    && apk add --no-cache --virtual .build-deps g++ git \
    && mkdir -p compiler-rt \
    && git clone https://git.llvm.org/git/compiler-rt.git \
    && cd compiler-rt \
    && git reset --hard fe2c72c59aa7f4afa45e3f65a5d16a374b6cce26 \
    && gzip -dc /race.patch | patch -p1 \
    && cd lib/tsan/go \
    && sed -e 's,-Wno-unknown-warning-option,-Wno-error=deprecated,' -i ./buildgo.sh \
    && ./buildgo.sh \
    && cp -v race_linux_amd64.syso /usr/local/go/src/runtime/race/ \
    && rm -rf /compiler-rt /race.patch \
    && apk del .build-deps

Did a bit of digging, it seems the master branch of compiler-rt contains the modifications described by the patch mentioned above.

Are you sure? I cloned latest compiler-rt and whilst the patch needed a couple of small changes before it would apply cleanly on master, it still seemed that it would be needed (hence your compile failures for glibc related issues)

Here's a version of the patch rebased against current upstream master (llvm-mirror/compiler-rt@69445f095) and with one additional !SANITIZER_GO added in sanitizer_common/sanitizer_linux_libcdep.cpp to preprocess out some new code there that relied on GetEnviron

0001-upstream-master-69445f095-hack-to-make-Go-s-race-flag-work-on-Alpine.patch.gz

Haha thanks a lot! I was actually adapting the patch right now. But you probably saved me minutes! 👍 🎈

EDIT: A Dockerfile example thanks to dnwe's patch and Dockerfile, which works 🎉

ARG ALPINE_VERSION=3.11
ARG GO_VERSION=1.13

FROM golang:${GO_VERSION}-alpine${ALPINE_VERSION} AS race
WORKDIR /tmp/race
RUN apk --update -q --progress --no-cache add git g++
RUN git clone --single-branch https://github.com/llvm-mirror/compiler-rt . && \
    git reset --hard 69445f095c22aac2388f939bedebf224a6efcdaf
RUN wget -q https://github.com/golang/go/files/4114545/0001-upstream-master-69445f095-hack-to-make-Go-s-race-flag-work-on-Alpine.patch.gz -O patch.gz && \
   gunzip patch.gz && \
   patch -p1 -i patch
WORKDIR /tmp/race/lib/tsan/go
RUN sed -e 's,-Wno-unknown-warning-option,-Wno-error=deprecated,' -i buildgo.sh
RUN ./buildgo.sh

FROM golang:${GO_VERSION}-alpine${ALPINE_VERSION}
COPY --from=race /tmp/race/lib/tsan/go/race_linux_amd64.syso /usr/local/go/src/runtime/race/race_linux_amd64.syso
# ... more instructions
RUN go test -race ./...

@qdm12 yeah so just to clarify for everyone watching, the Dockerfile I pasted above is compiling race from fe2c72c5 which is what the Go 1.13 release is using (as per runtime/race/README). The Dockerfile you've put there is patching race on current master (69445f09) which is good for exercising the patch that could _potentially_ be sent upstream, but isn't necessarily what I'd recommend people run their CI with.

The fix is submitted to llvm.
We now have 2 reasons to update race runtime: this and the chan synchronization change. @randall77

The last change to http://llvm.org/git/compiler-rt.git is from Oct 22, 2019. So I don't think it contains a recently submitted fix.
Is there somewhere else I should be pulling from? A branch other than master, perhaps?

LLVM has moved to GitHub: https://github.com/llvm/llvm-project

Current status:

linux/amd64 - working
darwin/amd64 - working
windows/amd64 - working
freebsd/amd64 - Build script fails with "race_freebsd_syso seems to link to libc". Well, yes. Why is that a problem?
netbsd/amd64 - "no space left on device". It can't git clone the llvm-project repository without running out of memory. It's ~2GB. I'll see if we can get a larger disk on the gomotes.
linux/ppc64le - killed during fetch of llvm-project. Also disk space?
linux/arm64 - download & build fine. Tests during race.bash crash randomly.

linux/arm64 - fixed some tests. Now dying because it runs out of OS threads when building various cmd/ binaries.

freebsd/amd64 - Build script fails with "race_freebsd_syso seems to link to libc". Well, yes. Why is that a problem?

This should help:
https://github.com/llvm/llvm-project/commit/0fb8a5356214c47bbb832e89fbb3da1c755eeb73

linux/arm64 - fixed some tests. Now dying because it runs out of OS threads when building various cmd/ binaries.

Perhaps the simplest fix is to export GOMAXPROCS= (or is there some other knob to reduce parallelism?).

linux/arm64: GOMAXPROCS=1 and GOGC=10 almost fix it. Not quite. It still crashes trying to compile the tests for cmd/compile/internal/ssa.

I think we need a beefier (more RAM) builder.

The .syso might be ok though. Maybe I can extract that early.

Good idea. We could save it earlier. It will be tested during the change presubmit again anyway.

freebsd/amd64 now working with the libc patch.
linux/arm64 also working, hacked around by extracting the .syso early

That just leaves netbsd/amd64 and linux/ppc64le with one definite and one probable disk space issue. I'm pursuing that, may take a while.
I'll prepare the 5 working platforms in a CL.

Change https://golang.org/cl/226981 mentions this issue: runtime/race: update some .syso files

I was able to download llvm-project on a ppc64le builder manually (after many attempts).
The project won't build:

gcc gotsan.cpp -c -o ./race_linux_ppc64le.syso -I../rtl -I../.. -I../../sanitizer_common -I../../../include -std=c++14 -Wall -fno-exceptions -fno-rtti -DSANITIZER_GO=1 -DSANITIZER_DEADLOCK_DETECTOR_VERSION=2 -fPIC -Wno-maybe-uninitialized -m64 -DSANITIZER_DEBUG=0 -O3 -fomit-frame-pointer -mcpu=power8 -fno-function-sections
/usr/bin/ld: ./race_linux_ppc64le.syso: in function `__sanitizer::CheckASLR()':
gotsan.cpp:(.text+0x1cb1c): undefined reference to `__sanitizer::ReExec()'

The call to ReExec in question is guarded by

#elif SANITIZER_PPC64V2
  // Disable ASLR for Linux PPC64LE.

Did this ever work? The change that added this code (78f7a6eaa601) was submitted in 2018.

The ReExec in compiler-rt/lib/sanitizer_common/sanitizer_linux_libcdep.cpp has a !SANITIZER_GO guard around it. If I remove that guard, ReExec is now found, but then it has other dependencies (GetArgv, GetEnviron) which are then undefined.

Any idea how to make progress here?

Humm... how did we get the previous syso files for ppc?...

Generally the way to get rid of these is to wrap CheckASLR in #if !SANITIZER_GO. I don't think we need it for Go, but by default we inherit everything from the native world...
If there were a way to strip from an object file everything that's not reachable from a few interface functions, that would be nice and would probably make the syso files even smaller. But I don't know how to do this easily.

Every once in a while I think about writing a new tsan runtime fork in Go, for Go. There is also a next-gen algorithm that I think could reduce memory consumption a few times over, make it faster, remove the restriction on the number of goroutines, and all the nice things. But I can never fully convince myself that it's the right thing to do... or maybe just find the time.

Humm... how did we get the previous syso files for ppc?...

Something changed since last fall. It wasn't that code itself, maybe someone turned ASLR on for ppc64le in some config somewhere?

Every once in a while I think about writing a new tsan runtime fork in Go, for Go. There is also a next-gen algorithm that I think could reduce memory consumption a few times over, make it faster, remove the restriction on the number of goroutines, and all the nice things. But I can never fully convince myself that it's the right thing to do... or maybe just find the time.

It would be nice. Pulling new tsan always seems to be a headache. But rewriting tsan in Go is probably a lot of work, many times the aforementioned headache. Probably not worth it.
Unless you were locked in a room with nothing else to do for a month or so...

Something changed since last fall. It wasn't that code itself, maybe someone turned ASLR on for ppc64le in some config somewhere?

Humm... yes, it seems this bit of code may actually be needed:

#elif SANITIZER_PPC64V2
  // Disable ASLR for Linux PPC64LE.
  int old_personality = personality(0xffffffff);
  if (old_personality != -1 && (old_personality & ADDR_NO_RANDOMIZE) == 0) {
    VReport(1, "WARNING: Program is being run with address space layout "
               "randomization (ASLR) enabled which prevents the thread and "
               "memory sanitizers from working on powerpc64le.\n"
               "ASLR will be disabled and the program re-executed.\n");
    CHECK_NE(personality(old_personality | ADDR_NO_RANDOMIZE), -1);
    ReExec();
  }

Pulling in ReExec may transitively pull in lots of other things (as you noted already).

Maybe, when SANITIZER_GO is set, we could do what the FreeBSD branch above does, to at least warn users:

  if (UNLIKELY(paxflags & CTL_PROC_PAXFLAGS_ASLR)) {
    Printf("This sanitizer is not compatible with enabled ASLR\n");
    Die();
  }

But I don't know if users can actually turn ASLR off if needed.

Potentially ASLR does not affect static Go binaries at all (?). But maybe it's a problem with cgo.

Hard to say what's best without testing...
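
On whether users can turn ASLR off: on Linux this can usually be done per process, without touching the system-wide setting. A minimal sketch, assuming a setarch (util-linux or compatible) that supports -R (--addr-no-randomize) and accepts the output of uname -m as its architecture argument. The function names are hypothetical, and whether any of this is actually needed for Go race binaries on ppc64le is not tested here.

```shell
#!/bin/sh
# describe_aslr maps a value of /proc/sys/kernel/randomize_va_space
# to a short description.
describe_aslr() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "partial" ;;   # stack, mmap base and vDSO are randomized
        2) echo "full" ;;      # additionally randomizes the heap
        *) echo "unknown" ;;
    esac
}

# current_aslr prints the system-wide setting, or "unknown" if the
# proc file is not readable (for example on a non-Linux system).
current_aslr() {
    if [ -r /proc/sys/kernel/randomize_va_space ]; then
        describe_aslr "$(cat /proc/sys/kernel/randomize_va_space)"
    else
        describe_aslr ""
    fi
}

# run_without_aslr runs a command with ADDR_NO_RANDOMIZE set in its
# personality, which is the same flag the tsan code above sets before ReExec.
run_without_aslr() {
    setarch "$(uname -m)" -R "$@"
}

# Example (not executed here):
#   current_aslr
#   run_without_aslr go test -race ./...
```

If something like this works, a Die()-style warning would at least leave users with a workaround instead of a dead end.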

It would be nice. Pulling new tsan always seems to be a headache. But rewriting tsan in Go is probably a lot of work, many times the aforementioned headache. Probably not worth it.
Unless you were locked in a room with nothing else to do for a month or so...

Yes, that's what I thought as well.
I am actually locked in a room at the moment, but not with nothing else to do :)

What is the next-gen algorithm?

It's not described anywhere.

Change https://golang.org/cl/227867 mentions this issue: runtime/race: rebuild netbsd .syso

Hello all, a big thank you first of all for the work fixing this issue.

Do you have an approximate time when the Go Docker image will contain the fix? I just tried go test -race using the latest golang:1.14-alpine3.11 and it still fails. Thanks!

Or, if it will take quite some time, is there a way for us to manually build and bundle something in the docker image to have it fixed using your latest changes?

@qdm12 this was merged in master, so it will be part of the upcoming Go 1.15 release due in July, unless it gets reverted.

Only important bug fixes are generally backported to stable Go releases. See https://github.com/golang/go/wiki/MinorReleases.

In particular, while this change is awesome and I'd also like to use it today, the change is probably too invasive and risky to backport to a release like 1.14.3. The first 1.15 beta should be out in just six weeks, so you could try the docker images that come out with that release, too.

@qdm12 We are using

#!/bin/sh
set -eu
set -x

cd /

apk --no-cache add build-base git

# Clone the repo and reset to the right commit
git clone https://git.llvm.org/git/compiler-rt.git /x
cd /x

#https://github.com/golang/go/blob/release-branch.go1.14/src/runtime/race/README
git checkout 810ae8ddac890a6613d814c0b5415c7fcb7f5cca

# Apply the patch and do the build
patch -p1 <<'EOF'
commit 51e2fa48690be675363fbd2b62d775715749f58d
Author: Tomas Volf <[email protected]>
Date:   Thu Mar 5 23:43:38 2020 +0100

    Ported musl compatibility patch

diff --git a/lib/sanitizer_common/sanitizer_allocator.cpp b/lib/sanitizer_common/sanitizer_allocator.cpp
index 8d07906cc..e4681f77e 100644
--- a/lib/sanitizer_common/sanitizer_allocator.cpp
+++ b/lib/sanitizer_common/sanitizer_allocator.cpp
@@ -25,7 +25,7 @@ const char *PrimaryAllocatorName = "SizeClassAllocator";
 const char *SecondaryAllocatorName = "LargeMmapAllocator";

 // ThreadSanitizer for Go uses libc malloc/free.
-#if SANITIZER_GO || defined(SANITIZER_USE_MALLOC)
+#if defined(SANITIZER_USE_MALLOC)
 # if SANITIZER_LINUX && !SANITIZER_ANDROID
 extern "C" void *__libc_malloc(uptr size);
 #  if !SANITIZER_GO
diff --git a/lib/sanitizer_common/sanitizer_common.cpp b/lib/sanitizer_common/sanitizer_common.cpp
index f5f9f49d8..87efda5bd 100644
--- a/lib/sanitizer_common/sanitizer_common.cpp
+++ b/lib/sanitizer_common/sanitizer_common.cpp
@@ -274,6 +274,7 @@ uptr ReadBinaryNameCached(/*out*/char *buf, uptr buf_len) {
   return name_len;
 }

+#if !SANITIZER_GO
 void PrintCmdline() {
   char **argv = GetArgv();
   if (!argv) return;
@@ -282,6 +283,7 @@ void PrintCmdline() {
     Printf("%s ", argv[i]);
   Printf("\n\n");
 }
+#endif

 // Malloc hooks.
 static const int kMaxMallocFreeHooks = 5;
diff --git a/lib/sanitizer_common/sanitizer_linux.cpp b/lib/sanitizer_common/sanitizer_linux.cpp
index 0b53da6c3..ac3152eb8 100644
--- a/lib/sanitizer_common/sanitizer_linux.cpp
+++ b/lib/sanitizer_common/sanitizer_linux.cpp
@@ -26,7 +26,7 @@
 #include "sanitizer_placement_new.h"
 #include "sanitizer_procmaps.h"

-#if SANITIZER_LINUX
+#if SANITIZER_LINUX && !SANITIZER_GO
 #include <asm/param.h>
 #endif

@@ -549,7 +549,7 @@ const char *GetEnv(const char *name) {
 #endif
 }

-#if !SANITIZER_FREEBSD && !SANITIZER_NETBSD && !SANITIZER_OPENBSD
+#if !SANITIZER_FREEBSD && !SANITIZER_NETBSD && !SANITIZER_OPENBSD && !SANITIZER_GO
 extern "C" {
 SANITIZER_WEAK_ATTRIBUTE extern void *__libc_stack_end;
 }
@@ -581,7 +581,7 @@ static void ReadNullSepFileToArray(const char *path, char ***arr,
 }
 #endif

-#if !SANITIZER_OPENBSD
+#if !SANITIZER_OPENBSD && !SANITIZER_GO
 static void GetArgsAndEnv(char ***argv, char ***envp) {
 #if SANITIZER_FREEBSD
   // On FreeBSD, retrieving the argument and environment arrays is done via the
@@ -1060,7 +1060,7 @@ uptr GetMaxUserVirtualAddress() {

 #if !SANITIZER_ANDROID
 uptr GetPageSize() {
-#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__i386__))
+#if SANITIZER_LINUX && defined(EXEC_PAGESIZE)
   return EXEC_PAGESIZE;
 #elif SANITIZER_FREEBSD || SANITIZER_NETBSD
 // Use sysctl as sysconf can trigger interceptors internally.
diff --git a/lib/sanitizer_common/sanitizer_linux_libcdep.cpp b/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
index cd5037182..4e5c04818 100644
--- a/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
+++ b/lib/sanitizer_common/sanitizer_linux_libcdep.cpp
@@ -808,7 +808,7 @@ u64 MonotonicNanoTime() {
 }
 #endif  // SANITIZER_LINUX && !SANITIZER_GO

-#if !SANITIZER_OPENBSD
+#if !SANITIZER_OPENBSD && !SANITIZER_GO
 void ReExec() {
   const char *pathname = "/proc/self/exe";

diff --git a/lib/tsan/rtl/tsan_platform_linux.cpp b/lib/tsan/rtl/tsan_platform_linux.cpp
index 33fa586ca..4c871ec58 100644
--- a/lib/tsan/rtl/tsan_platform_linux.cpp
+++ b/lib/tsan/rtl/tsan_platform_linux.cpp
@@ -62,7 +62,7 @@
 # undef sa_sigaction
 #endif

-#if SANITIZER_FREEBSD
+#if SANITIZER_FREEBSD && !SANITIZER_GO
 extern "C" void *__libc_stack_end;
 void *__libc_stack_end = 0;
 #endif
EOF

cd lib/tsan/go
./buildgo.sh

install -Dt /out/usr/local/go/src/runtime/race/ race_linux_amd64.syso

and then, in the final image, copy it over the original one (compiler-rt is the name of the build stage that runs the script above):

COPY --from=compiler-rt       /out /

Or you can just wait for the 1.15 :)
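
For anyone rebuilding the .syso this way, a quick sanity check is to confirm that the result no longer references the glibc-only symbols from the original link error (__libc_malloc, __libc_free, __libc_stack_end). The sketch below assumes nm from binutils is available (build-base should provide it); glibc_only_refs and check_syso are names made up for this example.

```shell
#!/bin/sh
# glibc_only_refs reads `nm -u` output on stdin and prints any of the three
# symbol names that appeared in the original "undefined reference" errors.
glibc_only_refs() {
    awk '{print $NF}' | grep -E '^__libc_(malloc|free|stack_end)$' || true
}

# check_syso FILE returns non-zero if FILE still has undefined references
# to those symbols.
check_syso() {
    refs="$(nm -u "$1" | glibc_only_refs)"
    if [ -n "$refs" ]; then
        printf 'still references glibc-only symbols:\n%s\n' "$refs" >&2
        return 1
    fi
}

# Example (not executed here), run before the install step:
#   check_syso race_linux_amd64.syso
```

This only looks at the three symbols seen in this issue; a clean result does not prove the file links against musl, it just catches the known failure early.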

Okay thanks for letting us know :+1: I noticed there is a golang:latest, is this based on the master branch? If so, is there one based on the master branch and alpine perhaps? Anyway, stability/production wise, I'd stick with the release in July :wink:

Change https://golang.org/cl/231222 mentions this issue: [release-branch.go1.14] runtime/race: update some .syso files

Change https://golang.org/cl/232417 mentions this issue: [release-branch.go1.14] runtime/race: rebuild netbsd .syso

Is this confirmed working in golang 1.15? I don't see anything in the release notes, and my google search didn't turn up anything helpful. Thanks!

Yes, we are using it successfully with the (rolling) 1.15-alpine tag.

I can confirm it works on golang:1.15-alpine, although not on golang:1.15-alpine3.12 for example.
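
To check a particular image tag yourself, one option is to run a program with a deliberate data race under -race. The sketch below only writes such a program to a directory; the helper name write_race_demo and the module name racedemo are made up for this example. The docker command in the trailing comment is an assumption about a typical setup (it installs build-base because the race detector needs cgo and the Alpine images ship without a C toolchain, as in the original report).

```shell
#!/bin/sh
# write_race_demo DIR writes a tiny Go module containing a deliberate data race.
write_race_demo() {
    mkdir -p "$1"
    cat > "$1/go.mod" <<'EOF'
module racedemo

go 1.15
EOF
    cat > "$1/main.go" <<'EOF'
package main

import "fmt"

func main() {
	counter := 0
	done := make(chan struct{})
	go func() {
		counter++ // unsynchronized write in the goroutine
		close(done)
	}()
	counter++ // unsynchronized write in main: races with the goroutine
	<-done
	fmt.Println(counter)
}
EOF
}

# Example (not executed here):
#   write_race_demo ./racedemo
#   docker run --rm -v "$PWD/racedemo:/src" -w /src golang:1.15-alpine \
#       sh -c 'apk add --no-cache build-base && go run -race .'
```

If the race runtime links and works in that image, the run should print a WARNING: DATA RACE report; if the image is still affected, the build should fail with the undefined reference errors from the top of this issue.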

Good job!
