misc/cgo: 1.7rc5 tests segfault on Clear Linux #16618

Closed
fenrus75 opened this issue Aug 5, 2016 · 28 comments

@fenrus75

fenrus75 commented Aug 5, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?

1.7rc5

  2. What operating system and processor architecture are you using (go env)?

root@arjan-box /var/lib/mock/clear-go/root/builddir # go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/usr/lib/golang"
GOTOOLDIR="/usr/lib/golang/pkg/tool/linux_amd64"
GO15VENDOREXPERIMENT="1"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"

Linux OS is Clear Linux (http://www.clearlinux.org)

  3. What did you do?

I'm the distro packager, trying to package 1.7rc5 (in prep for 1.7 release)

  4. What did you expect to see?
  5. What did you see instead?
##### misc/cgo/testcarchive
--- FAIL: TestInstall (0.41s)
        carchive_test.go:130: [gcc -fPIC -m64 -pthread -fmessage-length=0
-fdebug-prefix-map=/tmp/go-build375682752=/tmp/go-build
-gno-record-gcc-switches -I pkg/linux_amd64 -o testp main.c main_unix.c
pkg/linux_amd64/libgo.a]

        carchive_test.go:157:
        carchive_test.go:158: signal: segmentation fault (core dumped)
--- FAIL: TestEarlySignalHandler (0.32s)
        carchive_test.go:232:
        carchive_test.go:233: signal: segmentation fault (core dumped)
--- FAIL: TestSignalForwardingExternal (0.31s)
        carchive_test.go:350: Did not receive OK signal
--- FAIL: TestOsSignal (0.29s)
        carchive_test.go:406:
        carchive_test.go:407: signal: segmentation fault (core dumped)
--- FAIL: TestSigaltstack (0.34s)
        carchive_test.go:438:
        carchive_test.go:439: signal: segmentation fault (core dumped)
FAIL

Any suggestions on diagnosing this are welcome. Core dumps don't seem to actually be produced, and finding any of the artifacts once the all.bash script has finished is proving hard.

@bradfitz
Contributor

bradfitz commented Aug 5, 2016

Which version of gcc?

/cc @ianlancetaylor

@bradfitz bradfitz changed the title from "1.7rc5 segfaults in test suite (linux)" to "misc/cgo: 1.7rc5 tests segfault on Clear Linux" on Aug 5, 2016
@fenrus75
Author

fenrus75 commented Aug 5, 2016

gcc 6.1

@fenrus75
Author

fenrus75 commented Aug 5, 2016

Steps to reproduce

# run the build env in a docker container
docker run -it clearlinux

# install dev environment
swupd bundle-add os-core-dev go-basic

(one may need a few pathname fixes from https://github.com/clearlinux-pkgs/go due to /etc being only for the sysadmin and not for the OS to fill in)

@ianlancetaylor
Contributor

The errors you are reporting are in misc/cgo/testcarchive. You can run just those tests by going to that directory and running go test carchive_test.go.

Do you see any other test failures?

@fenrus75
Author

fenrus75 commented Aug 5, 2016

go test will run the OS go, not the newly built one (and since the build fails, the new one won't get installed)

@ianlancetaylor
Contributor

I have no idea how this works in Docker.

At the point of that test failure, Go has already been installed and can be used independently. Yes, you would have to use the newly installed Go, not the system go.

Do you see any other test failures?

@fenrus75
Author

fenrus75 commented Aug 5, 2016

I'll poke some more to get better isolation on this

@fenrus75
Author

fenrus75 commented Aug 5, 2016

managed to get the test suite to not delete the test binary...

#0 runtime.raise () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:110
110 RET
Loading Go Runtime support.
(gdb) bt
#0 runtime.raise () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:110
#1 0x0000000000437dd0 in runtime.dieFromSignal (sig=11) at /usr/lib/golang/src/runtime/signal1_unix.go:195
#2 0x0000000000438439 in runtime.sigfwdgo (sig=11, info=0x7ffc6fe29070, ctx=0x7ffc6fe28f40, ~r3=216) at /usr/lib/golang/src/runtime/signal2_unix.go:32
#3 0x000000000043962f in runtime.sigtrampgo (sig=11, info=0x7ffc6fe29070, ctx=0x7ffc6fe28f40) at /usr/lib/golang/src/runtime/signal_sigtramp.go:16
#4 0x0000000000454275 in runtime.sigtramp () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:234
#5 0x0000000000454340 in ?? () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:311
#6 0x0000000000000007 in ?? ()
#7 0x0000000000000000 in ?? ()

@ianlancetaylor
Contributor

Thanks. That stack trace implies that we received a signal when the runtime package variable signalsOK was false, which implies that it occurred before the Go runtime was initialized. Can you find out what caused the SIGSEGV?

@randall77
Contributor

This could be #16590. Try patching in the CL from that issue and see if it helps.
@crawshaw

@fenrus75
Author

fenrus75 commented Aug 5, 2016

I'll poke at the patch in that issue.

Note that running the test app in gdb directly (rather than pointing it at a core dump) leads to a crash in an SSE/AVX instruction in pthreads, where it appears the stack is only 8-byte aligned rather than 16-byte aligned, while the pthread code assumes it gets called with a properly aligned stack...

Who owns aligning the initial stack at startup? Is that a low-level Go thing, or does it happen even before any Go code runs?

@fenrus75
Author

fenrus75 commented Aug 5, 2016

(gdb) bt
#0 0x00007ffff7fa5f56 in pthread_create () from /usr/lib64/libpthread.so.0
#1 0x0000000000401896 in x_cgo_sys_thread_create (func=, arg=) at /builddir/build/BUILD/go/src/runtime/cgo/gcc_libinit.c:24
#2 0x0000000000453eda in _rt0_amd64_linux_lib () at /usr/lib/golang/src/runtime/rt0_linux_amd64.s:36
#3 0x000000000045811b in __libc_csu_init ()
#4 0x00007ffff7df72b0 in __libc_start_main () from /usr/lib64/libc.so.6
#5 0x000000000040100a in _start () at ../sysdeps/x86_64/start.S:120

@minux
Member

minux commented Aug 5, 2016 via email

@minux
Member

minux commented Aug 5, 2016 via email

@fenrus75
Author

fenrus75 commented Aug 5, 2016

(gdb) p/x $rsp
$1 = 0x7fffffffe318

@fenrus75
Author

fenrus75 commented Aug 5, 2016

Dump of assembler code for function _rt0_amd64_linux_lib:
=> 0x0000000000453e80 <+0>: sub $0x50,%rsp
0x0000000000453e84 <+4>: mov %rbp,0x48(%rsp)
0x0000000000453e89 <+9>: lea 0x48(%rsp),%rbp
0x0000000000453e8e <+14>: mov %rbx,0x10(%rsp)
0x0000000000453e93 <+19>: mov %rbp,0x18(%rsp)

@minux
Member

minux commented Aug 5, 2016 via email

@ianlancetaylor
Contributor

No, I think it is our fault. It is normal for the stack to be aligned before the call instruction, and thus at an 8-byte offset at the start of the function.
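
The alignment rule described here can be checked against the `$rsp` value reported earlier in the thread. What follows is only an illustrative sketch: the helper names `rspAfterCall` and `misalignment` are made up for this example, and the "before call" value is an assumption, namely that the caller's stack pointer was 16-byte aligned right before the `call`, as the x86-64 calling convention expects.

```go
package main

import "fmt"

// rspAfterCall models what an x86-64 `call` does to the stack pointer:
// it pushes an 8-byte return address, so rsp drops by 8.
// (Hypothetical helper, not part of the Go runtime.)
func rspAfterCall(rspBeforeCall uint64) uint64 {
	return rspBeforeCall - 8
}

// misalignment returns how many bytes rsp sits past a 16-byte boundary.
func misalignment(rsp uint64) uint64 {
	return rsp % 16
}

func main() {
	// Assumed caller value: 16-byte aligned immediately before the call.
	before := uint64(0x7fffffffe320)
	entry := rspAfterCall(before)

	fmt.Printf("before call: %#x (rsp %% 16 = %d)\n", before, misalignment(before))
	fmt.Printf("at entry:    %#x (rsp %% 16 = %d)\n", entry, misalignment(entry))
}
```

Under that assumption the entry value comes out as 0x7fffffffe318, the same value gdb printed for `$rsp`, so an 8-byte offset at function entry is the expected state rather than a sign of a broken caller.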

@ianlancetaylor
Contributor

Perhaps broken by the change to enable frame pointers by default.

@minux
Member

minux commented Aug 5, 2016 via email

@ianlancetaylor
Contributor

> Ahh, then the solution is simply increase the frame size of _rt0_amd64_linux_lib by 8?

That would be my guess, yes. Why nobody else is seeing this I do not know.
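
To see why 8 more bytes would help, the prologue arithmetic can be replayed on the `$rsp` value from the gdb session. This is a rough sketch, not the actual patch: `rspAfterPrologue` and `aligned16` are hypothetical helpers, and the 0x48 case rests on an assumption that the observed `sub $0x50,%rsp` is a 0x48-byte frame plus 8 bytes for the saved frame pointer, which the `mov %rbp,0x48(%rsp)` in the disassembly suggests but the thread does not state outright.

```go
package main

import "fmt"

// rspAfterPrologue returns the stack pointer after `sub $frame,%rsp`.
// (Hypothetical helper for illustration only.)
func rspAfterPrologue(rspAtEntry, frame uint64) uint64 {
	return rspAtEntry - frame
}

// aligned16 reports whether rsp sits on a 16-byte boundary.
func aligned16(rsp uint64) bool {
	return rsp%16 == 0
}

func main() {
	// $rsp at entry to _rt0_amd64_linux_lib, as printed by gdb in this thread.
	const entry = 0x7fffffffe318

	cases := []struct {
		label string
		frame uint64
	}{
		{"0x48 (assumed frame without saved frame pointer)", 0x48},
		{"0x50 (observed in the disassembly)", 0x50},
		{"0x58 (observed frame plus 8)", 0x58},
	}

	for _, c := range cases {
		rsp := rspAfterPrologue(entry, c.frame)
		fmt.Printf("%-50s rsp=%#x aligned16=%v\n", c.label, rsp, aligned16(rsp))
	}
}
```

Only the 0x50 case leaves the stack pointer 8 bytes off a 16-byte boundary at the point where the function goes on to call into C, which fits both the frame-pointer theory and the proposed 8-byte increase.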

@fenrus75
Author

fenrus75 commented Aug 5, 2016

As to why others don't see this... Clear Linux is rather aggressive in its use of SSE/AVX...

@minux
Member

minux commented Aug 5, 2016 via email

@minux
Member

minux commented Aug 5, 2016 via email

@fenrus75
Author

fenrus75 commented Aug 5, 2016

sure, testing now

@minux
Member

minux commented Aug 5, 2016 via email

@fenrus75
Author

fenrus75 commented Aug 5, 2016

with that patch

misc/cgo/testcarchive

PASS

and the build completes successfully.
So thumbs-up from my side on the patch.

@ianlancetaylor
Contributor

I think people who set GOEXPERIMENT are permitted to fail.

madeye pushed a commit to shadowsocks/go that referenced this issue Aug 10, 2016
…inux_lib

Fixes golang#16618.

Change-Id: Iffada12e8672bbdbcf2e787782c497e2c45701b1
Reviewed-on: https://go-review.googlesource.com/25550
Run-TryBot: Minux Ma <[email protected]>
Reviewed-by: Arjan Van De Ven <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
@golang golang locked and limited conversation to collaborators Aug 5, 2017