lld produces broken executable with CUDA #30572
Comments
Can you attach a reproduce file? Add -Wl,--reproduce,repro to your command line, then the linker will create repro.cpio containing all input files. The cpio is an uncompressed archive format, so please gzip before attaching.
Realized that in order to see the bug, I had to run the executable, but I don't run an executable I downloaded from the internet. Could you attach source code?
Well, running random code off the internet is bad too :) Save https://gist.github.com/anonymous/855e277884eb6b388cd2f00d956c2fd4 as axpy.cu, which actually comes from http://llvm.org/docs/CompileCudaWithLLVM.html
Thanks. I took a look at the source code as well as the object file but didn't find anything apparently wrong. Investigating it more is probably hard for me because I don't have a build environment for CUDA. Do you think you can debug?
It crashes in closed source cuda runtime code: Program received signal SIGSEGV, Segmentation fault. I am not sure how to debug that.
Justin, can you help with this?
I'm not familiar with cudart internals, but at first glance it looks like it may be related to global ctors that cudart uses for initialization.
Not sure which revision fixed this but r305278 works fine now. Thanks!
This bug is not fixed yet. Here is an example of a constructor like those used by CUDA that is not called when linked with LLD: https://gist.github.com/orivej/29b9834e4621f2c69bfddf0bfc1baa1f
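The linked gist is not reproduced here. As a rough illustration of the kind of constructor being discussed, the following is a minimal sketch, assuming GCC or Clang on an ELF/Linux target. All names in it (`legacy_ctor`, `legacy_entry`, `modern_ctor`, `report_constructors`) are hypothetical and are not taken from the gist or from cudart. It places one function pointer in `.ctors` by hand and registers a second function with the constructor attribute, so the two mechanisms can be compared under different linkers.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flags recording which initializers ran before main(). */
static int ran_legacy_ctor = 0;
static int ran_modern_ctor = 0;

static void legacy_ctor(void) { ran_legacy_ctor = 1; }

/* Hand-placed function pointer in .ctors, similar in spirit to what an
   old-style prebuilt static library might contain. Whether it is ever
   called depends on the linker and the startup files. */
static void (*legacy_entry)(void)
    __attribute__((section(".ctors"), used)) = legacy_ctor;

/* Compiler-managed constructor. Recent GCC and Clang normally emit this
   into .init_array on Linux. */
__attribute__((constructor))
static void modern_ctor(void) { ran_modern_ctor = 1; }

/* Call from main() to see which initializers ran. */
void report_constructors(void) {
    /* The attribute-based constructor is expected to have run already. */
    assert(ran_modern_ctor == 1);
    printf("hand-placed .ctors entry ran: %d\n", ran_legacy_ctor);
    printf("constructor attribute ran:    %d\n", ran_modern_ctor);
}
```

Calling `report_constructors()` from `main()` and linking the same object with different `-fuse-ld=` choices should show whether the hand-placed `.ctors` entry is honored. No particular output is claimed here, since it depends on the linker version and startup files in use.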
Can you provide a .tar created with --reproduce?
--reproduce archive with libc6=2.23-0ubuntu9 on Ubuntu 16: https://s3.amazonaws.com/orivej/bugs/llvm/31224/inputs.tar.xz
I have updated the example at https://gist.github.com/orivej/29b9834e4621f2c69bfddf0bfc1baa1f to be independent from libc.
Here is how ld decided to put .ctors into .init_array: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46770
Here is the current version of cudart [1]; it uses .ctors in usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudart_static.a
So the issue happens because the sample code assumes that .ctors will be placed into .init_array. I cannot call it a bug; I think that was implemented intentionally in LLD initially. The sample code relies on a specific behavior of bfd, which simply differs from LLD. I would be happy to work on this one if we decide we want to mimic bfd behavior here, though. Should we?
It is true that for LLD this is a missing feature rather than a bug. However, it causes programs linked with LLD to fail, it is difficult to debug, and crtbegin.o normally assumes that the linker moves .ctors into .init_array and does not handle them itself. When gcc was switching from .ctors to .init_array, they had to update the linker, with a rationale that also binds future linkers such as LLD:
gold provides a toggle --ctors-in-init-array (default) / --no-ctors-in-init-array: http://manpages.ubuntu.com/manpages/precise/man1/ld.1.html
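To make concrete what moving .ctors into .init_array involves: traditional startup code walks a .ctors table from its last entry to its first, while .init_array is walked from first to last. A linker that merges the two and wants to keep the original call order therefore has to reverse the .ctors entries. The following is a minimal sketch of that idea only, not the code of bfd, gold, or LLD. The helper names (`ctors_to_init_array`, `run_init_array`, `ctor_a`, and so on) are hypothetical, and priority suffixes such as `.ctors.NNNNN` are ignored.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*init_fn)(void);

/* Small log so the call order can be inspected afterwards. */
static char order_log[8];
static size_t order_len = 0;

static void record(char c) {
    if (order_len < sizeof order_log - 1)
        order_log[order_len++] = c;
}

/* Three hypothetical constructors, listed in .ctors as a, b, c. */
void ctor_a(void) { record('a'); }
void ctor_b(void) { record('b'); }
void ctor_c(void) { record('c'); }

/* Copy `count` entries from a .ctors-style table into an .init_array-style
   table, reversing them so that a forward walk over `init_array` calls the
   functions in the order a backward walk over `ctors` would have. */
void ctors_to_init_array(const init_fn *ctors, init_fn *init_array,
                         size_t count) {
    assert(ctors != NULL && init_array != NULL);
    for (size_t i = 0; i < count; ++i)
        init_array[i] = ctors[count - 1 - i];
}

/* Forward walk, as startup code does for .init_array. */
void run_init_array(const init_fn *init_array, size_t count) {
    for (size_t i = 0; i < count; ++i)
        init_array[i]();
}
```

With a table `{ ctor_a, ctor_b, ctor_c }`, converting and then running leaves `order_log` as "cba", which is the order in which a backward .ctors walk would have called them.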
I wonder why you are still using .ctors/.dtors. .{init,fini}_array were invented almost 20 years ago, and pretty much everybody now uses them instead of .ctors/.dtors. I do not see a reason to choose .ctors/.dtors when creating something new for CUDA.
This is not about my code: Nvidia ships static libraries without source code that use .ctors, see #30572 #c15 for an example.
Implement --ctors-in-init-array
I suggest representing .init_array/.fini_array sections as synthetic:
Orivej, what you are doing seems basically correct, but it wasn't written in an lld-ish way. We have the SyntheticSection data structure to represent virtual sections. Please take a look at this: https://reviews.llvm.org/D35509 This patch is incomplete. If you want me to finish this up, I'll do that for you. Or you can take it over.
Do you know how to report a bug to nvidia on this? Even if lld does get support for converting .ctors to .init_array, it would be good to try to drop it some time in the future.
Looks like a wontfix. FWIW, in https://reviews.llvm.org/D71434 I changed clang to not use .ctors/.dtors on generic ELF platforms.
Implement --ctors-in-init-array for LLD 11
mentioned in issue llvm/llvm-bugzilla-archive#44698
mentioned in issue llvm/llvm-bugzilla-archive#48096
Extended Description
havana ~/Downloads > clang-4.0 -v
openSUSE Linux clang version 4.0.0 (trunk 288322) (based on LLVM 4.0.0svn)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /suse/idoenmez/bin
Found candidate GCC installation: /usr/lib64/gcc/x86_64-suse-linux/6
Selected GCC installation: /usr/lib64/gcc/x86_64-suse-linux/6
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
Same works with gold (or bfd too):
Attached is the produced binary.