Symbolication of system libraries is incorrect on macOS #318

Closed
alexcrichton opened this issue May 8, 2020 · 5 comments
Labels
gimli Related to the gimli implementation

Comments

@alexcrichton
Member

This is actually an issue with both libbacktrace and gimli today, but this program:

use backtrace::Backtrace;

fn main() {
    std::thread::spawn(|| {
        println!("{:#?}", Backtrace::new());
    })
    .join()
    .unwrap();
}

generates this for libbacktrace:

...
  16:     0x7fff6f853109 - _ZL12preoptimized

and this for gimli-symbolize:

...
  16:     0x7fff6f853109 - __pthread_keys

Both of these symbolications are incorrect. According to LLDB it should be _pthread_start.

I've been trying to debug this and I can't figure out what's going on. I believe this entirely has to do with looking up an address in the symbol table of an executable (not related to debuginfo). The address from the backtrace goes through some mappings to try to get an address to look up in the symbol table.

The gimli impl correctly finds that it's placed in a segment of the libsystem_pthread.dylib system library, but the lookup in the symbol table effectively fails and __pthread_keys is just the closest symbol.

In terms of values we're looking up 0x7fff6f853109, but inside of libsystem_pthread.dylib the symbols are all "located" at very small addresses. The symbol table entry for _pthread_start is 0x6075. Additionally the "bias" or load address for libsystem_pthread.dylib is also small-ish, it's 0x8547000.

What I can't figure out is that bias + segment_addr + symbol_table_addr =~ actual_address. When we look up in the symbol table, though, we're looking for actual_address - bias; we don't factor in the segment's address. I'm not really sure why, but if we do take it into account then none of the other symbols in the main executable resolve.
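To make the failure mode concrete, here is a minimal sketch of a "closest preceding symbol" lookup. It is not the gimli or libbacktrace code; `closest_symbol` is a hypothetical helper, and every table entry except `_pthread_start` at 0x6075 is made up for illustration (the real address of `__pthread_keys` is not given in this issue). It shows that a key far above every entry in the table silently resolves to the last symbol.

```rust
/// Return the name of the symbol with the greatest address that is <= `target`.
/// `symbols` must be sorted by address. Hypothetical helper, for illustration only.
fn closest_symbol<'a>(symbols: &[(u64, &'a str)], target: u64) -> Option<&'a str> {
    // Index of the first entry whose address is strictly greater than `target`.
    let idx = symbols.partition_point(|&(addr, _)| addr <= target);
    // The entry just before that index (if any) is the closest preceding symbol.
    idx.checked_sub(1).map(|i| symbols[i].1)
}

fn main() {
    // Only the `_pthread_start` value comes from the issue; the other
    // addresses are assumptions chosen so `__pthread_keys` sorts last.
    let symbols = [
        (0x1000, "_hypothetical_first"),
        (0x6075, "_pthread_start"),
        (0x9000, "__pthread_keys"),
    ];

    let addr: u64 = 0x7fff6f853109; // address from the backtrace
    let bias: u64 = 0x8547000; // bias reported for libsystem_pthread.dylib

    // The key currently used: far larger than anything in the table.
    let key = addr - bias;
    println!("{:#x} -> {:?}", key, closest_symbol(&symbols, key));

    // An illustrative small key in the table's own range.
    println!("{:#x} -> {:?}", 0x6100u64, closest_symbol(&symbols, 0x6100));
}
```

With this made-up table, the first lookup falls off the end and returns the last entry, while a key in the same range as the table values returns `_pthread_start`.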

In any case I wanted to write this down as an issue. I suspect that something isn't being accounted for in the load commands or something like that. I've tried poring over the source of some dyld stuff but I'm not really coming up with much.

alexcrichton added the gimli (Related to the gimli implementation) label May 8, 2020
@philipc
Contributor

philipc commented May 9, 2020

Is this with the dladdr fallback removed?

@alexcrichton
Member Author

Heh yes, but for the wrong reasons. As of today gimli doesn't even attempt to symbolize symbols from system libraries because, at least on my system, they're all fat libraries and that isn't supported.

So today this appears to work with gimli (running the above shows _pthread_start), but once gimli is updated to look inside of fat libraries it no longer works (because of this issue). That update for fat libraries is right now bundled into the dladdr fallback removal because the dladdr fallback removal otherwise fails tests.

@alexcrichton
Member Author

alexcrichton commented May 9, 2020

Also, to write down some more of what I'm seeing.

| Symbol location | addr | bias | svma | value in symbol table |
|---|---|---|---|---|
| Rust executable | 0x102f8ca0d | 0x2f08000 | 0x100000000 | 0x1000849e0 |
| System Library | 0x7fff6f853109 | 0x8547000 | 0x7fff67306000 | 0x6075 |

In both cases addr - bias - svma - symbol_table is about zero. [edit: this is wrong] In the first case we need to look up addr - bias, and in the second case we need to look up addr - bias - svma. I'm not sure how to know which to look up.

Currently we look up addr - bias, which means that for system libraries we're getting the last element of the symbol table, which happens to be __pthread_keys. No idea where libbacktrace is getting _ZL12preoptimized from.
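The arithmetic in this comment can be checked with a small sketch. It only uses the four values from each row of the table; the `Row` struct and `offsets` function are hypothetical names introduced here, not part of backtrace-rs. For each row it reports how far each candidate key (addr - bias, and addr - bias - svma) lands past the symbol's table value, with `None` meaning the key falls below the symbol.

```rust
/// One row of the table: runtime address, bias, segment svma, symbol table value.
struct Row {
    name: &'static str,
    addr: u64,
    bias: u64,
    svma: u64,
    symbol: u64,
}

/// Returns (offset using addr - bias, offset using addr - bias - svma).
/// Each offset is the distance from the symbol's table value up to the key,
/// or `None` if the subtraction would underflow (key below the symbol).
fn offsets(row: &Row) -> (Option<u64>, Option<u64>) {
    let without_svma = row.addr.checked_sub(row.bias);
    let with_svma = without_svma.and_then(|k| k.checked_sub(row.svma));
    (
        without_svma.and_then(|k| k.checked_sub(row.symbol)),
        with_svma.and_then(|k| k.checked_sub(row.symbol)),
    )
}

fn main() {
    let rows = [
        Row {
            name: "Rust executable",
            addr: 0x102f8ca0d,
            bias: 0x2f08000,
            svma: 0x100000000,
            symbol: 0x1000849e0,
        },
        Row {
            name: "System library",
            addr: 0x7fff6f853109,
            bias: 0x8547000,
            svma: 0x7fff67306000,
            symbol: 0x6075,
        },
    ];
    for row in &rows {
        let (a, b) = offsets(row);
        println!("{}: addr-bias -> {:x?}, addr-bias-svma -> {:x?}", row.name, a, b);
    }
}
```

For the Rust executable, addr - bias lands 0x2d bytes past the symbol and the svma-adjusted key falls below it. For the system library, only addr - bias - svma gives a small offset (0x94), while addr - bias is off by roughly the whole svma. That matches the observation that the two cases need different lookup keys.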

@philipc
Contributor

philipc commented May 9, 2020

This lldb code looks similar to the problem you are seeing.

alexcrichton added a commit that referenced this issue May 10, 2020
This commit fixes an issue where symbolication of system libraries
didn't work on macOS. Symbolication through the symbol table was always
off by a slide amount for the library. It's not entirely clear why this
kept happening or what was going on, but some poking in LLDB's source
revealed a way we can differentiate and figure out what addresses need
to be looked up in the symbol table. Some more information is contained
in the comments of the commit itself.

Closes #318
@alexcrichton
Member Author

Ok given that LLDB code I've cooked up 6a30541 which I believe should fix this.

alexcrichton added a commit that referenced this issue May 11, 2020
alexcrichton added a commit that referenced this issue May 12, 2020