Skip to content

FFI to C segmentation fault (SIGSEGV: invalid memory reference) #65546

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
parisholley opened this issue Oct 18, 2019 · 7 comments
Closed

FFI to C segmentation fault (SIGSEGV: invalid memory reference) #65546

parisholley opened this issue Oct 18, 2019 · 7 comments
Labels
A-FFI Area: Foreign function interface (FFI) E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@parisholley
Copy link

parisholley commented Oct 18, 2019

creating a new issue since #59202 is closed, but may be related.

error: process didn't exit successfully: `/private/tmp/Device-Detection/rust/target/debug/deps/51degrees-91c4ea4c111d2f7e pattern_init --nocapture` (signal: 11, SIGSEGV: invalid memory reference)

you can check out a fork of a c library I am trying to build a wrapper for. and this error is strange, because i commented out all of my C calls that pass the CString as an arg, so i'm not even sending it anywhere, and can get it to segfault repeatedly.

On Mac:

git clone -n [email protected]:parisholley/Device-Detection.git
cd Device-Detection
git checkout rust
git lfs fetch
git lfs checkout
cd rust
cargo test --features pattern pattern_init -- --nocapture

On Ubuntu:

docker run -it ubuntu /bin/bash
apt-get update
apt-get install git git-lfs curl build-essential

git clone -n https://github.com/parisholley/Device-Detection.git
cd Device-Detection
git checkout rust
git lfs fetch
git lfs checkout
cd rust

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
rustup default nightly
cargo test --features pattern pattern_init -- --nocapture

a few things are interesting though, notice these two lines:

https://github.com/parisholley/Device-Detection/blob/7d04b4a99526bff53e35843c9a2347a4b955d110/rust/pattern.rs#L127-L132

If i comment out line 131, and only call the detect function once, there is no problem. it is only when called a second time, that SIGSEGV is thrown. I have tried using both into_raw() and as_ptr() with the same results.

what makes this EVEN more strange, is if in that same function, i comment out 78 and 111, so that only 74/75 (CString) is live, no segfault. if i do the opposite, and comment out 74/75, but leave 78/111 live, no segfault. it is only when the two, seemingly unrelated calls are in the same function, that things go crazy.

my environment:

rustc 1.40.0-nightly (1721c9685 2019-10-12)
mac os 10.14.6
llvm 9.0.0

Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin18.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

i'll try to work on creating a smaller reproduce case

@parisholley parisholley changed the title FFI to C segmentation fault FFI to C segmentation fault (SIGSEGV: invalid memory reference) Oct 18, 2019
@mati865
Copy link
Member

mati865 commented Oct 18, 2019

@parisholley are you able to obtain backtrace with gdb or lldb?

@Stargateur
Copy link
Contributor

I bet you corrupt your stack in fiftyoneDegreesProviderWorksetGet or fiftyoneDegreesWorksetRelease what are they ? I bet self.provider is not a valid.

"I have tried using both into_raw() and as_ptr() with the same results." prove you have no idea what you are doing. Also, it's very hard to follow you, you say if comment x line but not y line blabla, it's not a good way to describe a bug.

@Mark-Simulacrum Mark-Simulacrum added the E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example label Oct 18, 2019
@parisholley
Copy link
Author

parisholley commented Oct 18, 2019

tl;dr, after some sleep and a fresh look, it ended up being due to a NULL returned from the C library on the second invocation of fiftyoneDegreesProviderWorksetGet (which explains the error), however, it still does not explain why I was able to change the SEGFAULT behavior through commenting out other lines.

@Stargateur I am aware of the distinction between the two in terms of mutability and that as_ptr is all I should need, I am just offering perspective on trial/error of everything I have tried, given that I do not understand what Rust is doing as a compiler under the hood.

Nothing about your comment was respectful or helpful and I gave two ways to reproduce this problem all within about 30 seconds, so instead of "comment debugging" and assuming stupidity on the side of the contributor, actually have a productive conversation.

I will re-iterate, if you comment out lines in the code I provided, that I have clearly linked to, I am giving you multiple ways to alter the occurrence of SIGSEGV (which I know why is happening now).

Unless I am missing something, I would expect, given the NULL value returned from fiftyoneDegreesProviderWorksetGet, that both of these implementations would SIGSEGV the exact same way:

pub fn detect(&mut self, user_agent:&str){
    let agent = CString::new(user_agent).expect("CString::new failed");
    let agent_raw = agent.as_ptr();

    unsafe {
        let ws = fiftyoneDegreesProviderWorksetGet(&mut self.provider);

        fiftyoneDegreesWorksetRelease(ws);
    }
}
pub fn detect(&mut self, user_agent:&str){
    //let agent = CString::new(user_agent).expect("CString::new failed");
    //let agent_raw = agent.as_ptr();

    unsafe {
        let ws = fiftyoneDegreesProviderWorksetGet(&mut self.provider);

        fiftyoneDegreesWorksetRelease(ws);
    }
}

but in fact, the second one has 0 problems, which makes no sense, as nothing about agent/agent_raw is related to the code in the unsafe block. so understandably, when trying to debug this, I am being sent in the wrong direction and thinking it is RUST, and not the C library.

@mati865

I didn't have much lucking re-running the tests via GDB (or when reproducing the same logic in a bin), it appears to just hang after spawning the threads:

sudo rust-gdb /private/tmp/Device-Detection/rust/target/debug/deps/51degrees-91c4ea4c111d2f7e
Reading symbols from /private/tmp/Device-Detection/rust/target/debug/deps/51degrees-91c4ea4c111d2f7e...
Reading symbols from /private/tmp/Device-Detection/rust/target/debug/deps/51degrees-91c4ea4c111d2f7e.dSYM/Contents/Resources/DWARF/51degrees-91c4ea4c111d2f7e...
(gdb) run --test
Starting program: /private/tmp/Device-Detection/rust/target/debug/deps/51degrees-91c4ea4c111d2f7e --test
Note: this version of macOS has System Integrity Protection.
Because `startup-with-shell' is enabled, gdb has worked around this by
caching a copy of your shell.  The shell used by "run" is now:
    /Users/pholley/Library/Caches/gdb/bin/sh
[New Thread 0xd03 of process 48298]
[New Thread 0xe03 of process 48298]

I will also note, i can reproduce this via test or bin, release or debug...

@parisholley
Copy link
Author

the only other thing I can think of... is the FFI call, which fails due to the NULL pointer, is preventing rust from cleaning things up. as an example, if I decided to use into_raw (which isn't what I will do in my final implementation, I will use as_ptr), it is possible (based on what I read in documentation) that I need to "retake the pointer" for agent_raw. so assuming there is still a NULL bug in my FFI calls, I believe doing this implementation would clean up Rust's memory usage and prevent wrapping of errors (though very possible i'm not calling the right order of forget and/or drop).

pub fn detect(&mut self, user_agent:&str){
    let agent = CString::new(user_agent).expect("CString::new failed");
    let agent_raw = agent.into_raw();
    forget(agent_raw); // not sure this helps this example, but i tried with it here and commented out

    unsafe {
        let cstring = CString::from_raw(agent_raw);
        drop(agent_raw); // make sure that rust doesn't try to do something with this when FFI panics

        let ws = fiftyoneDegreesProviderWorksetGet(&mut self.provider);

        fiftyoneDegreesWorksetRelease(ws);
    }
}

but I still get the same error... so i'm not sure what about creating a Cstring ptr, allows the FFI call error to bubble up... but commenting it all out does not...

@Centril Centril added A-FFI Area: Foreign function interface (FFI) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 18, 2019
@parisholley
Copy link
Author

another interesting thing I have found while trying to debug, when the FFI call at issue returns a null

fiftyoneDegreesWorkset *ws = NULL;
if (pool->available > 0) {
    // Worksets are available. Take one from the end of the array.
    ws = pool->worksets[pool->available - 1];
    pool->available--;
}
return ws;

this implementation, has no SIGSEGV

pub fn detect(&mut self, user_agent:&str){
    unsafe {
        let ws = fiftyoneDegreesProviderWorksetGet(&mut self.provider);

        fiftyoneDegreesWorksetRelease(ws);
    }
}

if I alter the C library to put printf in for debugging purposes, it does SIGSEGV:

fiftyoneDegreesWorkset *ws = NULL;
if (pool->available > 0) {
    // Worksets are available. Take one from the end of the array.
    ws = pool->worksets[pool->available - 1];
    pool->available--;
    printf("available");
}
printf("unavailable");
return ws;

@jonas-schievink
Copy link
Contributor

Like in #65567, you move the provider after it is being initialized. The C library stores pointers to that struct, creating a dangling pointer. Closing since this is not a Rust bug.

@Stargateur
Copy link
Contributor

@parisholley pardon my rudeness, I sux in english so I take shortcut when I'm writing, you should not be offended because I said you have no idea what you are doing, for example, ask me to code in javascript, I will have no idea of what I'm doing. I said that because you seem so sure of you in your issue. Whereas you should have been very suspicious of your code. That a classic beginner mistake to feel they understand everything. into_raw() and as_ptr() have totally different implication, not only about the mutability.It's about onwership. The fact you try to swap them, give me a hint that you probably doing something wrong somewhere. Then, I look a little at the code, and make a bet based on my experience of debugging in C. It's 99% of the time not a bug but a human mistake. As you can see I was right, your argument was invalid so I don't think my comment was "unhelpful".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-FFI Area: Foreign function interface (FFI) E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants