Skip to content

Not joining all threads = memory leaked on process exit #759

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
faern opened this issue Apr 29, 2020 · 8 comments
Open

Not joining all threads = memory leaked on process exit #759

faern opened this issue Apr 29, 2020 · 8 comments
Labels
enhancement New feature or request

Comments

@faern
Copy link

faern commented Apr 29, 2020

Hi. I'm implementing a library at a low level which allocates and frees memory manually. To try to keep it correct I run example binaries and tests under Valgrind a lot. However, when using async-std my programs and tests very often seem to not correctly join threads before exiting, resulting in leaks that make Valgrind sad and my automated tests fail. I experience this on Linux and have not tried it elsewhere.

A simple failing test. Placed under tests/all_tests.rs:

#[async_std::test]
async fn just_sleep() {
    async_std::task::sleep(std::time::Duration::from_millis(1)).await;
}

You can run it under vanilla Valgrind with valgrind --leak-check=full ./target/debug/whatever_the_filename_is. But cargo valgrind has prettier output. Due to the racy nature of this it might not always fail, but for me it does ~90% of the runs:

$ cargo valgrind --test all_tests
....
       Error Leaked 24 B
        Info at malloc (vg_replace_malloc.c:309)
             at alloc::alloc::alloc (alloc.rs:84)
             at alloc::alloc::exchange_malloc (alloc.rs:206)
             at alloc::sync::Arc<T>::new (sync.rs:302)
             at futures_timer::global::current_thread_waker (global.rs:104)
             at futures_timer::global::run (global.rs:59)
             at futures_timer::global::HelperThread::new::{{closure}} (global.rs:28)
             at std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:126)
             at std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}} (mod.rs:470)
             at <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (panic.rs:315)
             at std::panicking::try::do_call (panicking.rs:292)
             at __rust_maybe_catch_panic (lib.rs:80)
       Error Leaked 288 B
        Info at calloc (vg_replace_malloc.c:762)
             at allocate_dtv
             at _dl_allocate_tls
             at pthread_create@@GLIBC_2.2.5
             at std::sys::unix::thread::Thread::new (thread.rs:67)
             at std::thread::Builder::spawn_unchecked (mod.rs:489)
             at std::thread::Builder::spawn (mod.rs:382)
             at futures_timer::global::HelperThread::new (global.rs:26)
             at <futures_timer::timer::TimerHandle as core::default::Default>::default (timer.rs:281)
             at futures_timer::delay::Delay::new (delay.rs:39)
             at async_std::io::timeout::timeout::{{closure}} (timeout.rs:40)
             at <std::future::GenFuture<T> as core::future::future::Future>::poll::{{closure}} (future.rs:43)
     Summary Leaked 312 B total

A very similar leak stack trace can be obtained from the following test:

#[test]
fn just_sync_sleep() {
    std::thread::spawn(move || {
        std::thread::sleep(std::time::Duration::from_millis(1));
    });
}

Which is fixed by joining all threads before exiting:

#[test]
fn just_sync_sleep() {
    let t = std::thread::spawn(move || {
        std::thread::sleep(std::time::Duration::from_millis(1));
    });
    t.join()
}

This leads me to suspect the #[async_std::main] and #[async_std::test] macros don't properly join all threads before exiting.

@dignifiedquire
Copy link
Member

Thank you for the report, two asks
(1) could you test this with #757
(2) is there an easy way for me to run the tests that you are experencing this with?

@dignifiedquire dignifiedquire added the bug Something isn't working label May 2, 2020
@dignifiedquire
Copy link
Member

I just realize, that this does not happen in #757 either, so no need to test this. Shutting down the threads could be possibly added to the macros, but it would have to be a manual call for other use cases, as we create a global runtime, and so do not get the benefit of using Drop like tokio does, where the user manages a local runtime.

@dignifiedquire dignifiedquire added enhancement New feature or request and removed bug Something isn't working labels May 2, 2020
@faern
Copy link
Author

faern commented May 3, 2020

Yes, #757 does not solve the leak.

I created a repository to easily test this and to see some differences: https://github.com/faern/leakruntime. This repository currently points to the smol branch of async-std.

One interesting thing I noted is that #[tokio::main] does actually cause a single thread leak sometimes. And #[tokio::test] as well, just very rarely. I had to run cargo valgrind on my example repo in a loop to eventually trigger it:

while cargo +1.39.0 valgrind --test tokio --features tokio; do true; done

async-std on the other hand leak many objects both from #[async_std::main] and #[async_std::test] every time. The results I got can be viewed here: https://raw.githubusercontent.com/faern/leakruntime/master/async-std-main-leak.log and https://raw.githubusercontent.com/faern/leakruntime/master/async-std-test-leak.log

@faern
Copy link
Author

faern commented May 3, 2020

Yes the difference in what the macros expand to is shown in the main.rs file in that repo. The main difference being that tokio creates a Runtime blocks on it and then drops it. While async-std just blocks on a global runtime. And I assume nothing is responsible for shutting that runtime down before the process exits.

@faern
Copy link
Author

faern commented May 3, 2020

The very rare leak found in #[tokio::test] does exist for a normal sync test looking like this as well:

#[test]
fn just_sleep() {
    std::thread::sleep(std::time::Duration::from_millis(1));
}

So that one can't be blamed on tokio. One just have to run it sufficiently many times to hit it. I think I will report that directly to Rust. But it's unrelated to all the leaks this issue is about.

@Matthias247
Copy link

async-std spawns global runtime threads for executors, timers and potentially IO in the background which will simply run forever. So the leak seems kind of by design.

@faern
Copy link
Author

faern commented May 5, 2020

Since it's possible to interact with the global runtime via for example async_std::task::spawn I guess there could just as well be a async_std::shutdown function that cancels and frees all resources of the global runtime. It could then expand the #[async_std::main] macro to end main with this call. For tests it would be harder since there can be multiple tests. But doable I think

I get that there are a lot of users who don't care about this, and the OS will clean up stuff as the process exits anyway. So it's up to you if you prioritize this at all. I just wanted to report my findings.

@hu55a1n1
Copy link

hu55a1n1 commented Jun 25, 2020

I seem to have the same issue, but I didn't have it in v1.5.0. It only shows up in v1.6.2. FWIW, also happens if I set RUST_TEST_THREADS=1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants