-
Notifications
You must be signed in to change notification settings - Fork 14.6k
[MLIR] Fix use-after-frees when accessing DistinctAttr storage #148666
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MLIR] Fix use-after-frees when accessing DistinctAttr storage #148666
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-core Author: Artemiy Bulavin (abulavin) ChangesThis PR fixes a use-after-free error that happens when This PR introduces a way to switch the allocation strategy for Problem Details:The Solution:The
To switch between these modes an
Testing:A C++ unit test has been added to validate this fix. (I was previously reproducing this failure with Note: This is a 2nd attempt at my previous PR #128566 that was reverted in #132935. I believe I've addressed the TSAN and race condition concerns. Full diff: https://github.com/llvm/llvm-project/pull/148666.diff 8 Files Affected:
diff --git a/mlir/include/mlir/IR/MLIRContext.h b/mlir/include/mlir/IR/MLIRContext.h
index ef8dab87f131a..a4ac3ab08b53c 100644
--- a/mlir/include/mlir/IR/MLIRContext.h
+++ b/mlir/include/mlir/IR/MLIRContext.h
@@ -153,6 +153,13 @@ class MLIRContext {
disableMultithreading(!enable);
}
+ /// Set the flag specifying if thread-local storage should be used by
+ /// attribute storage allocators in this context.
+ void disableThreadLocalStorage(bool disable = true);
+ void enableThreadLocalStorage(bool enable = true) {
+ disableThreadLocalStorage(!enable);
+ }
+
/// Set a new thread pool to be used in this context. This method requires
/// that multithreading is disabled for this context prior to the call. This
/// allows to share a thread pool across multiple contexts, as well as
diff --git a/mlir/include/mlir/Pass/PassManager.h b/mlir/include/mlir/Pass/PassManager.h
index 6e59b0f32ac6f..e6260415a83e1 100644
--- a/mlir/include/mlir/Pass/PassManager.h
+++ b/mlir/include/mlir/Pass/PassManager.h
@@ -17,6 +17,7 @@
#include "llvm/Support/raw_ostream.h"
#include <functional>
+#include <mutex>
#include <optional>
namespace mlir {
@@ -497,6 +498,10 @@ class PassManager : public OpPassManager {
/// A flag that indicates if the IR should be verified in between passes.
bool verifyPasses : 1;
+
+ /// A mutex used to serialixe thread access when running the pass manager with
+ /// crash reproduction enabled.
+ std::mutex crashRecoveryLock;
};
/// Register a set of useful command-line options that can be used to configure
diff --git a/mlir/lib/IR/AttributeDetail.h b/mlir/lib/IR/AttributeDetail.h
index 26d40ac3a38f6..820fb021d96fe 100644
--- a/mlir/lib/IR/AttributeDetail.h
+++ b/mlir/lib/IR/AttributeDetail.h
@@ -22,8 +22,10 @@
#include "mlir/Support/StorageUniquer.h"
#include "mlir/Support/ThreadLocalCache.h"
#include "llvm/ADT/APFloat.h"
-#include "llvm/ADT/PointerIntPair.h"
-#include "llvm/Support/TrailingObjects.h"
+#include "llvm/Support/Allocator.h"
+#include <atomic>
+#include <cstdint>
+#include <mutex>
namespace mlir {
namespace detail {
@@ -396,27 +398,112 @@ class DistinctAttributeUniquer {
Attribute referencedAttr);
};
-/// An allocator for distinct attribute storage instances. It uses thread local
-/// bump pointer allocators stored in a thread local cache to ensure the storage
-/// is freed after the destruction of the distinct attribute allocator.
-class DistinctAttributeAllocator {
+/// An allocator for distinct attribute storage instances. It uses thread-local
+/// bump pointer allocators stored in a thread-local cache to ensure the storage
+/// is freed after the destruction of the distinct attribute allocator. The way
+/// in which storage is allocated may be changed from a thread-local allocator
+/// to a shared, locked allocator. This can be used to prevent use-after-free
+/// errors if storage is allocated on a thread that has a lifetime less than the
+/// lifetime of the storage.
+class DistinctAttributeAllocator final {
public:
- DistinctAttributeAllocator() = default;
+ /// The allocation strategy for DistinctAttribute storage.
+ enum class AllocationMode : uint8_t {
+ /// Use a thread-local allocator. Lock-free, but storage lifetime is tied to
+ /// a thread, which may have shorter lifetime than the storage allocated
+ /// here.
+ ThreadLocal,
+ /// Use a single, shared allocator protected by a mutex. Slower due to
+ /// locking, but storage persists for the lifetime of the MLIR context.
+ SharedLocked,
+ };
+
+ DistinctAttributeAllocator() {
+ setAllocationMode(AllocationMode::ThreadLocal);
+ };
DistinctAttributeAllocator(DistinctAttributeAllocator &&) = delete;
DistinctAttributeAllocator(const DistinctAttributeAllocator &) = delete;
DistinctAttributeAllocator &
operator=(const DistinctAttributeAllocator &) = delete;
- /// Allocates a distinct attribute storage using a thread local bump pointer
- /// allocator to enable synchronization free parallel allocations.
+ /// Allocates a distinct attribute storage. In most cases this will be
+ /// using a thread-local allocator for synchronization free parallel accesses,
+ /// however if the PassManager's crash recovery is enabled this will incur
+ /// synchronization cost in exchange for persisting the storage.
DistinctAttrStorage *allocate(Attribute referencedAttr) {
- return new (allocatorCache.get().Allocate<DistinctAttrStorage>())
- DistinctAttrStorage(referencedAttr);
+ return std::invoke(allocatorFn.load(), *this, referencedAttr);
+ }
+
+ /// Sets the allocation mode. This is used by the MLIRContext to switch
+ /// allocation strategies, for example, during crash recovery.
+ void setAllocationMode(AllocationMode mode) {
+ switch (mode) {
+ case AllocationMode::ThreadLocal:
+ allocatorFn.store(&DistinctAttributeAllocator::allocateThreadLocalWrapper,
+ std::memory_order_release);
+ break;
+ case AllocationMode::SharedLocked:
+ allocatorFn.store(
+ &DistinctAttributeAllocator::allocateLockedSharedWrapper,
+ std::memory_order_release);
+ break;
+ }
}
private:
+ // Define some static wrappers for allocating using either the thread-local
+ // allocator or the shared locked allocator. We use these static wrappers to
+ // ensure that the width of the function pointer to these functions is the
+ // size of regular pointer on the platform (which is not always the case for
+ // pointer-to-member-function), and so should be handled by atomic
+ // instructions.
+ static DistinctAttrStorage *
+ allocateThreadLocalWrapper(DistinctAttributeAllocator &self,
+ Attribute referencedAttr) {
+ return self.allocateUsingThreadLocal(referencedAttr);
+ }
+
+ static DistinctAttrStorage *
+ allocateLockedSharedWrapper(DistinctAttributeAllocator &self,
+ Attribute referencedAttr) {
+ return self.allocateUsingLockedShared(referencedAttr);
+ }
+
+ DistinctAttrStorage *allocateUsingLockedShared(Attribute referencedAttr) {
+ std::scoped_lock<std::mutex> guard(sharedAllocatorMutex);
+ return doAllocate(sharedAllocator, referencedAttr);
+ }
+
+ DistinctAttrStorage *allocateUsingThreadLocal(Attribute referencedAttr) {
+ return doAllocate(allocatorCache.get(), referencedAttr);
+ };
+
+ DistinctAttrStorage *doAllocate(llvm::BumpPtrAllocator &allocator,
+ Attribute referencedAttr) {
+ return new (allocator.Allocate<DistinctAttrStorage>())
+ DistinctAttrStorage(referencedAttr);
+ };
+
+ /// Used to allocate Distinct Attribute storage without synchronization
ThreadLocalCache<llvm::BumpPtrAllocator> allocatorCache;
+
+ /// Used to allocate Distict Attribute storage with synchronization,
+ /// but without bounding the lifetime of the allocated memory to the
+ /// lifetime of a thread.
+ llvm::BumpPtrAllocator sharedAllocator;
+
+ /// Used to lock access to the shared allocator
+ std::mutex sharedAllocatorMutex;
+
+ using AllocatorFuncPtr =
+ DistinctAttrStorage *(*)(DistinctAttributeAllocator &, Attribute);
+
+ /// Atomic function pointer for the allocation strategy. Using a pointer to a
+ /// static member function as opposed to a non-static one should guarantee
+ /// that the function pointer fits into a standard pointer size, and so will
+ /// atomic instruction support for the corresponding width.
+ std::atomic<AllocatorFuncPtr> allocatorFn;
};
} // namespace detail
} // namespace mlir
diff --git a/mlir/lib/IR/CMakeLists.txt b/mlir/lib/IR/CMakeLists.txt
index 4cabac185171c..ac1110a6ebda4 100644
--- a/mlir/lib/IR/CMakeLists.txt
+++ b/mlir/lib/IR/CMakeLists.txt
@@ -68,5 +68,7 @@ add_mlir_library(MLIRIR
LINK_LIBS PUBLIC
MLIRSupport
+ PRIVATE
+ atomic
)
diff --git a/mlir/lib/IR/MLIRContext.cpp b/mlir/lib/IR/MLIRContext.cpp
index 716d9c85a377d..dd7999201f116 100644
--- a/mlir/lib/IR/MLIRContext.cpp
+++ b/mlir/lib/IR/MLIRContext.cpp
@@ -711,6 +711,13 @@ bool MLIRContext::isOperationRegistered(StringRef name) {
return RegisteredOperationName::lookup(name, this).has_value();
}
+void MLIRContext::disableThreadLocalStorage(bool disable) {
+ const auto mode =
+ disable ? DistinctAttributeAllocator::AllocationMode::SharedLocked
+ : DistinctAttributeAllocator::AllocationMode::ThreadLocal;
+ getImpl().distinctAttributeAllocator.setAllocationMode(mode);
+}
+
void Dialect::addType(TypeID typeID, AbstractType &&typeInfo) {
auto &impl = context->getImpl();
assert(impl.multiThreadedExecutionContext == 0 &&
diff --git a/mlir/lib/Pass/PassCrashRecovery.cpp b/mlir/lib/Pass/PassCrashRecovery.cpp
index b048ff9462392..ceeef353c2f2f 100644
--- a/mlir/lib/Pass/PassCrashRecovery.cpp
+++ b/mlir/lib/Pass/PassCrashRecovery.cpp
@@ -414,14 +414,36 @@ struct FileReproducerStream : public mlir::ReproducerStream {
LogicalResult PassManager::runWithCrashRecovery(Operation *op,
AnalysisManager am) {
+ // Notify the context to disable the use of thread-local storage for
+ // allocating attribute storage while the pass manager is running in a crash
+ // recovery context thread. Re-enable the thread-local storage upon function
+ // exit. This is required to persist any attribute storage allocated in
+ // thread-local storage during passes beyond the lifetime of the recovery
+ // context thread.
+ const bool threadingEnabled = getContext()->isMultithreadingEnabled();
+ if (threadingEnabled) {
+ crashRecoveryLock.lock();
+ getContext()->disableThreadLocalStorage();
+ }
+ auto guard = llvm::make_scope_exit([this]() {
+ getContext()->enableThreadLocalStorage();
+ crashRecoveryLock.unlock();
+ });
+
crashReproGenerator->initialize(getPasses(), op, verifyPasses);
// Safely invoke the passes within a recovery context.
LogicalResult passManagerResult = failure();
llvm::CrashRecoveryContext recoveryContext;
- recoveryContext.RunSafelyOnThread(
- [&] { passManagerResult = runPasses(op, am); });
+ const auto runPassesFn = [&] { passManagerResult = runPasses(op, am); };
+ if (threadingEnabled)
+ recoveryContext.RunSafelyOnThread(runPassesFn);
+ else
+ recoveryContext.RunSafely(runPassesFn);
crashReproGenerator->finalize(op, passManagerResult);
+
+ if (!threadingEnabled)
+ guard.release();
return passManagerResult;
}
diff --git a/mlir/unittests/IR/CMakeLists.txt b/mlir/unittests/IR/CMakeLists.txt
index d22afb3003e76..a46e64718dab9 100644
--- a/mlir/unittests/IR/CMakeLists.txt
+++ b/mlir/unittests/IR/CMakeLists.txt
@@ -6,6 +6,7 @@ add_mlir_unittest(MLIRIRTests
AttrTypeReplacerTest.cpp
Diagnostic.cpp
DialectTest.cpp
+ DistinctAttributeAllocatorTest.cpp
InterfaceTest.cpp
IRMapping.cpp
InterfaceAttachmentTest.cpp
diff --git a/mlir/unittests/IR/DistinctAttributeAllocatorTest.cpp b/mlir/unittests/IR/DistinctAttributeAllocatorTest.cpp
new file mode 100644
index 0000000000000..f03b3fab87ace
--- /dev/null
+++ b/mlir/unittests/IR/DistinctAttributeAllocatorTest.cpp
@@ -0,0 +1,54 @@
+//===- DistinctAttributeAllocatorTest.cpp - DistinctAttr storage alloc test
+//-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===-----------------------------------------------------------------------===//
+
+#include "gtest/gtest.h"
+
+#include "mlir/IR/Builders.h"
+#include "mlir/IR/BuiltinAttributes.h"
+#include "mlir/IR/MLIRContext.h"
+#include "llvm/ADT/ScopeExit.h"
+#include "llvm/Support/CrashRecoveryContext.h"
+#include <thread>
+
+using namespace mlir;
+
+//
+// Test that a DistinctAttr that is created on a separate thread does
+// not have its storage deleted when thread local storage is disabled
+// on the MLIRContext.
+//
+TEST(DistinctAttributeAllocatorTest, TestAttributeWellFormedAfterThreadJoin) {
+ MLIRContext ctx;
+ OpBuilder builder(&ctx);
+ DistinctAttr attr;
+
+ {
+ ctx.disableThreadLocalStorage();
+ auto guard =
+ llvm::make_scope_exit([&ctx]() { ctx.enableThreadLocalStorage(); });
+
+ std::thread t([&ctx, &attr]() {
+ attr = DistinctAttr::create(UnitAttr::get(&ctx));
+ ASSERT_TRUE(attr);
+ });
+ t.join();
+ }
+
+ // If the attribute storage got deleted after the thread joins (which we don't
+ // want) then trying to access it triggers an assert in Debug mode, and a
+ // crash otherwise. Run this in a CrashRecoveryContext to avoid bringing down
+ // the whole test suite if this test fails. Additionally, MSAN and/or TSAN
+ // should raise failures here if the attribute storage was deleted.
+ llvm::CrashRecoveryContext crc;
+ EXPECT_TRUE(crc.RunSafely([attr]() { (void)attr.getAbstractAttribute(); }));
+ EXPECT_TRUE(
+ crc.RunSafely([attr]() { (void)*cast<Attribute>(attr).getImpl(); }));
+
+ ASSERT_TRUE(attr);
+}
|
@gysit : you implemented this |
Looking at the implementation of disable multi-threading it seems like it destroys the thread-pool. My understanding is that this will also free the allocators in the thread local cache (i.e. the IR would break). A thread local cache, for this use case, only makes sense if the thread pool is alive during the entire lifetime of the mlir context. |
Do you mean "storage" instead of "cache" here? (to me a per-thread cache is always fine because it stays private to the thread). It seems that there is a deeper problem with this storage to address than what this PR is trying to do then! |
I meant the I tried to follow the PR and I do not yet understand why this solution works. Disabling the threading seems fundamentally incompatible with using a I wonder if it would make sense to abandon |
Hi both, thanks for your quick input on this.
A lock based solution is definitely much simpler IMO. In theory, a lock based solution pays the price of synchronisation overhead, in exchange for My last commit(s) revert to just a single, synchronized allocator. I will update the PR description to better reflect the changes once we're all agreed. Let me know what you think. |
@@ -31,6 +31,7 @@ | |||
#include "llvm/Support/CommandLine.h" | |||
#include "llvm/Support/Compiler.h" | |||
#include "llvm/Support/Debug.h" | |||
#include "llvm/Support/ManagedStatic.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Required on L74. I've juggled some headers around so I think this was being transitively included by accident prior.
Can you update the title / description of the PR? |
@joker-eph Updated 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice this is much simpler!
There are some typos but otherwise this looks good from my end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
LGTM modulo last nit.
Thanks for the approval. I don't have write access, please could this be merged for me? |
Sure, let's give others a chance to have a look as well and I can land tomorrow morning (should there be no comments no one else who merges the PR). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Can you rebase? The CI failure seems unrelated. |
b07b313
to
40fca8d
Compare
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/38311 Here is the relevant piece of the build log for the reference
|
This PR fixes a use-after-free error that happens when
DistinctAttr
instances are created within aPassManager
running with crash recovery enabled. The root cause is thatDistinctAttr
storage is allocated in a thread_local allocator, which is destroyed when the crash recovery thread joins, invalidating the storage.Moreover, even without crash reproduction disabling multithreading on the context will destroy the context's thread pool, and in turn delete the threadlocal storage. This means a call to
ctx->disableMulthithreading()
breaks the IR.This PR replaces the thread local allocator with a synchronised allocator that's shared between threads. This persists the lifetime of allocated DistinctAttr storage instances to the lifetime of the context.
Problem Details:
The
DistinctAttributeAllocator
uses aThreadLocalCache<BumpPtrAllocator>
for lock-free allocation ofDistinctAttr
storage in a multithreaded context. The issue occurs when aPassManager
is run with crash recovery (runWithCrashRecovery
), the pass pipeline is executed on a temporary thread spawned byllvm::CrashRecoveryContext
. AnyDistinctAttr
s created during this execution have their storage allocated in the thread_local cache of this temporary thread. When the thread joins, the thread_local storage is destroyed, freeing theDistinctAttr
s' memory. If this attribute is accessed later, e.g. when printing, it results in a use-after-free.As mentioned previously, this is also seen after creating some
DistinctAttr
s and then callingctx->disableMulthithreading()
.### Solution
DistinctAttrStorageAllocator
uses a synchronised, shared allocator instead of one wrapped in aThreadLocalCache
. The former is what stores the allocator in transient thread_local storage.Testing:
A C++ unit test has been added to validate this fix. (I was previously reproducing this failure with
mlir-opt
but I can no longer do so and I am unsure why.)Note: This is a 2nd attempt at my previous PR #128566 that was reverted in #133000. I believe I've addressed the TSAN and race condition concerns.