Atomic memory allocator #521
Conversation
Atomic allocator for shared memory
Would it be feasible, as an alternative, to create a common wrapper around any existing memory allocator, given that the interface is always the same? That is, guarantee that at most one thread at a time executes an allocation or free operation, via a lock taken whenever such an operation is attempted?
We need a global variable
My thought process was that, if we had a common wrapper, we could provide for example allocator/tlsf.atomic, allocator/buddy.atomic, allocator/arena.atomic that use the respective MM but also the common atomic wrapper that sits between the user and the MM. I'm not sure about the locking overhead, though. It might be significant when locking each attempt to allocate/free.
Allocation in shared memory is only possible with atomic operations; otherwise, threads can try to allocate different-size blocks at the same location, which will cause memory corruption.

There is another way: we could restrict allocation to the main thread only, by importing the allocation functions into each thread from the main instance. But if my understanding is correct, that's not possible right now, since wasm only supports worker-based threads and there is no way to pass exports from the main thread to workers. So we need to stick with atomic operations to allocate memory from threads.
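To illustrate the point above, here is a minimal sketch (not from the thread, all names illustrative) of a bump allocator whose top-of-heap pointer lives in shared memory and is advanced with a compare-exchange, so two threads can never be handed overlapping blocks:

```javascript
// Sketch: a bump allocator over a SharedArrayBuffer. The heap-top word is
// only ever advanced via Atomics.compareExchange, so concurrent callers
// retry instead of corrupting each other's blocks.
const sab = new SharedArrayBuffer(1024);
const ctrl = new Int32Array(sab);
const TOP = 0;        // ctrl[TOP] holds the current heap top (byte offset)
ctrl[TOP] = 16;       // start allocating past the control area

function alloc(size) {
  // retry until our compareExchange wins the race
  for (;;) {
    const cur = Atomics.load(ctrl, TOP);
    if (Atomics.compareExchange(ctrl, TOP, cur, cur + size) === cur) {
      return cur; // we atomically reserved [cur, cur + size)
    }
  }
}

console.log(alloc(8)); // 16
console.log(alloc(4)); // 24
```

With a plain non-atomic `ctrl[TOP] += size`, two workers could read the same top value and both "reserve" the same region, which is exactly the corruption described above.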
So, what if we'd designate, let's say, memory offset 8 to hold a value indicating whether any thread is currently within either allocate or free? Something like

```ts
function memory_allocate(size: usize): usize {
  // spin until we atomically swap the flag at offset 8 from 0 (free) to 1 (held)
  while (atomic.cmpxchg<i32>(8, 0, 1)) {}
  var ret = original_memory_allocate(size);
  store<i32>(8, 0); // release the flag
  return ret;
}
```

Wouldn't that work with any allocator if all threads then used the wrapper?
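The wrapper idea can be sketched in plain JS Atomics (the thread's snippet is AssemblyScript targeting wasm; `flags`, `LOCK`, and `originalAllocate` here are illustrative stand-ins):

```javascript
// Sketch: one shared flag word serializes every call into an otherwise
// non-thread-safe allocator, matching the memory_allocate wrapper above.
const sab = new SharedArrayBuffer(64);
const flags = new Int32Array(sab);
const LOCK = 0; // flags[LOCK]: 0 = free, 1 = held

let top = 16; // stand-in for the wrapped allocator's internal state
function originalAllocate(size) { const p = top; top += size; return p; }

function memoryAllocate(size) {
  // spin until we swap 0 -> 1, i.e. until we own the lock
  while (Atomics.compareExchange(flags, LOCK, 0, 1) !== 0) {}
  const ret = originalAllocate(size);
  Atomics.store(flags, LOCK, 0); // release
  return ret;
}

console.log(memoryAllocate(8)); // 16
```

Since the wrapper never touches the wrapped allocator's internals, it would indeed work with any allocator, at the cost of fully serializing allocate/free.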
Yes, this will work
Maybe it's better to use a futex for this long lock section? I mean Atomic.wait/notify.
wait/notify should be slightly faster than a busy-wait while loop.
So, according to the example in the threads spec, a working mechanism here could be

```ts
function lock(addr: usize): void {
  // try to swap 0 -> 1; while the lock is taken, sleep until notified
  while (atomic.cmpxchg<i32>(addr, 0, 1)) {
    atomic.wait<i32>(addr, 1, -1); // wait while the value is 1, no timeout
  }
}

function unlock(addr: usize): void {
  atomic.store<i32>(addr, 0);
  atomic.notify<i32>(addr, 1); // wake one waiter
}

const MM_LOCK: usize = 8;

function memory_allocate(size: usize): usize {
  lock(MM_LOCK);
  var ret = original_allocate(size);
  unlock(MM_LOCK);
  return ret;
}

function memory_free(addr: usize): void {
  lock(MM_LOCK);
  original_free(addr);
  unlock(MM_LOCK);
}
```

Now, if shared memory is enabled with the respective compiler flag, the compiler would automatically inject the lock/unlock wrappers. Special care must be taken when initializing a memory allocator, of course. The main thread will usually set it up while threads inherit its state, which can be done conditionally, for example based on a check of whether the instance is the main thread.
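A rough JS equivalent of the lock/unlock pair above (an assumption for illustration, not the thread's exact code) uses `Atomics.wait`/`Atomics.notify` instead of pure spinning. Note that `Atomics.wait` blocks, so in a browser this belongs in worker threads:

```javascript
// Sketch: futex-style lock. A waiter sleeps while the lock word is 1 and is
// woken by notify on unlock, instead of burning CPU in a spin loop.
const sab = new SharedArrayBuffer(16);
const state = new Int32Array(sab);
const MM_LOCK = 0; // state[MM_LOCK]: 0 = free, 1 = held

function lock() {
  while (Atomics.compareExchange(state, MM_LOCK, 0, 1) !== 0) {
    // sleep until notified, or return immediately if the value is no longer 1
    Atomics.wait(state, MM_LOCK, 1);
  }
}

function unlock() {
  Atomics.store(state, MM_LOCK, 0);
  Atomics.notify(state, MM_LOCK, 1); // wake one waiter
}
```

In the uncontended case the compare-exchange succeeds on the first try, so `Atomics.wait` is never reached and the fast path stays cheap.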
A more advanced mutex implementation with spin locks:

```ts
const SPIN_LOCK_ITER_LIMIT: i32 = 128;
// lock states: 0 = unlocked, 1 = locked (no waiters), 2 = locked (maybe waiters)

function mutexLock(addr: usize): void {
  var stat = 0;
  // spin first, hoping to acquire the lock without sleeping
  for (let i = 0; i < SPIN_LOCK_ITER_LIMIT; i++) {
    stat = atomic.cmpxchg<i32>(addr, 0, 1);
    if (!stat) return; // acquired
  }
  // contended: mark the lock as "maybe waiters" and sleep until it frees up
  if (stat == 1) stat = atomic.xchg<i32>(addr, 2);
  while (stat) {
    atomic.wait<i32>(addr, 2, -1); // wait while the value is still 2
    stat = atomic.xchg<i32>(addr, 2);
  }
}

function mutexUnlock(addr: usize): void {
  // if the old value was 1, nobody is waiting and no wakeup is needed
  if (atomic.xchg<i32>(addr, 0) == 1) return;
  // give spinners a brief chance to grab the lock before waking a sleeper
  for (let i = 0; i < SPIN_LOCK_ITER_LIMIT; i++) {
    if (atomic.load<i32>(addr) && atomic.cmpxchg<i32>(addr, 1, 2)) return;
  }
  atomic.notify<i32>(addr, 1);
}
```
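The same three-state scheme can be rendered as runnable JS (a hedged sketch in the style of Drepper's "Futexes Are Tricky"; `mu`, `M`, and `SPIN_LIMIT` are illustrative names, and nothing here is benchmarked):

```javascript
// Sketch: three-state futex mutex.
// 0 = unlocked, 1 = locked (no waiters), 2 = locked (maybe waiters).
const SPIN_LIMIT = 128;
const sab = new SharedArrayBuffer(16);
const mu = new Int32Array(sab);
const M = 0;

function mutexLock() {
  let stat = 0;
  // spin first, hoping to acquire without a context switch
  for (let i = 0; i < SPIN_LIMIT; i++) {
    stat = Atomics.compareExchange(mu, M, 0, 1);
    if (stat === 0) return; // acquired while spinning
  }
  // contended path: mark "maybe waiters" and sleep until the lock frees up
  if (stat === 1) stat = Atomics.exchange(mu, M, 2);
  while (stat !== 0) {
    Atomics.wait(mu, M, 2); // sleep only while the value is still 2
    stat = Atomics.exchange(mu, M, 2);
  }
}

function mutexUnlock() {
  // old value 1 means no thread ever slept, so no wakeup is needed
  if (Atomics.exchange(mu, M, 0) === 1) return;
  Atomics.notify(mu, M, 1);
}
```

The payoff of the third state is exactly the tradeoff discussed below: an uncontended unlock skips the `notify` syscall-like call entirely.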
What does it do? Spin-lock first, and if that doesn't work, wait, in order to reduce context switches?
It just does a spin lock limited to 128 iterations, and afterwards falls back to atomic.wait.
Ideally, after each iteration we should signal the CPU to relax (something like a pause/spin-hint instruction).
I see, that's the usual tradeoff between wasting cycles and switching context, then. I wonder how that'd compare to a naive wait/notify approach without a way to signal the CPU.
Yeah, we definitely need a benchmark.
One more remaining building block for a shared-memory memory manager/GC, apart from locking, appears to be that the current implementations use globals to store some of their state. If I'm not mistaken, globals (except immutable globals like __heap_base) are not shared and their values differ between threads, so the information stored there must be synchronized somehow, for example by storing it inside the MM/GC control structure in memory.
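The point about moving state out of globals can be sketched like this (layout and offsets are made up for illustration): instead of a per-instance module global, allocator state lives in a small control structure at a fixed offset in shared memory, so every instance observes the same values.

```javascript
// Sketch: a control block in shared memory. Two independent views over the
// same SharedArrayBuffer simulate two wasm instances (main thread + worker).
const sab = new SharedArrayBuffer(256);
const viewA = new Int32Array(sab); // "main thread" instance
const viewB = new Int32Array(sab); // "worker" instance

const CTRL_LOCK = 0; // word 0: allocator lock
const CTRL_TOP  = 1; // word 1: current heap top

// the "main thread" initializes the allocator state once
Atomics.store(viewA, CTRL_TOP, 64);

// the "worker" sees the same state without any message passing,
// which would not be true for a mutable module global
console.log(Atomics.load(viewB, CTRL_TOP)); // 64
```

A mutable global, by contrast, is instantiated per module instance, so each thread would silently get its own diverging copy of the heap top.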
Are mutable globals thread-safe?
Globals are not shared between instances. There is no way to use globals unless they are immutable. We need to store all data shared between the instances in shared memory.
Closing this PR as part of the 2020 vacuum, as it appears to be outdated. In general there are still some open questions regarding a thread-safe allocator, in particular whether we should rather think about a more JS-y approach like workers and postMessage, keeping allocation local to each thread.