Add a port of mimalloc, a fast and scalable multithreaded allocator #20651

Merged
merged 56 commits into from
Nov 16, 2023
56 commits
3c0ae3a
import
kripken Nov 6, 2023
cc53529
comment
kripken Nov 6, 2023
fcef3bc
comment [ci skip]
kripken Nov 6, 2023
910e0fa
yolo
kripken Nov 6, 2023
4b41147
yolo
kripken Nov 6, 2023
dfb96d1
yolo [ci skip]
kripken Nov 6, 2023
9c451f3
yolo [ci skip]
kripken Nov 6, 2023
4715e80
yolo [ci skip]
kripken Nov 6, 2023
ff348f9
undo
kripken Nov 6, 2023
415fa21
yolo
kripken Nov 6, 2023
072bd56
yolo
kripken Nov 6, 2023
28afc20
prep for benchmark
kripken Nov 6, 2023
c55872c
work
kripken Nov 7, 2023
862e74d
bad [ci skip]
kripken Nov 7, 2023
f717f84
Undo
kripken Nov 7, 2023
373c343
fix
kripken Nov 7, 2023
2dcc313
worksen [ci skip]
kripken Nov 7, 2023
6d29f45
test mode
kripken Nov 7, 2023
46d50fd
fix
kripken Nov 7, 2023
9139636
builden
kripken Nov 7, 2023
fdb22cf
testen
kripken Nov 7, 2023
e0a1512
flake
kripken Nov 7, 2023
59f363c
focus
kripken Nov 7, 2023
795ed0f
work
kripken Nov 7, 2023
fffa2f5
test
kripken Nov 7, 2023
8cfbaad
test
kripken Nov 7, 2023
6ce7901
work
kripken Nov 7, 2023
5df421f
test
kripken Nov 7, 2023
c9e44d6
work
kripken Nov 7, 2023
1dd9d37
test
kripken Nov 7, 2023
aedf114
test
kripken Nov 7, 2023
e55e172
undo
kripken Nov 7, 2023
45aacd7
docs
kripken Nov 7, 2023
b3e580d
align
kripken Nov 7, 2023
13e1b4c
undo
kripken Nov 7, 2023
90b1570
undo
kripken Nov 7, 2023
9e1210f
fix
kripken Nov 7, 2023
923bb8a
undo
kripken Nov 7, 2023
9f252d1
undo
kripken Nov 7, 2023
f96a478
Merge remote-tracking branch 'origin/main' into mimalloc
kripken Nov 7, 2023
c73b406
undo
kripken Nov 7, 2023
fe4090d
update
kripken Nov 7, 2023
fbca4bc
docs
kripken Nov 7, 2023
bebd586
fix
kripken Nov 7, 2023
e520fcd
fix
kripken Nov 7, 2023
f40dd26
fix
kripken Nov 7, 2023
3ae0e2a
fix
kripken Nov 7, 2023
22ac2b0
Merge remote-tracking branch 'origin/main' into mimalloc
kripken Nov 8, 2023
193311f
notes [ci skip]
kripken Nov 8, 2023
6cb2248
comment
kripken Nov 8, 2023
879d9b9
use glob as much as possible [ci skip]
kripken Nov 8, 2023
bdddfce
Rename test file to test_malloc_multithreading
kripken Nov 8, 2023
4d2393d
single line
kripken Nov 8, 2023
3ffdce8
Merge remote-tracking branch 'origin/main' into mimalloc
kripken Nov 8, 2023
512dccf
separate lines
kripken Nov 8, 2023
9b9232c
Merge remote-tracking branch 'origin/main' into mimalloc
kripken Nov 16, 2023
2 changes: 2 additions & 0 deletions ChangeLog.md
@@ -20,6 +20,8 @@ See docs/process.md for more on how version tagging works.

3.1.50 (in development)
-----------------------
- Add a port of mimalloc, a fast and scalable multithreaded allocator. To use
it, build with `-sMALLOC=mimalloc`. (#20651)
- When compiling, Emscripten will now invoke `clang` or `clang++` depending only
on whether `emcc` or `em++` was run. Previously it would determine which to
run based on individual file extensions. One side effect of this is that you
2 changes: 2 additions & 0 deletions embuilder.py
@@ -56,6 +56,8 @@
'libemmalloc-memvalidate',
'libemmalloc-verbose',
'libemmalloc-memvalidate-verbose',
'libmimalloc',
Member

Is there any reason someone would want to use mimalloc instead of dlmalloc when not building with threads?

Member Author

@kripken kripken Nov 8, 2023

mimalloc is actually faster even with a single core, see the first chart. (But it is substantially larger so I doubt it would be a common thing.)

'libmimalloc-mt',
'libGL',
'libhtml5',
'libsockets',
9 changes: 9 additions & 0 deletions site/source/docs/optimizing/Optimizing-Code.rst
@@ -221,6 +221,15 @@ Enable :ref:`debugging-EMCC_DEBUG` to output files for each compilation phase, i

.. _optimizing-code-unsafe-optimisations:

Allocation
----------

The default ``malloc/free`` implementation used is ``dlmalloc``. You can also
pick ``emmalloc`` (``-sMALLOC=emmalloc``) which is smaller but slower, or
``mimalloc`` (``-sMALLOC=mimalloc``) which is larger but scales better in a
multithreaded application with contention on ``malloc/free`` (see
:ref:`Allocator_performance`).
Collaborator

I guess we should consider making it the default for threaded builds once we have enough trust in it?

Member Author

Given the code size and memory overhead, I'm not sure. Maybe. Perhaps if we can make it smaller and leaner (like fixing that emmalloc alignment issue #20645).
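
To make the size and speed tradeoff discussed above easier to measure, here is a minimal Python sketch that assembles one `emcc` command line per allocator so the resulting builds can be compared. The helper name `emcc_command`, the source file `bench.c`, and the `out_<allocator>.js` naming are hypothetical and not part of Emscripten; the sketch only relies on the existing `-sMALLOC=...`, `-sINITIAL_MEMORY=...`, `-pthread`, `-O2` and `-o` flags, and it deliberately covers just the three allocators named in the docs text.

```python
# Sketch only: builds (but does not run) emcc command lines for comparing
# allocators. `emcc_command` and the file names are illustrative assumptions.

ALLOCATORS = ("dlmalloc", "emmalloc", "mimalloc")


def emcc_command(source, allocator="dlmalloc", threads=False, initial_memory=None):
    """Return an argv list that builds `source` with the given allocator."""
    if allocator not in ALLOCATORS:
        raise ValueError(f"unknown allocator: {allocator}")
    cmd = ["emcc", source, "-O2", f"-sMALLOC={allocator}"]
    if threads:
        # mimalloc is aimed at multithreaded builds, which need -pthread.
        cmd.append("-pthread")
    if initial_memory is not None:
        # Value in bytes; mimalloc may need more initial memory than dlmalloc.
        cmd.append(f"-sINITIAL_MEMORY={initial_memory}")
    cmd += ["-o", f"out_{allocator}.js"]
    return cmd


if __name__ == "__main__":
    for name in ALLOCATORS:
        print(" ".join(emcc_command("bench.c", name, threads=True)))
```

Passing each list to `subprocess.run` and comparing the sizes of the emitted files would give a first impression of the code size cost mentioned in this thread.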


Unsafe optimizations
====================

18 changes: 18 additions & 0 deletions site/source/docs/porting/pthreads.rst
@@ -148,6 +148,24 @@ The Emscripten implementation for the pthreads API should follow the POSIX stand

Also note that when compiling code that uses pthreads, an additional JavaScript file ``NAME.worker.js`` is generated alongside the output .js file (where ``NAME`` is the basename of the main file being emitted). That file must be deployed with the rest of the generated code files. By default, ``NAME.worker.js`` will be loaded relative to the main HTML page URL. If it is desirable to load the file from a different location e.g. in a CDN environment, then one can define the ``Module.locateFile(filename)`` function in the main HTML ``Module`` object to return the URL of the target location of the ``NAME.worker.js`` entry point. If this function is not defined in ``Module``, then the default location relative to the main HTML file is used.

.. _Allocator_performance:

Allocator performance
=====================

The default system allocator in Emscripten, ``dlmalloc``, is very efficient in a
single-threaded program, but it has a single global lock, so contention on
``malloc`` between threads can add overhead. You can use
`mimalloc <https://github.com/microsoft/mimalloc>`_
instead by using ``-sMALLOC=mimalloc``, which is a more sophisticated allocator
tuned for multithreaded performance. ``mimalloc`` has separate allocation
contexts on each thread, allowing performance to scale a lot better under
``malloc/free`` contention.

Note that ``mimalloc`` is larger in code size than ``dlmalloc``, and also uses
more memory at runtime (so you may need to adjust ``INITIAL_MEMORY`` to a higher
value), so there are tradeoffs here.
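
The difference between a single global lock and per-thread allocation contexts can be pictured with a toy model. The Python sketch below is not how `dlmalloc` or `mimalloc` are implemented; the class names `GlobalLockAllocator` and `PerThreadAllocator` and the helper `run` are hypothetical, and the model ignores real concerns such as memory that is freed by a different thread than the one that allocated it. It only counts how often threads have to pass through a shared lock in each design.

```python
# Toy model, not mimalloc's or dlmalloc's real design: counts how often threads
# must take a shared lock under one global allocator versus per-thread contexts.
import threading


class GlobalLockAllocator:
    """Every allocation takes one shared lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.shared_lock_acquisitions = 0
        self._next = 0

    def alloc(self):
        with self._lock:
            self.shared_lock_acquisitions += 1
            self._next += 1
            return self._next


class PerThreadAllocator:
    """Each thread allocates from its own context and takes no shared lock."""

    def __init__(self):
        self._local = threading.local()
        self.shared_lock_acquisitions = 0  # never incremented in this model

    def alloc(self):
        count = getattr(self._local, "count", 0) + 1
        self._local.count = count
        return (threading.get_ident(), count)


def run(allocator, num_threads=4, allocs_per_thread=1000):
    """Allocate from several threads; return the shared-lock acquisition count."""

    def worker():
        for _ in range(allocs_per_thread):
            allocator.alloc()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return allocator.shared_lock_acquisitions
```

In this model every allocation in the global-lock design is a potential point of contention, while the per-thread design never touches shared state on the allocation path, which is the intuition behind the scaling claim above.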

Running code and tests
======================

3 changes: 3 additions & 0 deletions src/settings.js
@@ -108,6 +108,9 @@ var STACK_SIZE = 64*1024;
// * emmalloc-verbose - use emmalloc with assertions + verbose logging.
// * emmalloc-memvalidate-verbose - use emmalloc with assertions + heap
// consistency checking + verbose logging.
// * mimalloc - a powerful multithreaded allocator. This is recommended in
// large applications that have malloc() contention, but it is
// larger and uses more memory.
// * none - no malloc() implementation is provided, but you must implement
// malloc() and free() yourself.
// dlmalloc is necessary for split memory and other special modes, and will be
21 changes: 21 additions & 0 deletions system/lib/mimalloc/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2018-2021 Microsoft Corporation, Daan Leijen

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
7 changes: 7 additions & 0 deletions system/lib/mimalloc/README.emscripten
@@ -0,0 +1,7 @@

This contains mimalloc 4e50d6714d471b72b2285e25a3df6c92db944593 with
Emscripten backend additions.

Origin: https://github.com/microsoft/mimalloc

For the Emscripten port design see src/prim/emscripten/prim.c
66 changes: 66 additions & 0 deletions system/lib/mimalloc/include/mimalloc-new-delete.h
@@ -0,0 +1,66 @@
/* ----------------------------------------------------------------------------
Copyright (c) 2018-2020 Microsoft Research, Daan Leijen
This is free software; you can redistribute it and/or modify it under the
terms of the MIT license. A copy of the license can be found in the file
"LICENSE" at the root of this distribution.
-----------------------------------------------------------------------------*/
#pragma once
#ifndef MIMALLOC_NEW_DELETE_H
#define MIMALLOC_NEW_DELETE_H

// ----------------------------------------------------------------------------
// This header provides convenient overrides for the new and
// delete operations in C++.
//
// This header should be included in only one source file!
//
// On Windows, or when linking dynamically with mimalloc, these
// can be more performant than the standard new-delete operations.
// See <https://en.cppreference.com/w/cpp/memory/new/operator_new>
// ---------------------------------------------------------------------------
#if defined(__cplusplus)
#include <new>
#include <mimalloc.h>

#if defined(_MSC_VER) && defined(_Ret_notnull_) && defined(_Post_writable_byte_size_)
// stay consistent with VCRT definitions
#define mi_decl_new(n) mi_decl_nodiscard mi_decl_restrict _Ret_notnull_ _Post_writable_byte_size_(n)
#define mi_decl_new_nothrow(n) mi_decl_nodiscard mi_decl_restrict _Ret_maybenull_ _Success_(return != NULL) _Post_writable_byte_size_(n)
#else
#define mi_decl_new(n) mi_decl_nodiscard mi_decl_restrict
#define mi_decl_new_nothrow(n) mi_decl_nodiscard mi_decl_restrict
#endif

void operator delete(void* p) noexcept { mi_free(p); };
void operator delete[](void* p) noexcept { mi_free(p); };

void operator delete (void* p, const std::nothrow_t&) noexcept { mi_free(p); }
void operator delete[](void* p, const std::nothrow_t&) noexcept { mi_free(p); }

mi_decl_new(n) void* operator new(std::size_t n) noexcept(false) { return mi_new(n); }
mi_decl_new(n) void* operator new[](std::size_t n) noexcept(false) { return mi_new(n); }

mi_decl_new_nothrow(n) void* operator new (std::size_t n, const std::nothrow_t& tag) noexcept { (void)(tag); return mi_new_nothrow(n); }
mi_decl_new_nothrow(n) void* operator new[](std::size_t n, const std::nothrow_t& tag) noexcept { (void)(tag); return mi_new_nothrow(n); }

#if (__cplusplus >= 201402L || _MSC_VER >= 1916)
void operator delete (void* p, std::size_t n) noexcept { mi_free_size(p,n); };
void operator delete[](void* p, std::size_t n) noexcept { mi_free_size(p,n); };
#endif

#if (__cplusplus > 201402L || defined(__cpp_aligned_new))
void operator delete (void* p, std::align_val_t al) noexcept { mi_free_aligned(p, static_cast<size_t>(al)); }
void operator delete[](void* p, std::align_val_t al) noexcept { mi_free_aligned(p, static_cast<size_t>(al)); }
void operator delete (void* p, std::size_t n, std::align_val_t al) noexcept { mi_free_size_aligned(p, n, static_cast<size_t>(al)); };
void operator delete[](void* p, std::size_t n, std::align_val_t al) noexcept { mi_free_size_aligned(p, n, static_cast<size_t>(al)); };
void operator delete (void* p, std::align_val_t al, const std::nothrow_t&) noexcept { mi_free_aligned(p, static_cast<size_t>(al)); }
void operator delete[](void* p, std::align_val_t al, const std::nothrow_t&) noexcept { mi_free_aligned(p, static_cast<size_t>(al)); }

void* operator new (std::size_t n, std::align_val_t al) noexcept(false) { return mi_new_aligned(n, static_cast<size_t>(al)); }
void* operator new[](std::size_t n, std::align_val_t al) noexcept(false) { return mi_new_aligned(n, static_cast<size_t>(al)); }
void* operator new (std::size_t n, std::align_val_t al, const std::nothrow_t&) noexcept { return mi_new_aligned_nothrow(n, static_cast<size_t>(al)); }
void* operator new[](std::size_t n, std::align_val_t al, const std::nothrow_t&) noexcept { return mi_new_aligned_nothrow(n, static_cast<size_t>(al)); }
#endif
#endif

#endif // MIMALLOC_NEW_DELETE_H
67 changes: 67 additions & 0 deletions system/lib/mimalloc/include/mimalloc-override.h
@@ -0,0 +1,67 @@
/* ----------------------------------------------------------------------------
Copyright (c) 2018-2020 Microsoft Research, Daan Leijen
This is free software; you can redistribute it and/or modify it under the
terms of the MIT license. A copy of the license can be found in the file
"LICENSE" at the root of this distribution.
-----------------------------------------------------------------------------*/
#pragma once
#ifndef MIMALLOC_OVERRIDE_H
#define MIMALLOC_OVERRIDE_H

/* ----------------------------------------------------------------------------
This header can be used to statically redirect malloc/free and new/delete
to the mimalloc variants. This can be useful if one can include this file on
each source file in a project (but be careful when using external code to
not accidentally mix pointers from different allocators).
-----------------------------------------------------------------------------*/

#include <mimalloc.h>

// Standard C allocation
#define malloc(n) mi_malloc(n)
#define calloc(n,c) mi_calloc(n,c)
#define realloc(p,n) mi_realloc(p,n)
#define free(p) mi_free(p)

#define strdup(s) mi_strdup(s)
#define strndup(s,n) mi_strndup(s,n)
#define realpath(f,n) mi_realpath(f,n)

// Microsoft extensions
#define _expand(p,n) mi_expand(p,n)
#define _msize(p) mi_usable_size(p)
#define _recalloc(p,n,c) mi_recalloc(p,n,c)

#define _strdup(s) mi_strdup(s)
#define _strndup(s,n) mi_strndup(s,n)
#define _wcsdup(s) (wchar_t*)mi_wcsdup((const unsigned short*)(s))
#define _mbsdup(s) mi_mbsdup(s)
#define _dupenv_s(b,n,v) mi_dupenv_s(b,n,v)
#define _wdupenv_s(b,n,v) mi_wdupenv_s((unsigned short*)(b),n,(const unsigned short*)(v))

// Various Posix and Unix variants
#define reallocf(p,n) mi_reallocf(p,n)
#define malloc_size(p) mi_usable_size(p)
#define malloc_usable_size(p) mi_usable_size(p)
#define cfree(p) mi_free(p)

#define valloc(n) mi_valloc(n)
#define pvalloc(n) mi_pvalloc(n)
#define reallocarray(p,s,n) mi_reallocarray(p,s,n)
#define reallocarr(p,s,n) mi_reallocarr(p,s,n)
#define memalign(a,n) mi_memalign(a,n)
#define aligned_alloc(a,n) mi_aligned_alloc(a,n)
#define posix_memalign(p,a,n) mi_posix_memalign(p,a,n)
#define _posix_memalign(p,a,n) mi_posix_memalign(p,a,n)

// Microsoft aligned variants
#define _aligned_malloc(n,a) mi_malloc_aligned(n,a)
#define _aligned_realloc(p,n,a) mi_realloc_aligned(p,n,a)
#define _aligned_recalloc(p,s,n,a) mi_aligned_recalloc(p,s,n,a)
#define _aligned_msize(p,a,o) mi_usable_size(p)
#define _aligned_free(p) mi_free(p)
#define _aligned_offset_malloc(n,a,o) mi_malloc_aligned_at(n,a,o)
#define _aligned_offset_realloc(p,n,a,o) mi_realloc_aligned_at(p,n,a,o)
#define _aligned_offset_recalloc(p,s,n,a,o) mi_recalloc_aligned_at(p,s,n,a,o)

#endif // MIMALLOC_OVERRIDE_H