This repo has:
- compiled model libraries for webgpu
- instructions on how to build all the necessary dependencies from source
Note
The instructions are ultra-specific to my setup — MacBook Air M2. I have no idea if they'll work for you.
I installed conda using Miniconda.
The first time, or whenever you start a new session, make sure to run:
# activate conda for session
source ~/miniconda3/bin/activate
conda init
# if environment is already created
conda activate mlc-dev
# or create a new one, and then activate
conda create -n mlc-dev
Install the necessary packages for tvm and mlc-llm:
conda install -c conda-forge \
"llvmdev>=15" \
"cmake>=3.24" \
compilers \
git \
rust \
numpy \
psutil \
python=3.13
Some initial environment variables to use:
export CC=$CONDA_PREFIX/bin/clang
export CXX=$CONDA_PREFIX/bin/clang++
export LDFLAGS="-L$CONDA_PREFIX/lib"
export CPPFLAGS="-I$CONDA_PREFIX/include"
More will be added later.
I'm using direnv and a .env file to manage these.
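If you want the same setup, here is a minimal .envrc sketch. It assumes the .env file holds the plain `export` lines from above (direnv also ships a `dotenv` helper if you prefer that form); remember to run `direnv allow` after editing it.
# .envrc: load the repo-local .env whenever you cd into the repo
source .env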
Following: https://tvm.apache.org/docs/install/from_source.html
# clone from GitHub
git clone --recursive https://github.com/apache/tvm.git && cd tvm
# create the build directory
rm -rf build && mkdir build && cd build
# specify build requirements in `config.cmake`
cp ../cmake/config.cmake .
# update config
echo -e "set(CMAKE_BUILD_TYPE RelWithDebInfo)\n\
set(USE_LLVM \"llvm-config --ignore-libllvm --link-static\")\n\
set(HIDE_PRIVATE_SYMBOLS ON)" >> config.cmake
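# optional sanity check (my assumption): USE_LLVM shells out to llvm-config,
# so it should resolve to the conda-forge one installed above
which llvm-config && llvm-config --version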
# Configure cmake flags
cmake \
-DCMAKE_PREFIX_PATH=$CONDA_PREFIX \
-DCMAKE_FIND_ROOT_PATH=$CONDA_PREFIX \
..
# Build
cmake --build . --parallel $(sysctl -n hw.ncpu)
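# optional sanity check; on macOS I'd expect the shared libraries to land in this
# build directory (the exact library names are my assumption, not from the TVM docs)
ls libtvm.dylib libtvm_runtime.dylib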
# install the tvm-ffi package.
cd ../3rdparty/tvm-ffi
pip install -e .
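# optional: verify the tvm_ffi package imports from this checkout
python -c "import tvm_ffi; print(tvm_ffi.__file__)"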
# install tvm in root environment
cd ../../..
# Add to the session or .env file (update accordingly)
export TVM_LIBRARY_PATH=./tvm/build
export TVM_HOME=./tvm
export PYTHONPATH=$TVM_HOME/python:$PYTHONPATH
pip install -e ./tvm/python
# validate it works
python -c "import tvm; print(tvm.__file__)"Following: https://llm.mlc.ai/docs/install/mlc_llm.html#option-2-build-from-source
Following: https://llm.mlc.ai/docs/install/mlc_llm.html#option-2-build-from-source
From the root:
git clone --recursive https://github.com/mlc-ai/mlc-llm.git && cd mlc-llm/
# create build directory
mkdir -p build && cd build
# generate build configuration
python ../cmake/gen_cmake_config.py
# build mlc_llm libraries
cmake \
-DCMAKE_PREFIX_PATH=$CONDA_PREFIX \
-DCMAKE_FIND_ROOT_PATH=$CONDA_PREFIX \
-DCMAKE_POLICY_VERSION_MINIMUM=3.5 \
..
make -j $(sysctl -n hw.ncpu) && cd ..
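# optional sanity check; I'd expect shared libraries in ./build after a successful build
# (the exact names are my assumption: libmlc_llm.dylib and libmlc_llm_module.dylib)
ls build/libmlc_llm*.dylib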
# install as a pip project, but with M2 modifications:
# back at the root of the repo, edit ./mlc-llm/python/requirements.txt
# and comment out flashinfer-python
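# a one-liner sketch for that edit, from the repo root
# (BSD/macOS sed; assumes the requirement line starts with "flashinfer")
sed -i '' 's/^flashinfer/# flashinfer/' ./mlc-llm/python/requirements.txt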
# from the root of the repo
pip install -e ./mlc-llm/python
# add to the .env
export KMP_DUPLICATE_LIB_OK=TRUE
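# for reference, the .env at this point contains roughly the following
# (paths are relative to the repo root; adjust for your layout)
export CC=$CONDA_PREFIX/bin/clang
export CXX=$CONDA_PREFIX/bin/clang++
export LDFLAGS="-L$CONDA_PREFIX/lib"
export CPPFLAGS="-I$CONDA_PREFIX/include"
export TVM_LIBRARY_PATH=./tvm/build
export TVM_HOME=./tvm
export PYTHONPATH=$TVM_HOME/python:$PYTHONPATH
export KMP_DUPLICATE_LIB_OK=TRUE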
# verify
mlc_llm --help
# if this error occurs:
./tvm/3rdparty/tvm-ffi/python/tvm_ffi/_optional_torch_c_dlpack.py:559: UserWarning: Failed to load torch c dlpack extension: Ninja is required to load C++ extensions (pip install ninja to get it),EnvTensorAllocator will not be enabled.
warnings.warn(
# then
pip install ninja
Follow the official docs: https://emscripten.org/docs/getting_started/downloads.html
But make sure to use 3.1.56.
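For the first-time setup, the steps boil down to roughly this (a sketch based on the emsdk docs; the link above is authoritative):
git clone https://github.com/emscripten-core/emsdk.git && cd emsdk
./emsdk install 3.1.56
./emsdk activate 3.1.56
source ./emsdk_env.sh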
If returning to a new session:
cd emsdk
./emsdk activate 3.1.56
source ./emsdk_env.sh
# validate
emcc --version
Next, especially the first time, run the emcc dependency prep script:
cd mlc-llm
./web/prep_emcc_deps.sh
Follow the guide here: https://llm.mlc.ai/docs/deploy/webllm.html#bring-your-own-model-library
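For reference, the end product of that guide is a WebGPU wasm model library produced by mlc_llm compile. A rough sketch of what that step looks like (the model name, quantization, and output paths are placeholders, not from this repo; check the guide for the exact flags):
mlc_llm compile ./dist/MODEL-q4f16_1-MLC/mlc-chat-config.json \
  --device webgpu \
  -o ./dist/libs/MODEL-q4f16_1-webgpu.wasm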