Just for fun, let's run the Alibaba MNN benchmark on a DGX Spark!
llmlaba
From time to time, I look for something new or unusual in the AI world, and recently I stumbled upon MNN — a direct competitor to llama.cpp.
I found this project intriguing and set a small goal for myself: to run it on my DGX Spark. I was glad to see that MNN is open-source under the Apache 2.0 license, meaning I was free to fork and modify it.
However, MNN had a few issues out of the box:
- No support for CUDA 13.0
- No support for the Blackwell architecture sm_12
- No built-in support for CUDA benchmarking
I tackled these issues one by one and successfully compiled MNN on the DGX Spark. The benchmark results are still quite low, but at least it works! The patch file is here: https://github.com/alibaba/MNN/issues/4289#issuecomment-4093931887
Here is the step-by-step guide on how I built MNN:
mkdir mnn && cd mnn
# Get the code
git clone https://github.com/alibaba/MNN.git
cd MNN
# Reset repo to a specific commit
git reset --hard b1d06d68b3366183d157f0703d7b8a8b61ae55b3
# Apply patch for CUDA 13.0
git apply ../my_changes.patch
mkdir build && cd build
# Configure the project
cmake .. \
-DMNN_CUDA=ON \
-DMNN_BUILD_LLM=ON \
-DMNN_SUPPORT_TRANSFORMER_FUSE=ON \
-DCMAKE_BUILD_TYPE=Release
# Build libraries and executable binaries
cmake --build . --config Release -j$(nproc)
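Before moving on, it is worth confirming that the build actually produced the benchmark binary. A minimal sketch (the relative path assumes you are still inside the build/ directory created above):

```shell
# Optional sanity check: confirm the benchmark binary was produced.
# Assumes the current directory is MNN/build from the steps above.
if [ -x ./llm_bench ]; then
    echo "llm_bench: build OK"
else
    echo "llm_bench: not found - check the cmake output for errors"
fi
```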
How to run the test:
- Download the MNN model: taobao-mnn/Qwen3-30B-A3B-MNN
- Run the benchmark:
./MNN/build/llm_bench -m /path/to/qwen/config.json -a cuda -c 2 -p 512 -n 128 -kv true -rep 3

It works!
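For the download step above, one option is the Hugging Face CLI. This is a sketch, not necessarily how the author fetched the model; it assumes the huggingface_hub CLI is installed, and the local directory name is illustrative:

```shell
# Fetch the converted model from Hugging Face.
# Assumes: pip install -U "huggingface_hub[cli]"
MODEL_REPO="taobao-mnn/Qwen3-30B-A3B-MNN"
LOCAL_DIR="./${MODEL_REPO##*/}"    # strips the org prefix -> ./Qwen3-30B-A3B-MNN
huggingface-cli download "$MODEL_REPO" --local-dir "$LOCAL_DIR"
# llm_bench's -m flag should then point at the config.json inside $LOCAL_DIR
```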
Hopefully, the MNN developers will add official CUDA 13 support.