Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
Bitnet.cpp enhances edge inference for ternary LLMs using a novel mixed-precision matrix multiplication library, achieving significant speed improvements over baselines.
- 10 authors
