Doug (dougeeai)
AI & ML interests
CUDA, Sovereign AI, OSS GenAI, LLM fine-tuning, VRAM optimization, RAG, SLM Agents
Recent Activity
posted an update about 8 hours ago
## Llama-cpp-python wheels for Windows - update
Pre-compiled wheels for `llama-cpp-python` on Windows. No Visual Studio, no CUDA Toolkit setup. `pip install` and run.
### New in this update
- **sm_120 (consumer/workstation Blackwell) support.** A single wheel now covers both sm_100 (datacenter) and sm_120 (RTX 5090 / 5080 / 5070 / 5060 / 5050, RTX PRO 6000 / 5000 / 4500 / 4000 / 2000 Blackwell).
- **llama-cpp-python 0.3.20** across all four architectures (Blackwell, Ada, Ampere, Turing). Brings Gemma 4 support via the updated llama.cpp core.
- **One wheel covers Python 3.10 through 3.13.** The 0.3.20 builds use `py3-none` tagging, so per-interpreter builds are no longer needed.
- **Fixed three mislabeled 0.3.16 sm_86 wheels** that were linked against the wrong CUDA cuBLAS version. Properly built replacements are available.
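The sm_XX labels above map GPU generations to CUDA compute-capability buckets. A rough sketch of that mapping, assembled from the post plus NVIDIA's published compute capabilities (the GPU lists here are illustrative, not the wheels' exact coverage):

```python
# Map a GPU name to the CUDA architecture (sm_XX) bucket a wheel targets.
# Illustrative table only -- check the release notes for actual coverage.
ARCH_BUCKETS = {
    "sm_75": ["RTX 2060", "RTX 2070", "RTX 2080"],                # Turing
    "sm_86": ["RTX 3060", "RTX 3070", "RTX 3080", "RTX 3090"],    # Ampere
    "sm_89": ["RTX 4060", "RTX 4070", "RTX 4080", "RTX 4090"],    # Ada
    "sm_120": ["RTX 5050", "RTX 5060", "RTX 5070",
               "RTX 5080", "RTX 5090"],                           # Blackwell (consumer)
}

def arch_for(gpu_name: str) -> str:
    """Return the sm_XX bucket for a known GPU name, or raise KeyError."""
    for arch, gpus in ARCH_BUCKETS.items():
        if gpu_name in gpus:
            return arch
    raise KeyError(f"unknown GPU: {gpu_name}")

print(arch_for("RTX 5090"))  # sm_120
```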
### Coverage
- **GPUs:** RTX 20 / 30 / 40 / 50 series, RTX PRO Blackwell workstation, B100 / B200 / B300 datacenter
- **CUDA:** 11.8 / 12.1 / 13.0
- **Python:** 3.10, 3.11, 3.12, 3.13
### Download
https://github.com/dougeeai/llama-cpp-python-wheels
Linux wheels still on the roadmap. File an issue if you need a specific configuration built.
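If you script downloads against the coverage matrix, selecting a wheel comes down to matching a CUDA version and architecture bucket. A minimal sketch; the filename pattern below is an assumption for illustration, not the actual release naming — check the GitHub release assets:

```python
# Sketch: build a plausible wheel filename from the coverage matrix.
# The naming pattern is hypothetical -- verify against the real release
# assets before automating downloads.
SUPPORTED_CUDA = {"11.8", "12.1", "13.0"}

def wheel_name(version: str, cuda: str, arch: str) -> str:
    """Return a candidate wheel filename for a given CUDA version and sm_XX bucket."""
    if cuda not in SUPPORTED_CUDA:
        raise ValueError(f"unsupported CUDA version: {cuda}")
    cu_tag = "cu" + cuda.replace(".", "")
    # 0.3.20 wheels use py3-none tagging, so no per-interpreter suffix.
    return f"llama_cpp_python-{version}+{cu_tag}.{arch}-py3-none-win_amd64.whl"

print(wheel_name("0.3.20", "12.1", "sm_120"))
```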
Tags: #llama-cpp #gguf #windows #prebuilt #blackwell #rtx5090 #rtxpro6000 #rtxproblackwell #gemma4