Unable to use with Ollama

#4
by rdodev - opened

I'm unable to run this quant with the Ollama instructions given:

❯ ollama run hf.co/unsloth/Qwen3.5-9B-GGUF:Q8_0
Error: 500 Internal Server Error: unable to load model: /Users/.../.ollama/models/blobs/sha256-809626574d0cb43d4becfa56169980da2bb448f2299270f7be443cb89d0a6ae4

I don't recommend Ollama.
Just use llama.cpp or LM Studio.
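For reference, a minimal llama.cpp invocation for the same quant might look like this. This is a sketch, assuming a recent llama.cpp build that supports the `-hf` shorthand for pulling GGUFs from the Hub; it is guarded so it's a no-op on machines where llama.cpp isn't installed.

```shell
# Sketch: run the same Hugging Face quant directly with llama.cpp.
# -hf downloads the GGUF from the Hub (recent llama.cpp builds only).
RUN_CMD='llama-cli -hf unsloth/Qwen3.5-9B-GGUF:Q8_0 -p "Hello"'
if command -v llama-cli >/dev/null 2>&1; then
  # llama.cpp is installed: actually run the command
  eval "$RUN_CMD"
else
  # no llama.cpp on this machine: just show what would run
  echo "llama.cpp not installed; would run: $RUN_CMD"
fi
```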

It seems the Ollama side is working on it:

https://unsloth.ai/docs/models/qwen3.5

Currently no Qwen3.5 GGUF works in Ollama. Use llama.cpp-compatible backends instead.

https://github.com/ollama/ollama/issues/14575

Models downloaded from HuggingFace are split, text and vision GGUFs in separate files. Split models must run on the llama.cpp engine. The llama.cpp engine in ollama does not support qwen35/qwen35moe architecture yet, #14134 will merge the required support. A text-only version of the model, by removing the vision GGUF, will work.
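Based on that comment, a possible workaround is to download only the text GGUF (skipping the vision/mmproj file) and register it with a minimal Modelfile. A sketch, assuming the text shard is named `Qwen3.5-9B-Q8_0.gguf` (that filename is a guess; check the repo's file list), and guarded so the network and Ollama steps are skipped where those tools aren't installed:

```shell
# Sketch of the text-only workaround from the linked issue comment.
# 1) Fetch only the text GGUF, excluding the vision (mmproj) file.
#    The filename patterns are assumptions -- verify against the repo.
if command -v huggingface-cli >/dev/null 2>&1; then
  huggingface-cli download unsloth/Qwen3.5-9B-GGUF \
    --include '*Q8_0*.gguf' --exclude '*mmproj*' --local-dir ./qwen3.5
fi

# 2) Point a minimal Modelfile at the text-only GGUF.
cat > Modelfile <<'EOF'
FROM ./qwen3.5/Qwen3.5-9B-Q8_0.gguf
EOF

# 3) Register the text-only model with Ollama (skipped if not installed),
#    then run it interactively with: ollama run qwen3.5-9b-text
if command -v ollama >/dev/null 2>&1; then
  ollama create qwen3.5-9b-text -f Modelfile
fi
```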

Too many tools default to Ollama and can't be pointed elsewhere; Raycast, for example.

https://github.com/ollama/ollama/issues/14575#issuecomment-3989918451


I get the same error:

Error: 500 Internal Server Error: unable to load model: /Users/.../.ollama/models/blobs/sha256-809626574d0cb43d4becfa56169980da2bb448f2299270f7be443cb89d0a6ae4
