Models: 41,330

Active filters: 4-bit

MaziyarPanahi/Saul-Instruct-v1-GGUF

Text Generation • 7B • Updated Mar 10, 2024 • 217 • 9

unsloth/llama-3-8b-bnb-4bit

Text Generation • 8B • Updated Jan 7, 2025 • 62.5k • 203

zementalist/llama-3-8B-chat-psychotherapist

Text Generation • 8B • Updated Apr 29, 2024 • 20 • 30

unsloth/mistral-7b-v0.3-bnb-4bit

Text Generation • 7B • Updated Nov 22, 2024 • 347k • 22

unsloth/Meta-Llama-3.1-8B-bnb-4bit

Text Generation • 8B • Updated Feb 15, 2025 • 54.1k • 109

unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit

Text Generation • 8B • Updated Feb 15, 2025 • 227k • 92

shuyuej/Llama-Guard-3-8B-GPTQ

Text Generation • 8B • Updated Jul 25, 2024 • 8 • 1

MaziyarPanahi/Qwen2.5-1.5B-Instruct-GGUF

Text Generation • 2B • Updated Sep 18, 2024 • 124k • 8

Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4

Text Generation • 8B • Updated Nov 18, 2024 • 943k • 11

Qwen/Qwen2.5-Coder-7B-Instruct-AWQ

Text Generation • 8B • Updated Nov 18, 2024 • 327k • 19

unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit

Text Generation • 4B • Updated Nov 12, 2024 • 41.6k • 10

unsloth/Llama-3.2-1B-Instruct-bnb-4bit

Text Generation • 1B • Updated Jan 23, 2025 • 19.8k • 22

NetoAISolutions/TSLAM-4B

4B • Updated Dec 4, 2025 • 38 • 20

Ayush12a/llama3.1_finetuned_on_indian_legal_dataset

Text Generation • 8B • Updated Oct 21, 2024 • 4

Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4

Text Generation • 33B • Updated Nov 18, 2024 • 6.15k • 24

Qwen/Qwen2.5-Coder-14B-Instruct-AWQ

Text Generation • 15B • Updated Jan 12, 2025 • 37.1k • 13

unsloth/llava-1.5-7b-hf-bnb-4bit

Image-Text-to-Text • 4B • Updated Feb 13, 2025 • 175k • 7

ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4

Text Generation • 71B • Updated Dec 7, 2024 • 132k • 29

mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit

Text Generation • 5B • Updated Feb 26, 2025 • 1.41k • 45

casperhansen/mistral-small-24b-instruct-2501-awq

24B • Updated Jan 30, 2025 • 2.98k • 9

FINGU-AI/Phi-4-RRStock

Text Generation • 12B • Updated Feb 5, 2025 • 6 • 2

Qwen/Qwen2.5-VL-72B-Instruct-AWQ

Image-Text-to-Text • 74B • Updated Mar 7, 2025 • 141k • 72

MaziyarPanahi/gemma-3-4b-it-GGUF

Text Generation • 4B • Updated Mar 12, 2025 • 175k • 17

unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit

Text-to-Speech • 3B • Updated Mar 24, 2025 • 42.8k • 16

ethicalabs/TowerInstruct-7B-v0.2-mlx-4Bit

Translation • 1B • Updated Oct 25, 2025 • 16 • 2

unsloth/Qwen3-1.7B-unsloth-bnb-4bit

Text Generation • 2B • Updated May 13, 2025 • 31k • 12

unsloth/Qwen3-30B-A3B-bnb-4bit

31B • Updated May 13, 2025 • 608 • 20

MaziyarPanahi/Qwen3-14B-GGUF

Text Generation • 15B • Updated Apr 28, 2025 • 178k • 5

Qwen/Qwen3-8B-AWQ

Text Generation • 8B • Updated May 21, 2025 • 125k • 33

Qwen/Qwen3-30B-A3B-GPTQ-Int4

Text Generation • 31B • Updated May 21, 2025 • 302k • 44