Instructions for using CodeDoes/GLM-5-abliterated with libraries and local apps.

Libraries

How to use CodeDoes/GLM-5-abliterated with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="CodeDoes/GLM-5-abliterated")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the reply is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("CodeDoes/GLM-5-abliterated")
model = AutoModelForCausalLM.from_pretrained("CodeDoes/GLM-5-abliterated")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use CodeDoes/GLM-5-abliterated with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "CodeDoes/GLM-5-abliterated"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CodeDoes/GLM-5-abliterated",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
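
The same request can also be sent from Python. The following is a minimal sketch using only the standard library. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM, and the default `base_url` assumes the server started above is listening on `localhost:8000`.

```python
import json
import urllib.request

MODEL_ID = "CodeDoes/GLM-5-abliterated"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the prompt and return the reply text (requires a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Example call, left commented out because it needs a live server:
# print(chat("What is the capital of France?"))
```

Passing a different `base_url` points the same helper at any other OpenAI-compatible endpoint.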

Use Docker

docker model run hf.co/CodeDoes/GLM-5-abliterated

SGLang

How to use CodeDoes/GLM-5-abliterated with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "CodeDoes/GLM-5-abliterated" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CodeDoes/GLM-5-abliterated",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "CodeDoes/GLM-5-abliterated" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CodeDoes/GLM-5-abliterated",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use CodeDoes/GLM-5-abliterated with Docker Model Runner:
```
docker model run hf.co/CodeDoes/GLM-5-abliterated
```

GLM-5 Abliterated (BF16)

Note: I wouldn't recommend using this; please let me know if you do. This is an abliterated version of zai-org/GLM-5 (744B MoE, 40B active parameters).

What is abliteration?

Abliteration removes the "refusal direction" from the model weights using weight orthogonalization: each modified weight matrix is projected so that its output has no component along that direction. This lets the model respond to a wider range of prompts without safety refusals, while aiming to preserve general capability.
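
To make the projection concrete, here is a small NumPy sketch of weight orthogonalization on toy data. It is not the script used for this model. It assumes the PyTorch `(out_features, in_features)` weight layout, a direction that lives in the layer's output space, and an `alpha` that scales how much of the projection is subtracted. The function name `orthogonalize` is illustrative.

```python
import numpy as np


def orthogonalize(weight, direction, alpha=1.0):
    """Return `weight` with the component along `direction` removed from its outputs.

    weight:    (d_out, d_in) matrix that writes into the residual stream
    direction: (d_out,) vector in that same output space
    alpha:     1.0 subtracts the full projection, 0.0 leaves the weight unchanged
    """
    r = direction / np.linalg.norm(direction)  # normalize to a unit vector
    # W' = W - alpha * r (r^T W): subtract each column's projection onto r
    return weight - alpha * np.outer(r, r @ weight)


rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))  # toy stand-in for a projection weight
r = rng.standard_normal(8)       # toy stand-in for a refusal direction
W_abl = orthogonalize(W, r, alpha=1.0)

# With alpha = 1.0, no input can make the layer write along r any more
residual = (r / np.linalg.norm(r)) @ W_abl
print(np.allclose(residual, 0.0))
```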

Method

Computed refusal directions for all 78 layers using contrastive activation pairs (harmful vs harmless prompts)
Applied weight orthogonalization to layers 15-54:
- self_attn.o_proj.weight (attention output projection)
- mlp.shared_experts.down_proj.weight (shared expert down projection)
Alpha = 1.0; 80 weight matrices modified in total (2 per layer × 40 layers)
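
The steps above can be sketched as follows. The card does not say how the direction was derived from the contrastive pairs, so the difference of mean activations used here is an assumption (it is a common choice). The `model.layers.{i}.` key prefix is likewise an assumed naming convention, and the activations are random toy data.

```python
import numpy as np


def refusal_direction(harmful_acts, harmless_acts):
    """Unit difference-of-means direction between two activation sets.

    Both inputs have shape (n_prompts, d_model), collected at one layer.
    """
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def target_weight_names(first_layer=15, last_layer=54):
    """List the two weights named in the method for an inclusive layer range."""
    suffixes = ("self_attn.o_proj.weight", "mlp.shared_experts.down_proj.weight")
    return [
        f"model.layers.{i}.{suffix}"  # prefix is an assumption, see lead-in
        for i in range(first_layer, last_layer + 1)
        for suffix in suffixes
    ]


rng = np.random.default_rng(0)
harmless = rng.standard_normal((32, 16))
harmful = rng.standard_normal((32, 16))
harmful[:, 0] += 5.0  # toy data: the two sets differ mainly along axis 0

direction = refusal_direction(harmful, harmless)
names = target_weight_names()
print(len(names))  # 40 layers x 2 matrices
```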

Details

Base model: zai-org/GLM-5 (744B MoE, BF16)
Modified layers: 15-54 (40 of 78 total layers)
Weights modified: 80 (o_proj + shared_experts.down_proj per layer)
Precision: BF16 (same as the base model; no quantization applied)
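
For a rough sense of what unquantized BF16 means in practice, the arithmetic below estimates the memory needed for the weights alone. It uses the 744B parameter count stated on this card and 2 bytes per BF16 parameter, and it ignores the KV cache, activations, and the tensors stored in F32. The Safetensors metadata lists 754B params, which would give a slightly larger figure.

```python
def weight_footprint_bytes(n_params, bytes_per_param=2):
    """Bytes needed to hold the weights alone (BF16 is 2 bytes per parameter)."""
    return n_params * bytes_per_param


total = weight_footprint_bytes(744 * 10**9)
print(f"{total / 1e12:.2f} TB ({total / 2**40:.2f} TiB)")
```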

Disclaimer

This model is provided for research purposes. Users are responsible for ensuring appropriate use.

Downloads last month: 637

Safetensors

Model size: 754B params
Tensor type: BF16, F32

Model tree for CodeDoes/GLM-5-abliterated

Base model: zai-org/GLM-5
Finetuned (37): this model