Anvaya-Rabbit 2.7B
India's first sovereign SSM-based language model.
Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India.
⚠️ Checkpoint Deprecation Notice
| Checkpoint | Status | Notes |
|---|---|---|
| Anvaya-Rabbit-2.7B-0.55-base.pt | ✅ CURRENT | Wikipedia warmup complete, CE 0.993x |
| Any prior checkpoint | ⚠️ DEPRECATED | Do not use for inference |
Prior checkpoints are retained for research transparency.
The current checkpoint reflects iterative refinement of the
ANVAYA RtaSSM architecture and training pipeline.
Always use the latest -base.pt for any downstream work.
What's in this repo
| Tier | File | Use this when… |
|---|---|---|
| Base | base/Anvaya-Rabbit-2.7B-0.55-base.pt | You want raw pretrained weights for your own fine-tuning |
Instruct and Imprint tiers are in preparation (epoch 2 → SFT → imprint pipeline).
Quickstart
pip install rtaforge transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.add_special_tokens({"additional_special_tokens": ["<|im_start|>", "<|im_end|>"]})
model = AutoModelForCausalLM.from_pretrained(
"RtaForge/Anvaya-Rabbit-2.7B",
trust_remote_code=True,
torch_dtype="bfloat16",
device_map="auto",
)
prompt = "Rabbit is a helpful and honest assistant.\n\nUser: Who are you?\nRabbit:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=60, repetition_penalty=1.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
The rtaforge runtime package provides the compiled architecture. Source is not distributed.
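Because v0.55 is a base checkpoint, the Quickstart prompt is a plain-text transcript rather than a chat template. The helpers below are a minimal sketch of how that prompt could be built and how the reply could be separated from the echoed prompt. The names `build_prompt` and `extract_reply` are hypothetical and not part of the rtaforge package, and cutting the reply at the next `User:` turn is an assumption about how a base model may continue the transcript.

```python
# Hypothetical helpers for the transcript-style prompt used in the Quickstart.
SYSTEM_LINE = "Rabbit is a helpful and honest assistant."


def build_prompt(user_message, system_line=SYSTEM_LINE):
    """Return the transcript prompt in the same layout as the Quickstart."""
    return f"{system_line}\n\nUser: {user_message}\nRabbit:"


def extract_reply(decoded, prompt):
    """Strip the echoed prompt and stop at the next simulated user turn."""
    reply = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    # Assumption: a base model may keep writing further "User:" turns.
    return reply.split("\nUser:", 1)[0].strip()


prompt = build_prompt("Who are you?")
print(repr(prompt))
```

In the Quickstart, `prompt` would be passed to the tokenizer, and `extract_reply(tokenizer.decode(outputs[0], skip_special_tokens=True), prompt)` would return only the generated turn.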
Why SSM?
Transformer self-attention scales quadratically with context length because every token attends to every other token. SSMs replace attention with a fixed-size recurrent state, so per-token inference cost stays constant regardless of context length. No key-value cache grows with the context, which lowers the VRAM footprint and raises long-document throughput at the same parameter count.
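To make the fixed-state argument concrete, here is a minimal sketch of a generic diagonal linear recurrence. It is not the RtaSSM layer, whose source is not distributed. The function names, the toy parameters, and the `state_dim` of 16 are assumptions for illustration only. The second half compares how many values a hypothetical attention decoder of the same depth and width would cache against a fixed recurrent state.

```python
def ssm_step(state, x, decay, b_in):
    """One recurrent update per channel: h_t = a * h_{t-1} + b * x_t."""
    return [a * h + b * x for a, h, b in zip(decay, state, b_in)]


def run_ssm(inputs, decay, b_in, c_out):
    """Scan a scalar input sequence; the state size never changes."""
    state = [0.0] * len(decay)
    outputs = []
    for x in inputs:
        state = ssm_step(state, x, decay, b_in)
        # Readout: y_t = sum_i c_i * h_t[i]
        outputs.append(sum(c * h for c, h in zip(c_out, state)))
    return outputs, state


def kv_cache_entries(context_len, n_layers, hidden):
    """Keys plus values cached by a standard attention decoder."""
    return 2 * context_len * n_layers * hidden


def ssm_state_entries(n_layers, hidden, state_dim):
    """Recurrent state size; note there is no context_len argument."""
    return n_layers * hidden * state_dim


outputs, final_state = run_ssm([1.0, 0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [1.0, 1.0])
print(outputs, len(final_state))

# 64 layers and width 2560 come from the model card; state_dim=16 is assumed.
for context_len in (1_000, 100_000):
    print(context_len,
          kv_cache_entries(context_len, 64, 2560),
          ssm_state_entries(64, 2560, 16))
```

The cache count grows linearly with `context_len`, while the recurrent state count is the same at every context length.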
Architecture
Rabbit is built on RtaSSM v7.2.2-FU "Fortress Unbroken", a custom state-space model developed at RtaForge:
- No attention mechanism — purely recurrent SSM layers with learned state dynamics
- 64 layers, 2560 hidden dimensions, 2.7B parameters, bfloat16
- Constitutional training — Gurukul curriculum with wiki pretraining → instruct SFT → persona imprint
- Vocabulary 50,280 tokens (GPT-NeoX tokenizer)
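As a rough sizing aid, the figures above translate into weight memory as shown in this sketch. It is back-of-envelope arithmetic only. It assumes the rounded 2.7B parameter count and a single embedding matrix of vocabulary × hidden size. Whether the output head shares those weights is not stated here.

```python
PARAMS = 2.7e9        # rounded parameter count from the model card
BYTES_PER_PARAM = 2   # bfloat16 stores each value in 16 bits
VOCAB = 50_280        # vocabulary size from the model card
HIDDEN = 2560         # hidden size from the model card

weight_bytes = PARAMS * BYTES_PER_PARAM
weight_gb = weight_bytes / 1e9        # decimal gigabytes
weight_gib = weight_bytes / 2**30     # binary gibibytes

# Assumption: one embedding matrix of shape (VOCAB, HIDDEN).
embedding_params = VOCAB * HIDDEN
embedding_share = embedding_params / PARAMS

print(f"weights: {weight_gb:.1f} GB ({weight_gib:.2f} GiB)")
print(f"embedding: {embedding_params:,} params ({embedding_share:.1%} of total)")
```

The weights alone come to about 5.4 GB in bfloat16, before activations and recurrent state are counted.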
Training
| Stage | Data | Notes |
|---|---|---|
| Wiki warmup (v0.55) | Wikipedia (en) | 700 constitutional proposals via Gurukul — complete |
| Epoch 2 (planned) | RedPajama | Gate-only, ~3,350 proposals |
| Instruct SFT (planned) | ChatML instruction pairs | gate_only trainable strategy |
| Persona imprint (planned) | Rabbit constitutional corpus | Identity and value alignment |
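The planned instruct stage lists ChatML instruction pairs, and the Quickstart already registers `<|im_start|>` and `<|im_end|>` as special tokens. The sketch below shows the commonly used ChatML layout for one instruction pair. The exact template RtaForge will train on is not documented here, so treat the function name `to_chatml` and the role names as assumptions.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def to_chatml(messages, add_generation_prompt=False):
    """Render a list of {'role', 'content'} dicts in the usual ChatML layout."""
    parts = [
        f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn for the model to complete at inference time.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


pair = [
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "I am Rabbit."},
]
print(to_chatml(pair))
```

For SFT data the full pair is rendered. For inference only the user turn would be rendered, with `add_generation_prompt=True`.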
Evaluation Access
Weights are publicly available. Runtime package is live:
pip install rtaforge
To evaluate Rabbit or discuss deployment: 📧 guha@rtaforge.in 🌐 rtaforge.in
Runtime documentation coming soon.
Maturity and Roadmap
v0.55 is a base pretrained checkpoint — Wikipedia warmup complete, CE ratio 0.993×.
Usable conversational behaviour is targeted at v0.8–v0.9, currently in training.
- Evaluating for deployment? Wait for v0.9.
- Evaluating the architecture or training methodology? v0.55-base is exactly what you need.
Limitations
v0.55 has not been evaluated on standard benchmarks. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.
Citation
@misc{anvaya-rabbit-2026,
title = {Anvaya-Rabbit: A Sovereign SSM Language Model},
author = {RtaForge},
year = {2026},
url = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
}
Anvaya (अन्वय) — logical connection, coherence. Rabbit — the fast runner.