Anvaya-Rabbit 2.7B

India's first sovereign SSM-based language model.

Non-transformer architecture. No attention mechanism. Constitutional training via Gurukul. 7 patents filed at IP India.

⚠️ Checkpoint Deprecation Notice

Checkpoint	Status	Notes
`Anvaya-Rabbit-2.7B-0.55-base.pt`	✅ CURRENT	Wikipedia warmup complete, CE ratio 0.993×
Any prior checkpoint	⚠️ DEPRECATED	Do not use for inference

Prior checkpoints are retained for research transparency.
The current checkpoint reflects iterative refinement of the
ANVAYA RtaSSM architecture and training pipeline.

Always use the latest `-base.pt` checkpoint for any downstream work.
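
As a minimal sketch of how "latest" could be resolved from a file listing: the helper below assumes every base checkpoint follows the pattern of the one published name (`Anvaya-Rabbit-2.7B-<version>-base.pt`) and that versions order as decimals, so 0.55 comes before 0.8 as the roadmap suggests. The function name and the older file name in the example are hypothetical.

```python
import re
from decimal import Decimal

# Assumed naming pattern, inferred from the single published file name:
#   Anvaya-Rabbit-2.7B-<version>-base.pt
_BASE_CKPT = re.compile(r"-(\d+\.\d+)-base\.pt$")


def latest_base_checkpoint(filenames):
    """Return the -base.pt file with the highest version, or None if none match."""
    versioned = []
    for name in filenames:
        match = _BASE_CKPT.search(name)
        if match:
            # Decimal ordering keeps 0.55 < 0.8 (a tuple compare would not).
            versioned.append((Decimal(match.group(1)), name))
    return max(versioned)[1] if versioned else None


files = [
    "Anvaya-Rabbit-2.7B-0.55-base.pt",
    "Anvaya-Rabbit-2.7B-0.4-base.pt",  # hypothetical older checkpoint
    "notes.txt",                       # ignored: not a base checkpoint
]
print(latest_base_checkpoint(files))  # prints Anvaya-Rabbit-2.7B-0.55-base.pt
```

If the project adopts a different versioning scheme later, the regular expression and the ordering rule would need to change with it.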

What's in this repo

Tier	File	Use this when…
Base	`base/Anvaya-Rabbit-2.7B-0.55-base.pt`	You want raw pretrained weights for your own fine-tuning

Instruct and Imprint tiers are in preparation (epoch 2 → SFT → imprint pipeline).

Quickstart

pip install rtaforge transformers

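from transformers import AutoModelForCausalLM, AutoTokenizer

# Rabbit uses the GPT-NeoX tokenizer; register the two ChatML delimiter tokens.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.add_special_tokens({"additional_special_tokens": ["<|im_start|>", "<|im_end|>"]})

# trust_remote_code=True allows transformers to load the custom (non-built-in) model class.
model = AutoModelForCausalLM.from_pretrained(
    "RtaForge/Anvaya-Rabbit-2.7B",
    trust_remote_code=True,
    torch_dtype="bfloat16",
    device_map="auto",
)

# Plain-text prompt that ends with an open "Rabbit:" turn for the model to complete.
prompt = "Rabbit is a helpful and honest assistant.\n\nUser: Who are you?\nRabbit:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# repetition_penalty > 1 discourages the model from repeating tokens it has already produced.
outputs = model.generate(**inputs, max_new_tokens=60, repetition_penalty=1.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))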

The rtaforge runtime package provides the compiled architecture. Source is not distributed.
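
The quickstart prompt is a plain-text dialogue that ends with an open `Rabbit:` turn. For multi-turn experiments, a small helper along these lines may be convenient. `build_prompt` is a hypothetical name, and the layout simply mirrors the quickstart string; it is not a documented chat template.

```python
def build_prompt(system, turns, assistant="Rabbit"):
    """Format a plain-text dialogue prompt ending with an open assistant turn.

    `turns` is a list of (speaker, text) pairs. The layout copies the
    quickstart prompt and is an assumption, not an official template.
    """
    lines = [system, ""]  # system line, then a blank line
    for speaker, text in turns:
        lines.append(f"{speaker}: {text}")
    lines.append(f"{assistant}:")  # left open for the model to complete
    return "\n".join(lines)


prompt = build_prompt(
    "Rabbit is a helpful and honest assistant.",
    [("User", "Who are you?")],
)
print(prompt)
```

With a single user turn this reproduces the quickstart prompt exactly, so it can be passed to the tokenizer in the same way.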

Why SSM?

Transformer self-attention scales quadratically with context length because every token attends to every other token. SSMs replace attention with a fixed-size recurrent state, so inference cost per token stays constant regardless of context length and the state memory does not grow with the sequence, unlike a transformer's key-value cache. At the same parameter count, this reduces VRAM use and raises throughput on long documents.
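
To make the fixed-size state concrete, here is a minimal sketch of a generic diagonal linear SSM recurrence in plain Python. It is a textbook-style illustration only, not the RtaSSM architecture (whose source is not distributed); the function names and the parameters `a`, `b`, `c` are assumptions chosen for the example.

```python
def ssm_step(state, x, a, b, c):
    """One recurrent update of a diagonal linear SSM with scalar input x.

    h_t = a * h_{t-1} + b * x_t   (elementwise over the state)
    y_t = sum(c * h_t)
    """
    new_state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]
    y = sum(c_i * h_i for c_i, h_i in zip(c, new_state))
    return new_state, y


def run(inputs, a, b, c):
    """Process a sequence token by token, carrying only the fixed-size state."""
    state = [0.0] * len(a)
    outputs = []
    for x in inputs:
        state, y = ssm_step(state, x, a, b, c)
        outputs.append(y)
    return outputs, state


a, b, c = [0.5, 0.0], [1.0, 1.0], [1.0, 1.0]
outputs, state = run([1.0, 1.0], a, b, c)
print(outputs)     # prints [2.0, 2.5]
print(len(state))  # prints 2, however long the input sequence is
```

Each step touches only the state of length N, so work and memory per token do not depend on how many tokens came before. An attention layer would instead keep one cache entry per previous token.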

Architecture

Rabbit is built on RtaSSM v7.2.2-FU "Fortress Unbroken", a custom state-space model developed at RtaForge:

No attention mechanism — purely recurrent SSM layers with learned state dynamics
64 layers, 2560 hidden dimensions, 2.7B parameters, bfloat16
Constitutional training — Gurukul curriculum with wiki pretraining → instruct SFT → persona imprint
Vocabulary 50,280 tokens (GPT-NeoX tokenizer)
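
As a rough worked example of what these figures imply, the arithmetic below estimates the weight memory in bfloat16 and the size of the token embedding matrix. It assumes the nominal 2.7B parameter count and a standard vocabulary-by-hidden embedding matrix; neither the exact count nor the embedding layout is stated above.

```python
PARAMS = 2.7e9        # nominal parameter count from the list above
BYTES_PER_PARAM = 2   # bfloat16 stores each value in 16 bits
VOCAB = 50_280
HIDDEN = 2_560

weight_bytes = PARAMS * BYTES_PER_PARAM
# Assumption: one embedding matrix of shape (VOCAB, HIDDEN).
embedding_params = VOCAB * HIDDEN

print(f"weights: {weight_bytes / 1e9:.1f} GB ({weight_bytes / 2**30:.2f} GiB)")
# prints: weights: 5.4 GB (5.03 GiB)
print(f"embedding matrix: {embedding_params:,} parameters")
# prints: embedding matrix: 128,716,800 parameters
```

These numbers cover weights only. Activations and the recurrent state add to the footprint at inference time.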

Training

Stage	Data	Notes
Wiki warmup (v0.55)	Wikipedia (en)	700 constitutional proposals via Gurukul — complete
Epoch 2 (planned)	RedPajama	Gate-only, ~3,350 proposals
Instruct SFT (planned)	ChatML instruction pairs	`gate_only` trainable strategy
Persona imprint (planned)	Rabbit constitutional corpus	Identity and value alignment

Evaluation Access

Weights are publicly available, and the runtime package is live:

pip install rtaforge

To evaluate Rabbit or discuss deployment: 📧 guha@rtaforge.in 🌐 rtaforge.in

Runtime documentation coming soon.

Maturity and Roadmap

v0.55 is a base pretrained checkpoint — Wikipedia warmup complete, CE ratio 0.993×.
Usable conversational behaviour is targeted at v0.8–v0.9, currently in training.

Evaluating for deployment? Wait for v0.9.
Evaluating the architecture or training methodology? v0.55-base is exactly what you need.

Limitations

v0.55 has not been evaluated on standard benchmarks. She is small, she is new, and she is learning. Feedback welcome at guha@rtaforge.in.

Citation

@misc{anvaya-rabbit-2026,
  title  = {Anvaya-Rabbit: A Sovereign SSM Language Model},
  author = {RtaForge},
  year   = {2026},
  url    = {https://huggingface.co/RtaForge/Anvaya-Rabbit-2.7B}
}

Anvaya (अन्वय) — logical connection, coherence. Rabbit — the fast runner.
