Md Raihanul Haque Rahi (ryuma007)
0 followers · 10 following
raihanulhaque
AI & ML interests: Computer Vision, LLM, Agent
Recent Activity
reacted to SeaWolf-AI's post with 🔥 · 2 days ago
Darwin-TTS: 3% of an LLM's Brain Makes TTS Speak with Emotion — Zero Training

We blended 3% of Qwen3-1.7B's (LLM) FFN weights into Qwen3-TTS-1.7B's talker module. The result: emotionally enhanced speech synthesis, with zero training, zero data, and zero GPU hours.

Try the demo: https://huggingface.co/spaces/FINAL-Bench/Darwin-TTS-1.7B-Cross
Model weights: https://huggingface.co/FINAL-Bench/Darwin-TTS-1.7B-Cross
Full research article: https://huggingface.co/blog/FINAL-Bench/darwin-tts

Qwen3-1.7B (LLM) and Qwen3-TTS-1.7B's talker share a 100% identical architecture: the same hidden_size (2048), the same number of layers (28), and the same number of heads (16). This enabled pure 1:1 weight blending across 84 FFN tensors with a single lerp operation. At a 3% blend, emotion appears. At 5%, emotion intensifies. At 10%, the model breaks, producing 655-second outputs for a 3-second sentence, because the LLM's "keep generating" pattern overwhelms the TTS stop signal.

To our knowledge, this is the first training-free cross-modal weight transfer between an LLM and a TTS model. Prior work requires either adapter training (SmolTolk, 2025), fine-tuning (CSLM, 2025), or massive end-to-end compute (GPT-4o). Darwin-TTS achieves cross-modal capability transfer in under 2 minutes on CPU.

The key insight: TTS models with LLM backbones already "think" in language. We are just restoring 3% of the original LLM's language-understanding patterns, particularly those related to emotional semantics and prosody planning. The code is three lines: load the model, load the LLM FFN, and call p.lerp_(llm_weight, 0.03).

We are the creators of the Darwin Evolutionary Merge Framework. Darwin LLM V7 achieved 86.9% on GPQA Diamond (HF Benchmark #3) through CMA-ES-optimized FFN crossbreeding. Darwin-TTS extends this principle from LLM-to-LLM merging to cross-modal LLM-to-TTS transfer. Apache 2.0.
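The single-lerp blend the post describes can be sketched as below. This is a hedged illustration, not the authors' released code: the actual checkpoint loading for Qwen3-1.7B and the Qwen3-TTS-1.7B talker is assumed and replaced here with toy stand-in tensors, and the tensor name used is hypothetical.

```python
import torch

ALPHA = 0.03  # 3% blend ratio, per the post

def blend_ffn(tts_ffn: dict, llm_ffn: dict, alpha: float = ALPHA) -> None:
    """In-place linear interpolation of name-matched FFN tensors:
    p = (1 - alpha) * p + alpha * llm_w."""
    for name, p in tts_ffn.items():
        llm_w = llm_ffn[name]
        # Identical architecture is a hard requirement for 1:1 blending
        assert p.shape == llm_w.shape, f"shape mismatch for {name}"
        p.lerp_(llm_w, alpha)

# Toy stand-ins for the 84 FFN tensors (the real models use hidden_size 2048);
# the tensor name is a hypothetical example of a matched FFN key.
tts = {"layers.0.mlp.gate_proj.weight": torch.zeros(4, 4)}
llm = {"layers.0.mlp.gate_proj.weight": torch.ones(4, 4)}
blend_ffn(tts, llm)
print(tts["layers.0.mlp.gate_proj.weight"][0, 0].item())  # → 0.03
```

With real checkpoints, the same loop would run over the talker's FFN tensors after loading both state dicts; everything else in the models stays untouched, which is why the transfer needs no training.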
updated a model · about 1 month ago
ryuma007/qwen3.5-4B-manim-finetune
published a model · about 1 month ago
ryuma007/qwen3.5-4B-manim-finetune
Organizations
ryuma007's activity
liked a dataset · about 1 month ago
BibbyResearch/3blue1brown-manim (Viewer · Updated Sep 26, 2025 · 2.41k · 80 · 23)
liked a model · 3 months ago
IbrahimSalah/Arabic-F5-TTS-v2 (Text-to-Speech · Updated Nov 13, 2025 · 31)
liked a dataset · about 1 year ago
canopylabs/zac-sample-dataset (Viewer · Updated Mar 8, 2025 · 20 · 27 · 28)
liked a model · over 1 year ago
meta-llama/Llama-3.2-1B (Text Generation · 1B · Updated Oct 24, 2024 · 1.33M · 2.37k)