Md Raihanul Haque Rahi (ryuma007)
0 followers · 10 following
raihanulhaque
AI & ML interests: Computer Vision, LLM, Agent
Recent Activity
reacted to SeaWolf-AI's post with 🔥 · 2 days ago
Darwin-TTS: 3% of an LLM's Brain Makes TTS Speak with Emotion — Zero Training

We blended 3% of Qwen3-1.7B's (LLM) FFN weights into Qwen3-TTS-1.7B's talker module. The result: emotionally enhanced speech synthesis, with zero training, zero data, and zero GPU hours.

Try the demo: https://huggingface.co/spaces/FINAL-Bench/Darwin-TTS-1.7B-Cross
Model weights: https://huggingface.co/FINAL-Bench/Darwin-TTS-1.7B-Cross
Full research article: https://huggingface.co/blog/FINAL-Bench/darwin-tts

Qwen3-1.7B (LLM) and Qwen3-TTS-1.7B's talker share a 100% identical architecture: the same hidden_size (2048), the same number of layers (28), and the same number of heads (16). This enabled pure 1:1 weight blending across 84 FFN tensors with a single lerp operation. At a 3% blend, emotion appears. At 5%, emotion intensifies. At 10%, the model breaks, producing 655-second outputs for a 3-second sentence, because the LLM's "keep generating" pattern overwhelms the TTS stop signal.

To our knowledge, this is the first training-free cross-modal weight transfer between an LLM and a TTS model. Prior work requires either adapter training (SmolTolk, 2025), fine-tuning (CSLM, 2025), or massive end-to-end compute (GPT-4o). Darwin-TTS achieves cross-modal capability transfer in under 2 minutes on CPU.

The key insight: TTS models with LLM backbones already "think" in language. We are just restoring 3% of the original LLM's language-understanding patterns, particularly those related to emotional semantics and prosody planning. The code is three lines: load the model, load the LLM FFN, and call p.lerp_(llm_weight, 0.03).

We are the creators of the Darwin Evolutionary Merge Framework. Darwin LLM V7 achieved 86.9% on GPQA Diamond (HF Benchmark #3) through CMA-ES-optimized FFN crossbreeding. Darwin-TTS extends this principle from LLM-to-LLM merging to cross-modal LLM-to-TTS transfer. Apache 2.0.
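The single-lerp blend the post describes can be sketched as below. This is a hedged illustration, not the authors' released code: the actual checkpoint loading for Qwen3-1.7B and the Qwen3-TTS-1.7B talker is assumed and replaced here with toy stand-in tensors, and the tensor name used is hypothetical.

```python
import torch

ALPHA = 0.03  # 3% blend ratio, per the post

def blend_ffn(tts_ffn: dict, llm_ffn: dict, alpha: float = ALPHA) -> None:
    """In-place linear interpolation of name-matched FFN tensors:
    p = (1 - alpha) * p + alpha * llm_w."""
    for name, p in tts_ffn.items():
        llm_w = llm_ffn[name]
        # Identical architecture is a hard requirement for 1:1 blending
        assert p.shape == llm_w.shape, f"shape mismatch for {name}"
        p.lerp_(llm_w, alpha)

# Toy stand-ins for the 84 FFN tensors (the real models use hidden_size 2048);
# the tensor name is a hypothetical example of a matched FFN key.
tts = {"layers.0.mlp.gate_proj.weight": torch.zeros(4, 4)}
llm = {"layers.0.mlp.gate_proj.weight": torch.ones(4, 4)}
blend_ffn(tts, llm)
print(tts["layers.0.mlp.gate_proj.weight"][0, 0].item())  # → 0.03
```

With real checkpoints, the same loop would run over the talker's FFN tensors after loading both state dicts; everything else in the models stays untouched, which is why the transfer needs no training.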
updated a model · about 1 month ago
ryuma007/qwen3.5-4B-manim-finetune
published a model · about 1 month ago
ryuma007/qwen3.5-4B-manim-finetune
Organizations
ryuma007's activity
liked a dataset · about 1 month ago
BibbyResearch/3blue1brown-manim (Viewer · Updated Sep 26, 2025 · 2.41k · 80 · 23)
liked a model · 3 months ago
IbrahimSalah/Arabic-F5-TTS-v2 (Text-to-Speech · Updated Nov 13, 2025 · 31)
liked a dataset · about 1 year ago
canopylabs/zac-sample-dataset (Viewer · Updated Mar 8, 2025 · 20 · 27 · 28)
liked a model · over 1 year ago
meta-llama/Llama-3.2-1B (Text Generation · 1B · Updated Oct 24, 2024 · 1.33M · 2.37k)