---
language:
- en
license: apache-2.0
tags:
- text-generation
- instruction-tuning
- multi-task
- reasoning
- email
- summarization
- chat
- peft
- lora
- qwen
- deepseek
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
datasets:
- HuggingFaceTB/smoltalk
- snoop2head/enron_aeslc_emails
- lucadiliello/STORIES
- abisee/cnn_dailymail
- wiki40b
model_type: causal-lm
inference: true
library_name: peft
pipeline_tag: text-generation
---

# 🧠 Deepseek-R1-multitask-lora

**Author:** Gilbert Akham
**License:** Apache-2.0
**Base model:** [`deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
**Adapter type:** LoRA (PEFT)
**Capabilities:** Multi-task generalization & reasoning
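
The adapter loads on top of the base model with 🤗 Transformers and PEFT. Below is a minimal loading sketch; the adapter repo id is a placeholder for this repository's Hub path, and the 4-bit loading mirrors the setup described under Training Details:

```python
# Minimal sketch: load the 4-bit base model and attach this LoRA adapter.
# NOTE: "GilbertAkham/Deepseek-R1-multitask-lora" is a placeholder repo id;
# replace it with the adapter's actual Hugging Face Hub path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
adapter_id = "GilbertAkham/Deepseek-R1-multitask-lora"  # placeholder

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # 4-bit quantized weights
    bnb_4bit_compute_dtype=torch.float16,    # FP16 compute, as in training
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
```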

---

## 🚀 What It Can Do

This multitask fine-tuned model handles a broad set of natural-language and reasoning tasks, such as:

- ✉️ **Email & message writing**: generate clear, friendly, or professional communications.
- 📖 **Story & creative writing**: craft imaginative narratives, poems, and dialogues.
- 💬 **Conversational chat**: maintain coherent, context-aware conversations.
- 💡 **Explanations & tutoring**: explain technical or abstract topics simply.
- 🧩 **Reasoning & logic tasks**: provide step-by-step answers to analytical questions.
- 💻 **Code generation & explanation**: write and explain Python or general programming code.
- 🌍 **Translation & summarization**: translate between multiple languages or condense information.

The model’s multi-domain training (on datasets such as SmolTalk, Everyday Conversations, and reasoning-rich samples) makes it suitable for assistants, chatbots, content generators, and educational tools.

---

## 🧩 Training Details

| Parameter | Value |
|-----------|-------|
| Base model | `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` |
| Adapter | LoRA (r=8, alpha=32, dropout=0.1) |
| Max sequence length | 1024 tokens |
| Learning rate | 3e-5 (cosine decay) |
| Optimizer | `adamw_8bit` |
| Gradient accumulation | 4 steps |
| Precision | 4-bit quantized weights, FP16 compute |
| Steps | 12,000 total (best checkpoint at ~8,200) |
| Training time | ~2.5 hours on an A4000 GPU |
| Frameworks | 🤗 Transformers, PEFT, TRL, bitsandbytes |
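
For reference, here is a sketch of how these hyperparameters map onto PEFT and 🤗 Transformers configuration objects. Values not stated in this card (`target_modules`, batch size) are illustrative assumptions, not the exact training script:

```python
# Sketch of the training configuration implied by the table above.
# ASSUMPTIONS (not stated in this card): target_modules and the
# per-device batch size are illustrative defaults.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(      # 4-bit weights, FP16 compute
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

lora_config = LoraConfig(             # LoRA (r=8, alpha=32, dropout=0.1)
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)

training_args = TrainingArguments(
    output_dir="deepseek-r1-multitask-lora",
    max_steps=12_000,                 # best checkpoint at ~8,200
    learning_rate=3e-5,
    lr_scheduler_type="cosine",
    optim="adamw_8bit",
    gradient_accumulation_steps=4,
    per_device_train_batch_size=2,    # assumption
    fp16=True,
)
# The 1024-token max sequence length is applied when tokenizing
# (or via the SFT trainer's sequence-length setting in TRL).
```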

---

## 🧠 Reasoning Capability

Thanks to the integration of **SmolTalk** and diverse multi-task prompts, the model learns:
- **Chain-of-thought-style reasoning**
- **Conversational grounding**
- **Multi-step logical inference**
- **Instruction following** across domains

Example:
```text
### Task: Explain reasoning

### Input:
If a train leaves City A at 3 PM and arrives at City B at 6 PM, covering 180 km, what is its average speed?

### Output:
The train travels 180 km in 3 hours.
Average speed = 180 ÷ 3 = 60 km/h.
```
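
At inference time, prompts can be assembled in the same `### Task / ### Input / ### Output` format. A minimal generation sketch follows; the adapter repo id is again a placeholder, and the sampling parameters are illustrative defaults rather than tuned values:

```python
# Generation sketch using the training prompt template shown above.
# NOTE: the adapter repo id is a placeholder; sampling parameters are
# illustrative defaults, not values from the training run.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
adapter_id = "GilbertAkham/Deepseek-R1-multitask-lora"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

prompt = (
    "### Task: Explain reasoning\n\n"
    "### Input:\n"
    "If a train leaves City A at 3 PM and arrives at City B at 6 PM, "
    "covering 180 km, what is its average speed?\n\n"
    "### Output:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs, max_new_tokens=256, temperature=0.7, do_sample=True
    )

# Print only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```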