Sebastian Gabarain

Locutusque

AI & ML interests

Pushing performance in small language models

Recent Activity

liked a dataset about 22 hours ago
Locutusque/esmeralda-agentic
posted an update about 22 hours ago
🚀 Introducing Esmeralda-Llama-3.1-8B-control

The first release in the Esmeralda model family by Locutusque.

This model is intentionally small and experimental, a control/baseline proof-of-concept designed to answer one question: "How strong is my new Locutusque/esmeralda-agentic dataset before scaling to larger runs?"

Training Details
- Base: Llama 3.1 8B
- Training precision: bf16 mixed precision
- Chat template: modified ChatML
- Dataset size: ~37k examples
- Examples actually used for this run: ~5k

The dataset includes:
- multi-turn agentic traces
- reasoning traces
- structured assistant behavior
- generalist instruction data

Benchmark Results

Compared against:
- Llama 3.1 8B Instruct
- Hermes-3-Llama-3.1-8B

HumanEval
- Esmeralda: 57.3
- Llama 3.1 Instruct: 56.1
- Hermes-3: 52.4

MBPP
- Esmeralda: 53.2
- Llama 3.1 Instruct: 56.8
- Hermes-3: 48.2

GPQA Diamond
- Esmeralda: 15.7
- Llama 3.1 Instruct: 15.7
- Hermes-3: 18.2

EQ-Bench
- Esmeralda: 59.2
- Llama 3.1 Instruct: 61.1
- Hermes-3: 63.1

EQ-Bench Parseable (Syntax Stability) 🔥
- Esmeralda: 100.0%
- Llama 3.1 Instruct: 92.4%
- Hermes-3: 91.2%

Here Be Dragons 🐉

I also experimented with a new TruthfulQA free-generation evaluation setup.
- Responses were judged by Gemma 4 26B A4B
- The judge compared generations directly against ground-truth answers
- Models were evaluated in 8-bit quantized form to speed up inference

TruthfulQA (LLM Judge)
- Esmeralda-Llama-3.1-8B-control: 0.682
- Hermes-3-Llama-3.1-8B: 0.587 (reported MC2 score; methodology differs)

For a lightweight control run trained on only a fraction of the dataset, I’m pretty encouraged by the results.

The model is released under the standard Llama 3.1 license, and I’d genuinely love feedback from people testing it in real workflows.

Model: https://huggingface.co/Locutusque/Esmeralda-Llama-3.1-8B-control
Dataset: https://huggingface.co/datasets/Locutusque/esmeralda-agentic
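
For readers who want the reported benchmark numbers side by side, here is a minimal Python sketch that computes Esmeralda's margin over each baseline. The scores are copied from the post. The `SCORES` layout and the `deltas` helper are illustrative names, not part of any released tooling. The TruthfulQA figures are left out because the post notes that the methodologies differ.

```python
# Scores as reported in the post; the dict layout and helper name are hypothetical.
SCORES = {
    "HumanEval": {"Esmeralda": 57.3, "Llama 3.1 Instruct": 56.1, "Hermes-3": 52.4},
    "MBPP": {"Esmeralda": 53.2, "Llama 3.1 Instruct": 56.8, "Hermes-3": 48.2},
    "GPQA Diamond": {"Esmeralda": 15.7, "Llama 3.1 Instruct": 15.7, "Hermes-3": 18.2},
    "EQ-Bench": {"Esmeralda": 59.2, "Llama 3.1 Instruct": 61.1, "Hermes-3": 63.1},
    "EQ-Bench Parseable (%)": {"Esmeralda": 100.0, "Llama 3.1 Instruct": 92.4, "Hermes-3": 91.2},
}


def deltas(scores, model="Esmeralda"):
    """Return {benchmark: {baseline: model_score - baseline_score}}, rounded to 1 decimal."""
    out = {}
    for bench, row in scores.items():
        out[bench] = {
            name: round(row[model] - value, 1)
            for name, value in row.items()
            if name != model
        }
    return out


if __name__ == "__main__":
    # Print one line per benchmark with a signed margin against each baseline.
    for bench, row in deltas(SCORES).items():
        parts = ", ".join(f"{v:+.1f} vs {k}" for k, v in row.items())
        print(f"{bench}: {parts}")
```

Positive values mean Esmeralda scored higher than that baseline on the benchmark, and negative values mean it scored lower.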
Organizations

- BigScience Biomedical Datasets
- ZeroGPU Explorers
- Aurora-M
- The Hydra Project
- fne
- Social Post Explorers
- M4-ai
- Quasar Research
- Hugging Face Discord Community
- Data Tonic (Alignment Lab)
- Data Is Better Together Contributor
- Dtnm