merge-crew (Merge Crew)

posted an update 1 day ago

Post

232

Big update to llm-datasets, my curated list of datasets and tools for post-training LLMs.

> Added many new datasets
> New "thinking" column
> Refreshed recommended tools

Thanks to everyone who told me they used it for their research at ICLR, you motivated this update!

KennethEnevoldsen

authored a paper 2 months ago

MAEB: Massive Audio Embedding Benchmark

Paper • 2602.16008 • Published Feb 17 • 23

mlabonne

authored 2 papers 3 months ago

LFM2 Technical Report

Paper • 2511.23404 • Published Nov 28, 2025 • 56

Zero-Overhead Introspection for Adaptive Test-Time Compute

Paper • 2512.01457 • Published Dec 1, 2025 • 2

timpal0l

authored 5 papers 4 months ago

The Nordic Pile: A 1.2TB Nordic Dataset for Language Modeling

Paper • 2303.17183 • Published Mar 30, 2023 • 1

GPT-SW3: An Autoregressive Language Model for the Nordic Languages

Paper • 2305.12987 • Published May 22, 2023

Why Not Simply Translate? A First Swedish Evaluation Benchmark for Semantic Similarity

Paper • 2009.03116 • Published Sep 7, 2020

Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead?

Paper • 2104.10441 • Published Apr 21, 2021

SWEb: A Large Web Dataset for the Scandinavian Languages

Paper • 2410.04456 • Published Oct 6, 2024 • 1

mlabonne

posted an update 4 months ago

Post

10315

New family of 1B models just dropped!

> LiquidAI/LFM2.5-1.2B-Base: 10T → 28T tokens
> LiquidAI/LFM2.5-1.2B-Instruct: new large-scale multi-stage RL
> LiquidAI/LFM2.5-1.2B-JP: our most polite model
> LiquidAI/LFM2.5-VL-1.6B: multi-image multilingual
> LiquidAI/LFM2.5-Audio-1.5B: 8x faster, no quality loss

Super proud of this release 🤗

KennethEnevoldsen

authored a paper 7 months ago

HUME: Measuring the Human-Model Performance Gap in Text Embedding Tasks

Paper • 2510.10062 • Published Oct 11, 2025 • 10

mlabonne

posted an update 7 months ago

Post

8429

LiquidAI/LFM2-8B-A1B just dropped!

8.3B params with only 1.5B active/token 🚀

> Quality ≈ 3–4B dense, yet faster than Qwen3-1.7B
> MoE designed to run on phones/laptops (llama.cpp / vLLM)
> Pre-trained on 12T tokens → strong math/code/IF

mlabonne

posted an update 7 months ago

Post

3883

⚛️ New drop of tiny task-specific models!

Want to do data extraction, translation, RAG, tool use, or math on a Raspberry Pi? We've got you covered! ✅

These tiny models were fine-tuned to perform narrow tasks extremely well, making them competitive with much larger models.

You can deploy them today on-device or even on GPUs for big data operations!

LiquidAI/liquid-nanos-68b98d898414dd94d4d5f99a

mlabonne

posted an update 9 months ago

Post

6983

Liquid just released two VLMs with 450M and 1.6B params!

They're super fast and leverage SigLIP2 NaFlex encoders to handle native resolutions without distortion. They're ideal for on-device deployment in constrained environments like phones.

They're available today on Hugging Face, along with Colab notebooks for inference and fine-tuning.

LiquidAI/LFM2-VL-450M
LiquidAI/LFM2-VL-1.6B

KennethEnevoldsen

authored a paper 9 months ago

Dynaword: From One-shot to Continuously Developed Datasets

Paper • 2508.02271 • Published Aug 4, 2025 • 15

mlabonne

posted an update 10 months ago

Post

5762

LiquidAI open-sources a new generation of edge LLMs! 🥳

Based on a new hybrid architecture, these 350M, 700M, and 1.2B models are both fast and performant, ideal for on-device deployment.

I recommend fine-tuning them to power your next edge application. We already provide Colab notebooks to guide you. More to come soon!

📝 Blog post: https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models
🤗 Models: LiquidAI/lfm2-686d721927015b2ad73eaa38