Small multilingual LLMs for annotating and curating LLM training data.
AI & ML interests
Open, Multilingual, European, Generative, Foundational LLM
Organization Card
The OpenEuroLLM project brings together Europe's leading AI companies and research institutions in an unprecedented collaboration, combining their forces and expertise to develop next-generation open-source language models and advance European AI capabilities.
models 13
openeurollm/OLMo-3-7B-Think-SFT
openeurollm/OLMo-3-7B-Instruct-SFT
openeurollm/tokenizer-256k
Updated • 1
openeurollm/tokenizer-128k
openeurollm/datamix-2b-80-20
Updated • 97
openeurollm/datamix-2b-50-50
Updated • 4
openeurollm/datamix-2b-60-40
Updated • 37
openeurollm/datamix-2b-70-30
Updated • 30
openeurollm/datamix-2b-90-10
Updated • 32
openeurollm/datamix-2b-en-80pct-DPO-HelpSteer3-16k
Text Generation • 2B • Updated • 1
datasets 18
openeurollm/lmsys-chat-1m-decontaminated
openeurollm/orca-agentinstruct-1M-v1-decontaminated
openeurollm/open-perfectblend-decontaminated
openeurollm/smoltalk2-decontaminated
openeurollm/Nemotron-Post-Training-Dataset-v2-decontaminated
openeurollm/Dolci-Think-SFT-32B-decontaminated
Viewer • Updated • 2.25M • 38
openeurollm/Dolci-Think-SFT-7B-decontaminated
Viewer • Updated • 2.27M • 92
openeurollm/Dolci-Instruct-SFT-decontaminated
Viewer • Updated • 2.15M • 33
openeurollm/propella-annotations
Viewer • Updated • 3.17B • 2.7k • 14
openeurollm/dolci-think-sft-tokenized
Updated • 14