Tiny Series
Tiny datasets that empower the foundation of Small Language Models!
nampdn-ai/tiny-textbooks • Updated Jul 3, 2024 • 420k • 904 • 178
nampdn-ai/tiny-codes • Updated Sep 30, 2023 • 1.63M • 1.72k • 289
nampdn-ai/tiny-strange-textbooks • Updated Feb 2, 2024 • 1M • 57 • 92
nampdn-ai/tiny-math-textbooks • Updated Jan 27, 2024 • 635k • 126 • 32
Mini Pretrain Datasets
nampdn-ai/mini-fineweb • Updated Mar 4, 2025 • 291M • 216 • 25
nampdn-ai/mini-peS2o • Updated Feb 6, 2024 • 1.91M • 50 • 10
nampdn-ai/mini-pubmed • Updated Sep 8, 2023 • 17k • 22 • 5
nampdn-ai/mini-proofpile • Updated Sep 5, 2023 • 221k • 26 • 7