LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms Paper • 2311.13133 • Published Nov 22, 2023
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining Paper • 2312.17482 • Published Dec 29, 2023
Does your data spark joy? Performance gains from domain upsampling at the end of training Paper • 2406.03476 • Published Jun 5, 2024
Evaluating Very Long-Term Conversational Memory of LLM Agents Paper • 2402.17753 • Published Feb 27, 2024