Large Language Models
Beyond Reward Engineering: A Data Recipe for Long-Context Reinforcement Learning
Rethinking the Role of Efficient Attention in Hybrid Architectures