π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows • Paper • 2605.14678 • Published 20 days ago
PEFT Method Comparison • Explore PEFT methods with interactive Pareto visualizations