DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints Paper • 2601.18137 • Published Jan 26 • 28
ToolRM Collection ToolRM: Towards Agentic Tool-Use Reward Modeling • 4 items • Updated about 4 hours ago • 4
CoEvol: Constructing Better Responses for Instruction Finetuning through Multi-Agent Cooperation Paper • 2406.07054 • Published Jun 11, 2024
One Model to Critique Them All: Rewarding Agentic Tool-Use via Efficient Reasoning Paper • 2510.26167 • Published Oct 30, 2025 • 3