- Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories
  Paper • 2606.02060 • Published • 50
- MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?
  Paper • 2606.01993 • Published • 13
- NJU-LINK/DR3-Eval
  Viewer • Updated • 100 • 2.09k • 2
- TVIR: Building Deep Research Agents Towards Text--Visual Interleaved Report Generation
  Paper • 2606.02320 • Published • 13
datasets 15
NJU-LINK/OmniCap-IF
Viewer • Updated • 480 • 29
NJU-LINK/OmniCap-IF-54K
Viewer • Updated • 53.9k • 66
NJU-LINK/AVSCapBench
Viewer • Updated • 1.23k • 807
NJU-LINK/TELBench
Updated • 157 • 1
NJU-LINK/TVIR-Bench
Viewer • Updated • 100 • 74
NJU-LINK/CoVEBench
Viewer • Updated • 626 • 463 • 1
NJU-LINK/WebCompass
Viewer • Updated • 933 • 11.9k • 6
NJU-LINK/ViDiC-1K
Updated • 322 • 5
NJU-LINK/DR3-Eval
Viewer • Updated • 100 • 2.09k • 2
NJU-LINK/CodeTraceBench
Viewer • Updated • 4.32k • 3.12k • 3