
hangyu guo

Rosiness

AI & ML interests

Natural Language Processing

Recent Activity

Organizations

upvoted a paper 1 day ago

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Paper • 2604.18292 • Published 2 days ago • 66

updated a dataset 2 days ago

MM-R1-HH/envs_supply

Preview • Updated 2 days ago • 20

published a dataset 2 days ago

MM-R1-HH/envs_supply

Preview • Updated 2 days ago • 20

updated a dataset 4 days ago

MM-R1-HH/C_25_07_01_test_0419

Updated 4 days ago • 21

published a dataset 4 days ago

MM-R1-HH/C_25_07_01_test_0419

Updated 4 days ago • 21

upvoted 3 papers 6 days ago

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published 7 days ago • 149

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

Paper • 2604.10866 • Published 9 days ago • 62

GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

Paper • 2604.07429 • Published 14 days ago • 113

upvoted a paper 7 days ago

Towards Long-horizon Agentic Multimodal Search

Paper • 2604.12890 • Published 8 days ago • 20

upvoted a paper 23 days ago

Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

Paper • 2603.25158 • Published 27 days ago • 51

upvoted 2 papers 29 days ago

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

Paper • 2603.22117 • Published 30 days ago • 29

WorldCache: Content-Aware Caching for Accelerated Video World Models

Paper • 2603.22286 • Published 29 days ago • 4

upvoted 3 papers about 1 month ago

authored a paper about 1 month ago

WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics

Paper • 2603.13391 • Published Mar 11 • 19

upvoted 2 papers about 1 month ago

$OneMillion-Bench: How Far are Language Agents from Human Experts?

Paper • 2603.07980 • Published Mar 9 • 27

Proact-VL: A Proactive VideoLLM for Real-Time AI Companions

Paper • 2603.03447 • Published Mar 3 • 37

upvoted 2 papers about 2 months ago

AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

Paper • 2602.23166 • Published Feb 26 • 45

How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities

Paper • 2603.02578 • Published Mar 3 • 25
