meituan/PIE_bench • Preview • Updated • 631 • 1
TRIP-Bench: A Benchmark for Long-Horizon Interactive Agents in Real-World Scenarios
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience