Submitted by Qi HU
SABER: Benchmarking Operational Safety of LLM Coding Agents in Stateful Project Workspaces
sssr-lab