BadCat
Foresta
AI & ML interests
LLMs
Deep learning
Reinforcement learning
Recent Activity
upvoted a paper about 2 months ago
AT^2PO: Agentic Turn-based Policy Optimization via Tree Search
upvoted a paper about 2 months ago
Evaluating Parameter Efficient Methods for RLVR
upvoted a paper 5 months ago
Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning?
Organizations
None yet