Verifiable Process Rewards for Agentic Reasoning
Recent Activity
Papers
SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models
MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
Accepted by ICLR 2026
MARS: Reinforcing Multi-Agent Reasoning of LLMs through Self-Play in Strategic Games
Paper • 2510.15414 • Published • 1
nics-efc/MARSHAL-Generalist-Qwen3-4B
Text Generation • 4B • Updated • 29
nics-efc/MARSHAL-Generalist-Qwen3-8B
Text Generation • 8B • Updated • 10
nics-efc/MARSHAL-Tic-Tac-Toe-Qwen3-4B
Text Generation • 4B • Updated • 6
models 14
nics-efc/VPR-Tic-Tac-Toe
Text Generation • 4B • Updated
nics-efc/VPR-Sudoku
Text Generation • 4B • Updated
nics-efc/VPR-Minesweeper
Text Generation • 4B • Updated
nics-efc/MARSHAL-Mini-Hanabi-Qwen3-4B
Text Generation • 4B • Updated • 6
nics-efc/MARSHAL-Kuhn-Poker-Qwen3-4B
Text Generation • 4B • Updated • 61
nics-efc/MARSHAL-Tic-Tac-Toe-Qwen3-4B
Text Generation • 4B • Updated • 6
nics-efc/MARSHAL-Generalist-Qwen3-8B
Text Generation • 8B • Updated • 10
nics-efc/MARSHAL-Generalist-Qwen3-4B
Text Generation • 4B • Updated • 29
nics-efc/R2R_router_collections
Text Classification • Updated • 1
nics-efc/Standard-1.7B
Text Generation • 2B • Updated • 2
datasets 8
nics-efc/R2R_Router_Training_Qwen3-0.6B_Qwen3-30B-A3B
Viewer • Updated • 9.3M • 24
nics-efc/R2R_Router_Training_Qwen3-4B_Qwen3-32B
Viewer • Updated • 18.3M • 174
nics-efc/R2R_Router_Training_Qwen3-1.7B_Qwen3-8B
Viewer • Updated • 21.9M • 170
nics-efc/R2R_Router_Training_Qwen3-0.6B_Qwen3-8B
Viewer • Updated • 22.2M • 50
nics-efc/R2R_query
Viewer • Updated • 2.93k • 32
nics-efc/R2R_Router_Training
Viewer • Updated • 8.19M • 244 • 4
nics-efc/MoA_Long_HumanQA
Viewer • Updated • 3.5k • 23 • 4
nics-efc/MoA_Long_Retrieval
Viewer • Updated • 4.4k • 9 • 4