Spiral RL

community

https://github.com/spiral-rl/spiral

Recent Activity

Benjamin-eecs authored a paper 6 days ago

From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space

Paper • 2604.14142 • Published 12 days ago • 29

Benjamin-eecs authored a paper about 1 month ago

Reasoning over mathematical objects: on-policy reward modeling and test time aggregation

Paper • 2603.18886 • Published Mar 19 • 6

Benjamin-eecs authored a paper 6 months ago

Scaling Agent Learning via Experience Synthesis

Paper • 2511.03773 • Published Nov 5, 2025 • 83

updated a model 6 months ago

spiral-rl/Spiral-Octothinker-8B-Multi-Env

Text Generation • 8B • Updated Nov 6, 2025 • 7

updated a collection 6 months ago

SPIRAL

8 items • Updated Nov 6, 2025 • 2

published a model 6 months ago

spiral-rl/Spiral-Octothinker-8B-Multi-Env

Text Generation • 8B • Updated Nov 6, 2025 • 7

updated a model 6 months ago

spiral-rl/Spiral-Llama3-8B-Multi-Env

Text Generation • 8B • Updated Nov 6, 2025 • 6

published a model 6 months ago

spiral-rl/Spiral-Llama3-8B-Multi-Env

Text Generation • 8B • Updated Nov 6, 2025 • 6

updated a model 6 months ago

spiral-rl/Spiral-Qwen3-8B-Multi-Env

Text Generation • 8B • Updated Nov 6, 2025 • 10 • 1

published a model 6 months ago

spiral-rl/Spiral-Qwen3-8B-Multi-Env

Text Generation • 8B • Updated Nov 6, 2025 • 10 • 1

updated a model 6 months ago

spiral-rl/Spiral-Qwen3-4B-Multi-Env

Text Generation • 4B • Updated Nov 6, 2025 • 7

published a model 6 months ago

spiral-rl/Spiral-Qwen3-4B-Multi-Env

Text Generation • 4B • Updated Nov 6, 2025 • 7

authored a paper 6 months ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28, 2025 • 18

authored a paper 6 months ago

Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity

Paper • 2510.01171 • Published Oct 1, 2025 • 19

authored 2 papers 7 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9, 2025 • 39

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 277

authored a paper 7 months ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 91