MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier
Abstract
The MOOSE-Star framework enables efficient training and inference for generative scientific reasoning by addressing combinatorial complexity through decomposed subtasks, hierarchical search, and bounded composition.
While large language models (LLMs) show promise in scientific discovery, existing research focuses on inference or feedback-driven training, leaving the direct modeling of the generative reasoning process, P(hypothesis|background) (P(h|b)), unexplored. We demonstrate that directly training P(h|b) is mathematically intractable due to the combinatorial complexity (O(N^k)) inherent in retrieving and composing inspirations from a vast knowledge base. To break this barrier, we introduce MOOSE-Star, a unified framework enabling tractable training and scalable inference. In the best case, MOOSE-Star reduces complexity from exponential to logarithmic (O(log N)) by (1) training on decomposed subtasks derived from the probabilistic equation of discovery, (2) employing motivation-guided hierarchical search to enable logarithmic retrieval and prune irrelevant subspaces, and (3) utilizing bounded composition for robustness against retrieval noise. To facilitate this, we release TOMATO-Star, a dataset of 108,717 decomposed papers (38,400 GPU hours) for training. Furthermore, we show that while brute-force sampling hits a "complexity wall," MOOSE-Star exhibits continuous test-time scaling.
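To make the scale of the O(N^k) barrier concrete, here is a minimal back-of-the-envelope sketch (the corpus size N and the combination counting are illustrative assumptions, not figures from the paper): choosing an unordered set of k inspirations from N candidate papers already yields C(N, k) hypothesis candidates to score, which grows polynomially in N with degree k.

```python
# Illustrative only: count candidate inspiration sets of size k drawn
# from a knowledge base of N papers. C(N, k) grows like N^k / k!,
# which is the combinatorial blow-up the abstract calls O(N^k).
import math

N = 100_000  # assumed knowledge-base size (hypothetical)
for k in (1, 2, 3):
    print(f"k={k}: {math.comb(N, k):.3e} candidate inspiration sets")
```

Even at k=3 the candidate space already exceeds 10^14 sets, which is why exhaustive retrieve-and-compose sampling is infeasible without structure.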
Community
Most current LLMs for scientific discovery rely on inference-time prompting or external feedback for training. But how can we directly train an LLM to generate scientific hypotheses from a research background, i.e., P(h|b)?
In this work, we theoretically demonstrate that directly training this generative process is computationally intractable due to the O(N^k) combinatorial complexity of retrieving and composing scientific inspirations.
To break this barrier, we introduce MOOSE-Star, a unified framework that reduces this complexity to O(log N) via Motivation-Guided Hierarchical Search and Bounded Composition.
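The logarithmic-retrieval idea can be sketched in miniature. The following is a hypothetical toy (the class names, keyword-overlap scoring, and tree layout are my assumptions, not the paper's implementation): a balanced tree over topic clusters lets a motivation-guided search descend one branch per level, pruning the sibling subtree each time, so it visits O(log N) nodes instead of scanning all N leaves.

```python
# Hypothetical sketch of motivation-guided hierarchical retrieval.
# Each internal node summarizes its subtree with a coarse keyword set;
# search follows the child whose keywords best overlap the motivation.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    keywords: set                    # coarse topic signature of this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    paper: Optional[str] = None      # set only at leaves

def retrieve(node: Node, motivation: set, visited: list) -> str:
    """Descend toward the child with the larger keyword overlap,
    pruning the other subtree entirely at each level."""
    visited.append(node)
    if node.paper is not None:       # reached a leaf
        return node.paper
    best = max((node.left, node.right),
               key=lambda child: len(child.keywords & motivation))
    return retrieve(best, motivation, visited)

# Tiny corpus of N = 4 papers under a 2-level cluster tree.
a = Node({"protein"}, paper="paper-A")
b = Node({"folding"}, paper="paper-B")
c = Node({"graph"}, paper="paper-C")
d = Node({"search"}, paper="paper-D")
root = Node({"protein", "folding", "graph", "search"},
            left=Node({"protein", "folding"}, left=a, right=b),
            right=Node({"graph", "search"}, left=c, right=d))

visited = []
print(retrieve(root, {"folding"}, visited))  # -> paper-B
print(len(visited))                          # 3 nodes visited, ~log2(N) + 1
```

A linear scan would touch all 4 leaves; the guided descent touches one root-to-leaf path, which is the source of the claimed exponential-to-logarithmic reduction (in the best case, when the motivation reliably selects the correct branch).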
Key Highlights & Open-Source Contributions:
- Tractable & Scalable Training: The first framework to enable scalable training for the direct generation of scientific discoveries.
- Superior Test-Time Scaling: While brute-force unguided sampling hits a "complexity wall" on multi-step problems, MOOSE-Star exhibits continuous test-time scaling for discovery.
- TOMATO-Star Dataset: We are fully open-sourcing our data engine! It contains 108,717 open-access papers rigorously decomposed into (Background, Hypothesis, Inspirations) tuples, which cost ~38,400 A800 GPU hours to build.
- Models & Code: We have released our fine-tuned R1-Distilled-7B models (IR and HC modules) and the complete training/inference pipeline.
We hope this opens up new tractable pathways for the AI4Science community!
arXivLens breakdown of this paper: https://arxivlens.com/PaperView/Details/moose-star-unlocking-tractable-training-for-scientific-discovery-by-breaking-the-complexity-barrier-9085-5e7b0195
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- PACE: Defying the Scaling Hypothesis of Exploration in Iterative Alignment for Mathematical Reasoning (2026)
- Chain Of Thought Compression: A Theoritical Analysis (2026)
- Knowledge Graphs are Implicit Reward Models: Path-Derived Signals Enable Compositional Reasoning (2026)
- DeepInnovator: Triggering the Innovative Capabilities of LLMs (2026)
- Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification (2026)
- Neural Chain-of-Thought Search: Searching the Optimal Reasoning Path to Enhance Large Language Models (2026)
- Amortized Reasoning Tree Search: Decoupling Proposal and Decision in Large Language Models (2026)
Models citing this paper 2
Datasets citing this paper 2