Skip to content
Mobrief
Research

Academic or research source. Check the methodology, sample size, and whether it's been replicated.

Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training

Training on verifiable symbolic data is a promising way to expand the reasoning frontier of language models beyond what standard pre-training corpora provide. Yet existing procedural generators...

2-Minute Brief
  • According to arXiv cs.CL: Training on verifiable symbolic data is a promising way to expand the reasoning frontier of language models beyond what standard pre-training corpora provide. Yet existing procedural generators often rely on fixed puzzles or templates and do not deliver the distributional breadth needed at scale. We introduce Reasoning Core, a scalable suite that procedurally generates verifiable symbolic reasoning data across core formal domains: PDDL planning over randomized domains, first-order logic with equ
Read Original

Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training

TLDR

Training on verifiable symbolic data is a promising way to expand the reasoning frontier of language models beyond what standard pre-training corpora provide. Yet existing procedural generators...

Artifacts
Paper PDF
2-Minute Brief
  • According to arXiv cs.CL: Training on verifiable symbolic data is a promising way to expand the reasoning frontier of language models beyond what standard pre-training corpora provide. Yet existing procedural generators often rely on fixed puzzles or templates and do not deliver the distributional breadth needed at scale. We introduce Reasoning Core, a scalable suite that procedurally generates verifiable symbolic reasoning data across core formal domains: PDDL planning over randomized domains, first-order logic with equ
Open
O open S save B back M mode