Let the Speedrun Search Itself
Eval-gated, config-only autoresearch on the canonical super fp8 lane
Reproducing Canon, mHC, and Engram
A research narrative: false starts, PhysicsLM4 alignment, and one real polysemy failure
RDEP
Keeping sparse expert compute hot across a whole NVLink fabric
Do MoE Experts Need Different Learning Rates?
Why Moonlet's old 15x expert-LR rule overshoots in bf16 AdamW
The Atlas Hypothesis
Why output-only dashboards cannot name what pretraining built, and what a real receipt would have to measure
Super-4096
Loss keeps improving while routing collapses under extreme sparsity
NVFP4 Dynamics
Why our NVFP4 recipe lagged BF16, and what actually closed almost all of the gap
What Are We Holding Fixed?
Dense-vs-MoE comparisons depend on the fairness contract; a failed `#420` transfer exposed the real problem
The Speedrun Loop
A small-model speedrun is our fastest honest instrument for architecture research
Make It Measurable
What to track when loss isn't enough
What We Built
A production-grade MoE training system, because reproducibility is the experiment
Why Training MoEs Is So Hard
Three failure modes that make frontier MoE training qualitatively different