Noumena Research
  • #0011 · research · result

    Let the Speedrun Search Itself

    Eval-gated, config-only autoresearch on the canonical super fp8 lane

    Mar 14, 2026
  • #0010 · research · result

    Reproducing Canon, mHC, and Engram

    A research narrative: false starts, PhysicsLM4 alignment, and one real polysemy failure

    Mar 14, 2026
  • #0009 · systems · result

    RDEP

    Keeping sparse expert compute hot across a whole NVLink fabric

    Mar 14, 2026
  • #0008 · research · result

    Do MoE Experts Need Different Learning Rates?

    Why Moonlet's old 15x expert-LR rule overshoots in bf16 AdamW

    Mar 14, 2026
  • #0007 · research · hypothesis

    The Atlas Hypothesis

    Why output-only dashboards cannot name what pretraining built, and what a real receipt would have to measure

    Mar 14, 2026
  • #0006 · research · result

    Super-4096

    Loss keeps improving while routing collapses under extreme sparsity

    Mar 14, 2026
  • #0005 · research · result

    NVFP4 Dynamics

    Why our NVFP4 recipe lagged BF16, and what actually closed almost all of the gap

    Mar 14, 2026
  • #0004 · methodology · result

    What Are We Holding Fixed?

    A dense-vs-MoE comparison depends on the fairness contract; a failed `#420` transfer exposed the real problem

    Mar 14, 2026
  • #0003 · methodology · result

    The Speedrun Loop

    A small-model speedrun is our fastest honest instrument for architecture research

    Mar 14, 2026
  • #0002 · methodology · result

    Make It Measurable

    What to track when loss isn't enough

    Mar 14, 2026
  • #0001 · systems · framing

    What We Built

    A production-grade MoE training system, because reproducibility is the experiment

    Mar 14, 2026
  • #0000 · research · framing

    Why Training MoEs is So Hard

    Three failure modes that make frontier MoE training qualitatively different

    Mar 14, 2026
