Reasoning & Test-Time Compute

Chain of thought, self-consistency, tree/graph of thought, and the inference-time scaling laws that govern them.

  1. Chain-of-Thought, Properly
    What CoT actually buys (serial compute, not introspection), faithfulness vs post-hoc rationalization, when it hurts, and structured vs free traces.
  2. Self-Consistency & Sampling
    Why sampling + majority vote works, the exact bias-amplification failure, the saturating returns curve, and how to spend the k budget.
  3. Tree & Graph of Thought
    Deliberate search over partial solutions, the multiplicative cost, and the load-bearing dependency on a partial-state scorer.
  4. Verifier-Guided Search
    Outcome vs process reward models steering best-of-N and beam search, reward hacking at inference time, and why the verifier is the product.
  5. Inference-Time Scaling
    Test-time compute as a second scaling axis, the difficulty-adaptive compute-optimal frontier, and where more thinking stops paying.
  6. When Reasoning Helps (and When It Burns Money)
    The synthesis decision rule (task class × verifiability × budget), an escalation ladder, the named money-burning patterns, and a do/don't list.