Reasoning & Test-Time Compute
Chain of thought, self-consistency, tree/graph of thought, and the inference-time-scaling laws that govern them.
- Chain-of-Thought, Properly: What CoT actually buys (serial compute, not introspection), faithfulness vs post-hoc rationalization, when it hurts, and structured vs free traces.
- Self-Consistency & Sampling: Why sampling + majority vote works, the exact bias-amplification failure, the saturating returns curve, and how to spend the k budget.
- Tree & Graph of Thought: Deliberate search over partial solutions, the multiplicative cost, and the load-bearing dependency on a partial-state scorer.
- Verifier-Guided Search: Outcome vs process reward models steering best-of-N and beam search, reward hacking at inference time, and why the verifier is the product.
- Inference-Time Scaling: Test-time compute as a second scaling axis, the difficulty-adaptive compute-optimal frontier, and where more thinking stops paying.
- When Reasoning Helps (and When It Burns Money): The synthesis decision rule (task class × verifiability × budget), an escalation ladder, the named money-burning patterns, and a do/don't list.
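
As a small preview of the self-consistency entry above, here is a rough sketch of why majority voting can both help and hurt. It uses a deliberately simplified toy model that is an assumption of this sketch, not something taken from the deep-dives themselves: a binary task, k independent samples, each correct with the same probability p. The function name `majority_vote_accuracy` is hypothetical. Real samples from one model are correlated, so treat the numbers as an optimistic illustration only.

```python
from math import comb


def majority_vote_accuracy(p: float, k: int) -> float:
    """Probability that a strict majority of k samples is correct.

    Toy assumptions (not from the source): binary task, independent samples,
    each sample correct with probability p. k must be odd so ties cannot occur.
    """
    if k % 2 == 0:
        raise ValueError("use an odd k to avoid ties")
    need = k // 2 + 1  # smallest number of correct samples that wins the vote
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(need, k + 1))


# Compare a model that is usually right (p = 0.6) with one that is usually wrong (p = 0.4).
for p in (0.6, 0.4):
    row = ", ".join(
        f"k={k}: {majority_vote_accuracy(p, k):.3f}" for k in (1, 5, 25, 101)
    )
    print(f"p={p}: {row}")
```

Under these assumptions, voting pushes accuracy toward 1 when p is above 0.5 and toward 0 when p is below 0.5, which is one simple way to picture bias amplification. Accuracy is also bounded by 1, so each additional sample eventually buys less, which is the intuition behind a saturating returns curve.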