Reasoning & Test-Time Compute

Chain of thought, self-consistency, tree/graph of thought, and the inference-time-scaling laws that govern them.

Chain-of-Thought, Properly

What CoT actually buys (serial compute, not introspection), faithfulness vs post-hoc rationalization, when it hurts, and structured vs free traces.

Self-Consistency & Sampling

Why sampling + majority vote works, the exact bias-amplification failure, the saturating returns curve, and how to spend the k budget.

Tree & Graph of Thought

Deliberate search over partial solutions, the multiplicative cost, and the load-bearing dependency on a partial-state scorer.

Verifier-Guided Search

Outcome vs process reward models steering best-of-N and beam search, reward hacking at inference time, and why the verifier is the product.

Inference-Time Scaling

Test-time compute as a second scaling axis, the difficulty-adaptive compute-optimal frontier, and where more thinking stops paying.

When Reasoning Helps (and When It Burns Money)

The synthesis decision rule (task class × verifiability × budget), an escalation ladder, the named money-burning patterns, and a do/don't list.