Technical Notes
2026
LLM Reasoning Notes: Chain-of-Thought and Search
A short note on viewing chain-of-thought reasoning as structured search over intermediate states.
Diffusion Sampling: A Compact Reminder
A brief recap of the reverse diffusion process and the role each denoising step plays in sampling.
RLHF Reading Checklist
A checklist for reading RLHF papers with attention to data, reward modeling, and optimization details.