Shidong Cao

A short note on viewing chain-of-thought reasoning as structured search over intermediate states.

2026-05-13

Diffusion Sampling: A Compact Reminder

A compact reminder of the reverse diffusion process and the role of denoising steps.

2026-05-12

RLHF Reading Checklist

A checklist for reading RLHF papers with attention to data, reward modeling, and optimization details.

2026-05-11