Deep Dive into Fine-Tuning a LoRA Reranker on Phi-3
I fine-tuned Phi-3 as a pairwise reranker with LoRA and logged every gradient. Early layers changed 200x more than late layers, but ranking representations o...
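Per-layer gradient logging like this can be done by grouping parameter gradients by layer index and comparing L2 norms. The sketch below is a minimal, framework-free illustration: it assumes Phi-3's `model.layers.<i>.` naming convention, and the gradient values are toy numbers, not measured ones.

```python
import math
from collections import defaultdict

def layer_grad_norms(named_grads):
    """Aggregate per-parameter gradients into one L2 norm per layer.

    named_grads maps parameter names such as
    "model.layers.3.self_attn.q_proj.lora_A" to flat lists of gradient
    values. Names outside the "model.layers.<i>." pattern are ignored.
    """
    sq_sums = defaultdict(float)
    for name, grad in named_grads.items():
        parts = name.split(".")
        if len(parts) > 2 and parts[0] == "model" and parts[1] == "layers":
            layer = int(parts[2])
            sq_sums[layer] += sum(g * g for g in grad)
    return {layer: math.sqrt(s) for layer, s in sq_sums.items()}

# Toy gradients: an "early" layer with large updates and a "late" layer
# with small ones (illustrative values only).
grads = {
    "model.layers.0.self_attn.q_proj.lora_A": [0.3, -0.4],
    "model.layers.31.self_attn.q_proj.lora_A": [0.003, -0.004],
}
norms = layer_grad_norms(grads)
early_to_late_ratio = norms[0] / norms[31]
print(early_to_late_ratio)
```

In a real training loop you would feed this function the `.grad` tensors from `named_parameters()` after each backward pass and log the ratio over time.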
Autoregressive generation is sequential, one forward pass per token, while diffusion-based text generation needs far fewer passes.
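The contrast can be sketched as a pass-counting exercise. This assumes the simple model that an autoregressive decoder spends one forward pass per generated token, while a text diffusion model runs a fixed number of denoising steps over the whole sequence in parallel; the step count of 16 is an illustrative assumption, not a property of any specific model.

```python
def autoregressive_passes(num_tokens: int) -> int:
    # One forward pass per generated token: cost grows with length.
    return num_tokens

def diffusion_passes(num_steps: int = 16) -> int:
    # A text diffusion model denoises the full sequence in parallel,
    # so the pass count is the fixed number of denoising steps,
    # independent of sequence length.
    return num_steps

for n in (16, 256, 1024):
    print(f"{n} tokens: AR={autoregressive_passes(n)} passes, "
          f"diffusion={diffusion_passes()} passes")
```

At 1024 tokens the sequential decoder pays 1024 passes while the diffusion model still pays its fixed step budget, which is where the speedup claim comes from.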
Cross-entropy loss isn't a heuristic: it is maximum likelihood estimation with a sign flip, and the same math powers GPT training.
Understanding the basics of RLHF vs. RLAIF vs. RLVR: comparing where the feedback signal comes from.
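The key difference between the three is the reward source: human preference labels (RLHF), AI-generated preference labels (RLAIF), or a programmatic verifier (RLVR). The toy verifier below illustrates the RLVR side only; it is a hypothetical exact-match check on a final answer, not any specific paper's implementation.

```python
def rlvr_reward(response: str, reference_answer: str) -> float:
    """RLVR-style reward: a verifiable programmatic check (here,
    exact-match on the final line) replaces the learned reward model
    that RLHF and RLAIF train from preference labels. Toy example."""
    # Convention assumed here: the model puts its final answer on
    # the last line of the response.
    final = response.strip().splitlines()[-1].strip()
    return 1.0 if final == reference_answer else 0.0

print(rlvr_reward("Work: 6 * 7 = 42\n42", "42"))  # prints 1.0
print(rlvr_reward("I think it's 41\n41", "42"))   # prints 0.0
```

Because the reward is computed, not learned, RLVR avoids reward-model drift, but it only applies to tasks where correctness is checkable.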