Tag: dpo

2 posts

Paper Review: Diffusion Model Alignment Using Direct Preference Optimization

Adapting DPO from language models to image generation — training Stable Diffusion XL on 851K human preferences to sig...

paperreview deeplearning cv stablediffusion

Paper Review: Zephyr: Direct Distillation of LM Alignment

My review of the paper "Zephyr: Direct Distillation of LM Alignment".

paperreview deeplearning nlp llm