Tag: rl
- Paper Review: Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning (01 Sep 2025)
- Paper Review: Group Sequence Policy Optimization (04 Aug 2025)
- Paper Review: ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models (30 Jun 2025)
- Paper Review: Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning (09 Jun 2025)
- Paper Review: Visual Planning: Let's Think Only with Images (26 May 2025)
- Paper Review: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (27 Jan 2025)