Paper Review: Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
09 June 2025
Only about 20% of tokens, the high-entropy minority, actually matter when training LLMs to reason with RL. Updating the low-entropy majority can actively hurt performance, a finding that challenges standard RLVR practice.