Paper Review: Group Sequence Policy Optimization
04 August 2025
My review of the paper Group Sequence Policy Optimization
Data science, career and other topics
28 July 2025
My review of the paper Subliminal Learning: Language models transmit behavioral traits via hidden signals in data
30 June 2025
My review of the paper ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
23 June 2025
My review of the paper V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
09 June 2025
My review of the paper Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
02 June 2025
My review of the paper SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents