Paper Review: ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
30 June 2025
My review of the paper ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Data science, career and other topics
23 June 2025
A self-supervised video model trained on 1M+ hours of video that understands motion, anticipates actions, and — with just 62 hours of robot data — performs zero-shot robotic pick-and-place planning.
09 June 2025
Only ~20% of tokens actually matter when training LLMs to reason with RL. Updating the low-entropy majority actively hurts performance — a finding that challenges standard RLVR practice.
02 June 2025
My review of the paper SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
26 May 2025
My review of the paper Visual Planning: Let's Think Only with Images
15 May 2025
DeepMind's autonomous coding agent that evolves algorithms through LLM-driven iteration — it discovered the first improvement to matrix multiplication over Strassen's algorithm in 56 years.