Paper Review: Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
15 September 2025
My review of the paper "Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing"
Data science, career and other topics
01 September 2025
My review of the paper "Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning"
14 August 2025
My experience of searching for a job in 2024 as an MLE across the globe
04 August 2025
My review of the paper Group Sequence Policy Optimization
28 July 2025
My review of the paper "Subliminal Learning: Language models transmit behavioral traits via hidden signals in data"