Tag: reasoning – Andrey Lukyanenko

Apr 20, 2026

FIPO: Teaching LLMs Which Thoughts Actually Matter

FIPO is an RL algorithm that fixes one of the core limitations of RL for LLM reasoning: credit assignment. Instead of...

paperreview deeplearning llm rl

Apr 21, 2025

Paper Review: M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

My review of the paper M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

paperreview deeplearning rnn distillation

Jan 06, 2025

Paper Review: Training Large Language Models to Reason in a Continuous Latent Space

Coconut lets LLMs reason in latent space instead of generating text tokens, enabling breadth-first exploration of rea...

paperreview deeplearning nlp llm

Dec 09, 2024

Paper Review: Reverse Thinking Makes LLMs Stronger Reasoners

My review of the paper Reverse Thinking Makes LLMs Stronger Reasoners

paperreview deeplearning nlp llm